├── README.md ├── data.zip ├── graphdownstream ├── GIN16.out ├── GIN64.out ├── GIN8.out ├── GIN8dim.out ├── aug.py ├── basemodel.py ├── dataset.py ├── dgi.py ├── dgi │ ├── dgi.py │ └── layers │ │ ├── __init__.py │ │ ├── __pycache__ │ │ ├── __init__.cpython-310.pyc │ │ ├── discriminator.cpython-310.pyc │ │ ├── gcn.cpython-310.pyc │ │ └── readout.cpython-310.pyc │ │ ├── discriminator.py │ │ ├── gcn.py │ │ ├── gin.py │ │ └── readout.py ├── embedding.py ├── epoch_loss_enzymes1.png ├── filternet.py ├── gae_pretrain.py ├── gat.py ├── gcl.py ├── gcn.py ├── gin4dim.out ├── gin_downstream.py ├── gin_local.py ├── graph_finetuning_cross_val.py ├── graph_finetuning_layer.py ├── graph_prompt_cross_val.py ├── graph_prompt_layer.py ├── graphcl-pretrain │ ├── aug.py │ ├── dgi.py │ ├── execute.py │ └── layers │ │ ├── __init__.py │ │ ├── __pycache__ │ │ ├── __init__.cpython-36.pyc │ │ ├── discriminator.cpython-36.pyc │ │ ├── discriminator2.cpython-36.pyc │ │ ├── gcn.cpython-36.pyc │ │ └── readout.cpython-36.pyc │ │ ├── discriminator.py │ │ ├── discriminator2.py │ │ ├── gcn.py │ │ └── readout.py ├── graphsage.py ├── layers │ ├── __init__.py │ ├── __pycache__ │ │ ├── __init__.cpython-36.pyc │ │ ├── discriminator.cpython-36.pyc │ │ ├── discriminator2.cpython-36.pyc │ │ ├── gcn.cpython-36.pyc │ │ └── readout.cpython-36.pyc │ ├── discriminator.py │ ├── discriminator2.py │ ├── gcn.py │ └── readout.py ├── linear_tuning_fewshot.py └── model_weight.py ├── nodedownstream ├── ENZYMES2ONE_Graph.py ├── aug.py ├── basemodel.py ├── dataset.py ├── datasetInfo.py ├── dataset_flickr.py ├── dgi.py ├── dgi_f.py ├── embedding.py ├── filternet.py ├── flickr_mix_GP.py ├── flickr_mix_p4_DGI.py ├── flickr_mix_p4_GCL.py ├── flikcrtaskchoose.py ├── gat.py ├── gcl.py ├── gcn.py ├── gin_downstream.py ├── gin_flickr_DGI.py ├── gin_local.py ├── graph_finetuning_cross_val.py ├── graph_finetuning_layer.py ├── graph_prompt_cross_val.py ├── graph_prompt_layer.py ├── graphsage.py ├── layers │ ├── __init__.py │ ├── 
__pycache__ │ │ ├── __init__.cpython-36.pyc │ │ ├── discriminator.cpython-36.pyc │ │ ├── discriminator2.cpython-36.pyc │ │ ├── gcn.cpython-36.pyc │ │ └── readout.cpython-36.pyc │ ├── discriminator.py │ ├── discriminator2.py │ ├── gcn.py │ └── readout.py ├── linear_tuning_fewshot.py ├── lp_pretrain.py ├── model_weight.py ├── model_weight_fix.py ├── no_pretrain_no_tuning_cross_val.py ├── node_finetuning_layer.py ├── node_prompt_layer.py └── predictnet.py └── requirements.txt /README.md: -------------------------------------------------------------------------------- 1 | We provide the code (in pytorch) and datasets for our paper [**"Generalized Graph Prompt: Toward a Unification of Pre-Training and Downstream Tasks on Graphs"**](https://arxiv.org/pdf/2302.08043.pdf). This is an extension of [**"GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks"**](https://dl.acm.org/doi/pdf/10.1145/3543507.3583386), accepted by the ACM Web Conference (WWW) 2023. 2 | 3 | ## Description 4 | - **data/**: contains data we use. 5 | - **graphdownstream/**: implements pre-training and downstream tasks at the graph level. 6 | - **nodedownstream/**: implements downstream tasks at the node level. 7 | 8 | 9 | ## Package Dependencies 10 | 11 | 1. 3.6.0<= python <=3.8.0 12 | 2. pip install -r requirements.txt 13 | 14 | ## Getting Started 15 | ### Graph Classification 16 | 17 | Default dataset is ENZYMES. You need to change the corresponding parameters in *pre_train.py* and *prompt_fewshot.py* to train and evaluate on other datasets. 
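When switching datasets or model sizes, the pre-training and prompt-tuning runs presumably need to agree on architecture flags such as `--gcn_hidden_dim` and `--node_feature_dim`. Otherwise loading `best.pt` can fail with the kind of size mismatch recorded in `graphdownstream/GIN64.out`. The sketch below is not part of the repository: `find_shape_mismatches` is a hypothetical helper, written in plain Python, that compares two name-to-shape mappings and reports every disagreement before a load is attempted.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[str]:
    """Return one message per parameter that is missing or has a different shape.

    Hypothetical helper for illustration; it is not defined anywhere in this repo.
    """
    problems = []
    for name in sorted(set(checkpoint_shapes) | set(model_shapes)):
        saved = checkpoint_shapes.get(name)
        current = model_shapes.get(name)
        if saved is None:
            problems.append(f"{name}: missing from checkpoint")
        elif current is None:
            problems.append(f"{name}: missing from model")
        elif saved != current:
            problems.append(f"{name}: checkpoint {saved} vs model {current}")
    return problems


# Shapes copied from the first two entries of graphdownstream/GIN64.out.
checkpoint_shapes = {
    "convs.0.apply_func.0.weight": (64, 18),
    "convs.0.apply_func.0.bias": (64,),
}
model_shapes = {
    "convs.0.apply_func.0.weight": (16, 64),
    "convs.0.apply_func.0.bias": (16,),
}

for message in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(message)
```

With PyTorch, the two mappings could be built as `{k: tuple(v.shape) for k, v in state.items()}`, once from `torch.load(path)` and once from `model.state_dict()`. This assumes the checkpoint stores a plain state dict, which the `load_state_dict(torch.load(...))` call in the log suggests.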
18 |
19 | Pretrain:
20 | ```sh
21 | python pre_train_GP.py --model GIN --gpu_id 0 --gcn_hidden_dim 32 --temperature 0.2 --batch_size 1024 --pretrain_hop_num 0 --lr 0.1 --epochs 400 --dropout 0 --seed 0 --max_ngv 126 --max_nge 298 --max_ngvl 7 --max_ngel 2 --node_feature_dim 18 --graph_label_num 6 --graph_dir ../data/ENZYMES/raw --graphslabel_dir ../data/ENZYMES/ENZYMES_graph_labels.txt --save_data_dir ../data/ENZYMESPreTrain --save_model_dir ../dumps/ENZYMESPreTrain/GIN --share_emb False --predict_net_add_enc True --predict_net_add_degree True
22 | ```
23 | Prompt tune and test:
24 |
25 | ```sh
26 | python prompt_fewshot_GP.py --pretrain_model GIN --gpu_id 0 --reg_loss NLL --bp_loss NLL --prompt FEATURE-WEIGHTED-SUM --epochs 100 --lr 0.01 --update_pretrain False --seed 0 --dropout 0 --dataset_seed 0 --train_shotnum 5 --val_shotnum 5 --few_shot_tasknum 100 --gcn_graph_num_layers 3 --gcn_hidden_dim 32 --graph_finetuning_output_dim 2 --batch_size 512 --max_ngv 126 --max_nge 298 --max_ngvl 7 --max_ngel 2 --node_feature_dim 18 --graph_label_num 6 --graph_dir ../data/ENZYMES/raw --graphslabel_dir ../data/ENZYMES/ENZYMES_graph_labels.txt --save_data_dir ../data/ENZYMESPreTrain --save_pretrain_model_dir ../dumps/ENZYMESPreTrain/GIN --downstream_save_model_dir ../dumps/ENZYMESGraphClassification/Prompt/GIN-FEATURE-WEIGHTED-SUM/5train5val100task --save_fewshot_dir ../data/ENZYMESGraphClassification/fewshot --share_emb False --predict_net_add_enc True --predict_net_add_degree True
27 | ```
28 |
29 |
30 | ### Node Classification
31 |
32 | The default dataset is ENZYMES. To train and evaluate on other datasets, change the corresponding parameters in *prompt_fewshot.py*.
33 | ```sh
34 | python run_mix_GP.py --pretrain_model GIN --gpu_id 0 --reg_loss NLL --bp_loss NLL --prompt FEATURE-WEIGHTED-SUM --epochs 100 --lr 0.1 --update_pretrain False --seed 0 --dropout 0 --dataset_seed 0 --train_shotnum 1 --val_shotnum 1 --few_shot_tasknum 10 --nhop_neighbour 1 --gcn_graph_num_layers 3 --gcn_hidden_dim 32 --prompt_output_dim 2 --batch_size 1024 --max_ngv 126 --max_nge 282 --max_ngvl 3 --max_ngel 2 --node_feature_dim 18 --graph_label_num 6 --graph_num 53 --graph_dir ../data/ENZYMES/allraw --save_data_dir ../data/ENZYMES/all --save_pretrain_model_dir ../dumps/ENZYMESPreTrain/GIN --downstream_save_model_dir ../dumps/ENZYMESNodeClassification/Prompt/GIN-FEATURE-WEIGHTED-SUM/all/1train1val10task --save_fewshot_dir ../data/ENZYMES/nodefewshot --process_raw False --split False
35 | ```
36 |
--------------------------------------------------------------------------------
/data.zip:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/gmcmt/graph_prompt_extension/ab2506246994fbbcf661d16abea40519aa6949b6/data.zip
--------------------------------------------------------------------------------
/graphdownstream/GIN64.out:
--------------------------------------------------------------------------------
1 | Load Few Shot
2 | --------------------------------------------------------------------------------------
3 | start task 0
4 | Traceback (most recent call last):
5 |   File "prompt_fewshot.py", line 569, in <module>
6 |     pre_train_model.load_state_dict(torch.load(os.path.join(save_pretrain_model_dir, 'best.pt')))
7 |   File "/home/xingtong/anaconda3/envs/dgl/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1483, in load_state_dict
8 |     self.__class__.__name__, "\n\t".join(error_msgs)))
9 | RuntimeError: Error(s) in loading state_dict for GIN:
10 |     size mismatch for convs.0.apply_func.0.weight: copying a param with shape torch.Size([64, 18]) from checkpoint, the shape in current model is 
torch.Size([16, 64]). 11 | size mismatch for convs.0.apply_func.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 12 | size mismatch for convs.0.apply_func.2.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 13 | size mismatch for convs.0.apply_func.2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 14 | size mismatch for convs.1.apply_func.0.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 15 | size mismatch for convs.1.apply_func.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 16 | size mismatch for convs.1.apply_func.2.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 17 | size mismatch for convs.1.apply_func.2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 18 | size mismatch for convs.2.apply_func.0.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 19 | size mismatch for convs.2.apply_func.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 20 | size mismatch for convs.2.apply_func.2.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 21 | size mismatch for convs.2.apply_func.2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 22 | size mismatch for bns.0.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 
23 | size mismatch for bns.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 24 | size mismatch for bns.0.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 25 | size mismatch for bns.0.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 26 | size mismatch for bns.1.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 27 | size mismatch for bns.1.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 28 | size mismatch for bns.1.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 29 | size mismatch for bns.1.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 30 | size mismatch for bns.2.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 31 | size mismatch for bns.2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 32 | size mismatch for bns.2.running_mean: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 33 | size mismatch for bns.2.running_var: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 34 | size mismatch for g_net.0.apply_func.0.weight: copying a param with shape torch.Size([64, 18]) from checkpoint, the shape in current model is torch.Size([16, 64]). 35 | size mismatch for g_net.0.apply_func.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 
36 | size mismatch for g_net.0.apply_func.2.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 37 | size mismatch for g_net.0.apply_func.2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 38 | size mismatch for g_net.1.apply_func.0.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 39 | size mismatch for g_net.1.apply_func.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 40 | size mismatch for g_net.1.apply_func.2.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 41 | size mismatch for g_net.1.apply_func.2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 42 | size mismatch for g_net.2.apply_func.0.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 43 | size mismatch for g_net.2.apply_func.0.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 44 | size mismatch for g_net.2.apply_func.2.weight: copying a param with shape torch.Size([64, 64]) from checkpoint, the shape in current model is torch.Size([16, 16]). 45 | size mismatch for g_net.2.apply_func.2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([16]). 
46 | -------------------------------------------------------------------------------- /graphdownstream/aug.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import copy 3 | import random 4 | import pdb 5 | import scipy.sparse as sp 6 | import numpy as np 7 | 8 | def main(): 9 | pass 10 | 11 | 12 | def aug_random_mask(input_feature, drop_percent=0.2): 13 | 14 | node_num = input_feature.shape[1] 15 | mask_num = int(node_num * drop_percent) 16 | node_idx = [i for i in range(node_num)] 17 | mask_idx = random.sample(node_idx, mask_num) 18 | aug_feature = copy.deepcopy(input_feature) 19 | zeros = torch.zeros_like(aug_feature[0][0]) 20 | for j in mask_idx: 21 | aug_feature[0][j] = zeros 22 | return aug_feature 23 | 24 | 25 | def aug_random_edge(input_adj, drop_percent=0.2): 26 | 27 | percent = drop_percent / 2 28 | row_idx, col_idx = input_adj.nonzero() 29 | 30 | index_list = [] 31 | for i in range(len(row_idx)): 32 | index_list.append((row_idx[i], col_idx[i])) 33 | 34 | single_index_list = [] 35 | for i in list(index_list): 36 | single_index_list.append(i) 37 | index_list.remove((i[1], i[0])) 38 | 39 | 40 | edge_num = int(len(row_idx) / 2) # 9228 / 2 41 | add_drop_num = int(edge_num * percent / 2) 42 | aug_adj = copy.deepcopy(input_adj.todense().tolist()) 43 | 44 | edge_idx = [i for i in range(edge_num)] 45 | drop_idx = random.sample(edge_idx, add_drop_num) 46 | 47 | 48 | for i in drop_idx: 49 | aug_adj[single_index_list[i][0]][single_index_list[i][1]] = 0 50 | aug_adj[single_index_list[i][1]][single_index_list[i][0]] = 0 51 | 52 | ''' 53 | above finish drop edges 54 | ''' 55 | node_num = input_adj.shape[0] 56 | l = [(i, j) for i in range(node_num) for j in range(i)] 57 | add_list = random.sample(l, add_drop_num) 58 | 59 | for i in add_list: 60 | 61 | aug_adj[i[0]][i[1]] = 1 62 | aug_adj[i[1]][i[0]] = 1 63 | 64 | aug_adj = np.matrix(aug_adj) 65 | aug_adj = sp.csr_matrix(aug_adj) 66 | return aug_adj 67 | 68 | 69 | 
def aug_drop_node(input_fea, input_adj, drop_percent=0.2): 70 | 71 | input_adj = torch.tensor(input_adj.todense().tolist()) 72 | input_fea = input_fea.squeeze(0) 73 | 74 | node_num = input_fea.shape[0] 75 | drop_num = int(node_num * drop_percent) # number of drop nodes 76 | all_node_list = [i for i in range(node_num)] 77 | 78 | drop_node_list = sorted(random.sample(all_node_list, drop_num)) 79 | 80 | aug_input_fea = delete_row_col(input_fea, drop_node_list, only_row=True) 81 | aug_input_adj = delete_row_col(input_adj, drop_node_list) 82 | 83 | aug_input_fea = aug_input_fea.unsqueeze(0) 84 | aug_input_adj = sp.csr_matrix(np.matrix(aug_input_adj)) 85 | 86 | return aug_input_fea, aug_input_adj 87 | 88 | def aug_subgraph_CL(graph, drop_percent=0.2): 89 | # input_adj = graph.adjacency_matrix().to_dense() 90 | # input_fea = input_fea.squeeze(0) 91 | edge_num = graph.batch_num_edges().tolist() 92 | # all_edge_list = [i for i in range(edge_num)] 93 | # s_node_num = int(edge_num * (1 - drop_percent)) 94 | # center_node_id = random.randint(0, node_num - 1) 95 | # sub_node_id_list = [center_node_id] 96 | # all_neighbor_list = [] 97 | # for i in range(s_node_num - 1): 98 | 99 | # all_neighbor_list += torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False).squeeze(1).tolist() 100 | # # print(torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False)) 101 | # all_neighbor_list = list(set(all_neighbor_list)) 102 | # new_neighbor_list = [n for n in all_neighbor_list if not n in sub_node_id_list] 103 | # if len(new_neighbor_list) != 0: 104 | # new_node = random.sample(new_neighbor_list, 1)[0] 105 | # sub_node_id_list.append(new_node) 106 | # else: 107 | # break 108 | 109 | 110 | # print("hhhhhhh") 111 | # print(drop_node_list) 112 | # a = graph_len.squeeze(1).tolist() 113 | sub_edge_id_list = [] 114 | tag = 0 115 | for i in range(len(edge_num)): 116 | s_edge_num = int(edge_num[i] * drop_percent) 117 | temp = random.sample(range(0,edge_num[i]),s_edge_num) 118 | 
sub_edge_id_list += [(x+tag) for x in temp]
119 |         tag+=edge_num[i]
120 |         # edge_num[i] = edge_num[i]-s_node_num
121 |
122 |     drop_edge_list = sub_edge_id_list
123 |
124 |     # a = torch.IntTensor(a).unsqueeze(1)
125 |
126 |
127 |
128 |     graph.remove_edges(drop_edge_list)
129 |
130 |     # return graph.subgraph(sub_node_id_list),a
131 |     return graph
132 |
133 |
134 | def aug_subgraph(graph,graph_len, drop_percent=0.2):
135 |
136 |     # input_adj = graph.adjacency_matrix().to_dense()
137 |     # input_fea = input_fea.squeeze(0)
138 |     node_num = graph.ndata['feature'].shape[0]
139 |     # all_node_list = [i for i in range(node_num)]
140 |     # s_node_num = int(node_num * (1 - drop_percent))
141 |     # center_node_id = random.randint(0, node_num - 1)
142 |     # sub_node_id_list = [center_node_id]
143 |     # all_neighbor_list = []
144 |     # for i in range(s_node_num - 1):
145 |
146 |     #     all_neighbor_list += torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False).squeeze(1).tolist()
147 |     #     # print(torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False))
148 |     #     all_neighbor_list = list(set(all_neighbor_list))
149 |     #     new_neighbor_list = [n for n in all_neighbor_list if not n in sub_node_id_list]
150 |     #     if len(new_neighbor_list) != 0:
151 |     #         new_node = random.sample(new_neighbor_list, 1)[0]
152 |     #         sub_node_id_list.append(new_node)
153 |     #     else:
154 |     #         break
155 |
156 |
157 |     # print("hhhhhhh")
158 |     # print(drop_node_list)
159 |     a = graph_len.squeeze(1).tolist()
160 |     sub_node_id_list = []
161 |     tag = 0
162 |     for i in range(len(a)):
163 |         s_node_num = int(a[i] * drop_percent)
164 |         temp = random.sample(range(0,a[i]),s_node_num)
165 |         sub_node_id_list += [(x+tag) for x in temp]
166 |         tag+=a[i]
167 |         a[i] = a[i]-s_node_num
168 |
169 |     drop_node_list = sub_node_id_list
170 |
171 |     a = torch.IntTensor(a).unsqueeze(1)
172 |
173 |
174 |
175 |     graph.remove_nodes(drop_node_list)
176 |
177 |     # return graph.subgraph(sub_node_id_list),a
178 |     return graph,a
179 |
180 |
181 |
182 | 183 | 184 | def delete_row_col(input_matrix, drop_list, only_row=False): 185 | 186 | remain_list = [i for i in range(input_matrix.shape[0]) if i not in drop_list] 187 | out = input_matrix[remain_list, :] 188 | if only_row: 189 | return out 190 | out = out[:, remain_list] 191 | 192 | return out 193 | 194 | 195 | 196 | 197 | 198 | 199 | 200 | 201 | 202 | 203 | 204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | 212 | if __name__ == "__main__": 213 | main() 214 | 215 | -------------------------------------------------------------------------------- /graphdownstream/dgi.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from layers import AvgReadout, Discriminator, Discriminator2 4 | import pdb 5 | from gin_local import GIN 6 | from utils import split_and_batchify_graph_feats 7 | class DGI(nn.Module): 8 | def __init__(self, n_in, n_h,config): 9 | super(DGI, self).__init__() 10 | self.read = AvgReadout() 11 | self.sigm = nn.Sigmoid() 12 | self.disc = Discriminator(n_h*3) 13 | # self.disc2 = Discriminator2(n_h) 14 | self.gin = GIN(config) 15 | self.config = config 16 | 17 | 18 | def forward(self, graph,graph_shuf, graph_1, graph_2, graph_len, graph_len_1, graph_len_2, sparse, msk, samp_bias1, samp_bias2, aug_type): 19 | 20 | c_0,h_0 = self.gin(graph, graph_len) 21 | 22 | if aug_type == 'edge': 23 | 24 | # h_1 = self.gcn(seq1, aug_adj1, sparse) 25 | # h_3 = self.gcn(seq1, aug_adj2, sparse) 26 | pass 27 | 28 | elif aug_type == 'mask': 29 | 30 | # h_1 = self.gcn(seq3, adj, sparse) 31 | # h_3 = self.gcn(seq4, adj, sparse) 32 | pass 33 | 34 | elif aug_type == 'node' or aug_type == 'subgraph': 35 | 36 | c_1,h_1 = self.gin(graph_1, graph_len_1) 37 | c_3,h_3 = self.gin(graph_2, graph_len_2) 38 | 39 | else: 40 | assert False 41 | 42 | 43 | c_2,h_2 = self.gin(graph_shuf,graph_len) 44 | h_0,h_2 = self.sigm(h_0),self.sigm(h_2) 45 | 46 | 47 | # len_1 = int(h_0.shape[1]) 48 | # len_2 = int(h_2.shape[1]) 
49 | # ret1 = self.disc(c_1, h_0, h_2, samp_bias1, samp_bias2) 50 | # ret2 = self.disc(c_3, h_0, h_2, samp_bias1, samp_bias2) 51 | device = self.config["gpu_id"] 52 | # ret = ret1 + ret2 53 | # return ret,int(h_0.shape[1]),int(h_2.shape[1]) 54 | graph.add_self_loop() 55 | graph_1.add_self_loop() 56 | graph_2.add_self_loop() 57 | adj = graph.adjacency_matrix() 58 | adj_1 = graph_1.adjacency_matrix() 59 | adj_2 = graph_2.adjacency_matrix() 60 | adj = adj.to(device) 61 | adj_1 = adj_1.to(device) 62 | adj_2 = adj_2.to(device) 63 | # for count in range(self.config["pretrain_hop_num"]): 64 | # h_0 = torch.matmul(adj, h_0) 65 | # # h_1 = torch.matmul(adj_1, h_1) 66 | # h_2 = torch.matmul(adj, h_2) 67 | # # h_3 = torch.matmul(adj_2, h_3) 68 | return h_0,c_1,h_2,c_3 69 | 70 | # Detach the return variables 71 | def embed(self, graph, graph_len): 72 | return self.gin(graph,graph_len) 73 | 74 | 75 | 76 | 77 | -------------------------------------------------------------------------------- /graphdownstream/dgi/dgi.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from layers import GCN, AvgReadout, Discriminator 4 | 5 | class DGI(nn.Module): 6 | def __init__(self, n_in, n_h, activation): 7 | super(DGI, self).__init__() 8 | self.gcn = GCN(n_in, n_h, activation) 9 | self.read = AvgReadout() 10 | 11 | self.sigm = nn.Sigmoid() 12 | 13 | self.disc = Discriminator(n_h) 14 | 15 | def forward(self, seq1, seq2, adj, sparse, msk, samp_bias1, samp_bias2): 16 | h_1 = self.gcn(seq1, adj, sparse) 17 | 18 | c = self.read(h_1, msk) 19 | c = self.sigm(c) 20 | 21 | h_2 = self.gcn(seq2, adj, sparse) 22 | 23 | ret = self.disc(c, h_1, h_2, samp_bias1, samp_bias2) 24 | 25 | return ret 26 | 27 | # Detach the return variables 28 | def embed(self, seq, adj, sparse, msk): 29 | h_1 = self.gcn(seq, adj, sparse) 30 | c = self.read(h_1, msk) 31 | 32 | return h_1.detach(), c.detach() 33 | 34 | 
--------------------------------------------------------------------------------
/graphdownstream/dgi/layers/__init__.py:
--------------------------------------------------------------------------------
1 | from .gcn import GCN
2 | from .readout import AvgReadout
3 | from .discriminator import Discriminator
4 |
--------------------------------------------------------------------------------
/graphdownstream/dgi/layers/discriminator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class Discriminator(nn.Module): 5 | def __init__(self, n_h): 6 | super(Discriminator, self).__init__() 7 | self.f_k = nn.Bilinear(n_h, n_h, 1) 8 | 9 | for m in self.modules(): 10 | self.weights_init(m) 11 | 12 | def weights_init(self, m): 13 | if isinstance(m, nn.Bilinear): 14 | torch.nn.init.xavier_uniform_(m.weight.data) 15 | if m.bias is not None: 16 | m.bias.data.fill_(0.0) 17 | 18 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 19 | c_x = torch.unsqueeze(c, 1) 20 | c_x = c_x.expand_as(h_pl) 21 | 22 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2) 23 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2) 24 | 25 | if s_bias1 is not None: 26 | sc_1 += s_bias1 27 | if s_bias2 is not None: 28 | sc_2 += s_bias2 29 | 30 | logits = torch.cat((sc_1, sc_2), 1) 31 | 32 | return logits 33 | 34 | -------------------------------------------------------------------------------- /graphdownstream/dgi/layers/gcn.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class GCN(nn.Module): 5 | def __init__(self, in_ft, out_ft, act, bias=True): 6 | super(GCN, self).__init__() 7 | self.fc = nn.Linear(in_ft, out_ft, bias=False) 8 | self.act = nn.PReLU() if act == 'prelu' else act 9 | 10 | if bias: 11 | self.bias = nn.Parameter(torch.FloatTensor(out_ft)) 12 | self.bias.data.fill_(0.0) 13 | else: 14 | self.register_parameter('bias', None) 15 | 16 | for m in self.modules(): 17 | self.weights_init(m) 18 | 19 | def weights_init(self, m): 20 | if isinstance(m, nn.Linear): 21 | torch.nn.init.xavier_uniform_(m.weight.data) 22 | if m.bias is not None: 23 | m.bias.data.fill_(0.0) 24 | 25 | # Shape of seq: (batch, nodes, features) 26 | def forward(self, seq, adj, sparse=False): 27 | seq_fts = self.fc(seq) 28 | if sparse: 29 | out = 
torch.unsqueeze(torch.spmm(adj, torch.squeeze(seq_fts, 0)), 0)
30 |         else:
31 |             out = torch.bmm(adj, seq_fts)
32 |         if self.bias is not None:
33 |             out += self.bias
34 |
35 |         return self.act(out)
36 |
37 |
--------------------------------------------------------------------------------
/graphdownstream/dgi/layers/gin.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.nn.functional as F
4 | import dgl
5 | import dgl.function as fn
6 | import copy
7 | from functools import partial
8 | from dgl.nn.pytorch.conv import RelGraphConv
9 | from basemodel import GraphAdjModel
10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj
11 |
12 |
13 | class GIN(torch.nn.Module):
14 |     def __init__(self, config):
15 |         super(GIN, self).__init__()
16 |
17 |         # create networks
18 |         # get_emb_dim returns fixed values: 128, 128 (128 comes from the config)
19 |         # g_net is an n-layer GCN network; g_dim = hidden_dim
20 |         self.act=torch.nn.ReLU()
21 |         self.g_net, self.bns, g_dim = self.create_net(
22 |             name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"],
23 |             num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"])
24 |         self.num_layers_num=config["gcn_graph_num_layers"]
25 |         self.dropout=torch.nn.Dropout(p=config["dropout"])
26 |
27 |         # create predict layers
28 |         # these two if statements add, on top of the embedding network, the dimensions of the pattern and graph inputs to predict
29 |
30 |     def create_net(self, name, input_dim, **kw):
31 |         num_layers = kw.get("num_layers", 1)
32 |         hidden_dim = kw.get("hidden_dim", 64)
33 |         num_rels = kw.get("num_rels", 1)
34 |         num_bases = kw.get("num_bases", 8)
35 |         regularizer = kw.get("regularizer", "basis")
36 |         dropout = kw.get("dropout", 0.5)
37 |
38 |
39 |         self.convs = torch.nn.ModuleList()
40 |         self.bns = torch.nn.ModuleList()
41 |
42 |         for i in range(num_layers):
43 |
44 |             if i:
45 |                 nn = torch.nn.Sequential(torch.nn.Linear(hidden_dim, hidden_dim), 
self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 46 | else: 47 | nn = torch.nn.Sequential(torch.nn.Linear(input_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 48 | conv = dgl.nn.pytorch.conv.GINConv(apply_func=nn,aggregator_type='sum') 49 | bn = torch.nn.BatchNorm1d(hidden_dim) 50 | 51 | self.convs.append(conv) 52 | self.bns.append(bn) 53 | 54 | return self.convs, self.bns, hidden_dim 55 | 56 | 57 | #def forward(self, pattern, pattern_len, graph, graph_len): 58 | def forward(self, graph, graph_len): 59 | graph_output = graph.ndata["feature"] 60 | xs = [] 61 | for i in range(self.num_layers_num): 62 | graph_output = F.relu(self.convs[i](graph,graph_output)) 63 | graph_output = self.bns[i](graph_output) 64 | graph_output = self.dropout(graph_output) 65 | xs.append(graph_output) 66 | xpool= [] 67 | for x in xs: 68 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 69 | graph_embedding = torch.sum(graph_embedding, dim=1) 70 | xpool.append(graph_embedding) 71 | x = torch.cat(xpool, -1) 72 | #x is graph level embedding; xs is node level embedding 73 | return x,torch.cat(xs, -1) 74 | -------------------------------------------------------------------------------- /graphdownstream/dgi/layers/readout.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | # Applies an average on seq, of shape (batch, nodes, features) 5 | # While taking into account the masking of msk 6 | class AvgReadout(nn.Module): 7 | def __init__(self): 8 | super(AvgReadout, self).__init__() 9 | 10 | def forward(self, seq, msk): 11 | if msk is None: 12 | return torch.mean(seq, 1) 13 | else: 14 | msk = torch.unsqueeze(msk, -1) 15 | return torch.sum(seq * msk, 1) / torch.sum(msk) 16 | 17 | -------------------------------------------------------------------------------- /graphdownstream/embedding.py: -------------------------------------------------------------------------------- 1 | 
import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from utils import extend_dimensions 5 | 6 | 7 | class NormalEmbedding(nn.Module): 8 | def __init__(self, input_dim, emb_dim): 9 | super(NormalEmbedding, self).__init__() 10 | self.input_dim = input_dim 11 | self.emb_dim = emb_dim 12 | self.emb_layer = nn.Linear(input_dim, emb_dim, bias=False) 13 | 14 | # init 15 | nn.init.normal_(self.emb_layer.weight, 0.0, 1.0) 16 | 17 | def increase_input_size(self, new_input_dim): 18 | assert new_input_dim >= self.input_dim 19 | if new_input_dim != self.input_dim: 20 | new_emb_layer = extend_dimensions(self.emb_layer, new_input_dim=new_input_dim, upper=False) 21 | del self.emb_layer 22 | self.emb_layer = new_emb_layer 23 | self.input_dim = new_input_dim 24 | 25 | def forward(self, x): 26 | emb = self.emb_layer(x) 27 | return emb 28 | 29 | class OrthogonalEmbedding(nn.Module): 30 | def __init__(self, input_dim, emb_dim): 31 | super(OrthogonalEmbedding, self).__init__() 32 | self.input_dim = input_dim 33 | self.emb_dim = emb_dim 34 | self.emb_layer = nn.Linear(input_dim, emb_dim, bias=False) 35 | 36 | # init 37 | nn.init.orthogonal_(self.emb_layer.weight) 38 | 39 | def increase_input_size(self, new_input_dim): 40 | assert new_input_dim >= self.input_dim 41 | if new_input_dim != self.input_dim: 42 | new_emb_layer = extend_dimensions(self.emb_layer, new_input_dim=new_input_dim, upper=False) 43 | del self.emb_layer 44 | self.emb_layer = new_emb_layer 45 | self.input_dim = new_input_dim 46 | 47 | def forward(self, x): 48 | emb = self.emb_layer(x) 49 | return emb 50 | 51 | class EquivariantEmbedding(nn.Module): 52 | def __init__(self, input_dim, emb_dim): 53 | super(EquivariantEmbedding, self).__init__() 54 | self.input_dim = input_dim 55 | self.emb_dim = emb_dim 56 | self.emb_layer = nn.Linear(input_dim, emb_dim, bias=False) 57 | 58 | # init 59 | nn.init.normal_(self.emb_layer.weight[:,0], 0.0, 1.0) 60 | emb_column = self.emb_layer.weight[:,0] 61 | with 
torch.no_grad(): 62 | for i in range(1, self.input_dim): 63 | self.emb_layer.weight[:,i].data.copy_(torch.roll(emb_column, i, 0)) 64 | 65 | def increase_input_size(self, new_input_dim): 66 | assert new_input_dim >= self.input_dim 67 | if new_input_dim != self.input_dim: 68 | new_emb_layer = extend_dimensions(self.emb_layer, new_input_dim=new_input_dim, upper=False) 69 | del self.emb_layer 70 | self.emb_layer = new_emb_layer 71 | self.input_dim = new_input_dim 72 | 73 | def forward(self, x): 74 | emb = self.emb_layer(x) 75 | return emb -------------------------------------------------------------------------------- /graphdownstream/epoch_loss_enzymes1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gmcmt/graph_prompt_extension/ab2506246994fbbcf661d16abea40519aa6949b6/graphdownstream/epoch_loss_enzymes1.png -------------------------------------------------------------------------------- /graphdownstream/filternet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | # class MaxGatedFilterNet(nn.Module): 6 | # def __init__(self, pattern_dim, graph_dim): 7 | # super(MaxGatedFilterNet, self).__init__() 8 | # self.g_layer = nn.Linear(graph_dim, pattern_dim) 9 | # self.f_layer = nn.Linear(pattern_dim, 1) 10 | 11 | # # init 12 | # scale = (1/pattern_dim)**0.5 13 | # nn.init.normal_(self.g_layer.weight, 0.0, scale) 14 | # nn.init.zeros_(self.g_layer.bias) 15 | # nn.init.normal_(self.f_layer.weight, 0.0, scale) 16 | # nn.init.ones_(self.f_layer.bias) 17 | 18 | # def forward(self, p_x, g_x): 19 | # max_x = torch.max(p_x, dim=1, keepdim=True)[0].float() 20 | # g_x = self.g_layer(g_x.float()) 21 | # f = self.f_layer(g_x * max_x) 22 | # return F.sigmoid(f) 23 | 24 | class MaxGatedFilterNet(nn.Module): 25 | def __init__(self): 26 | super(MaxGatedFilterNet, self).__init__() 27 | 28 | def 
forward(self, p_x, g_x): 29 | max_x = torch.max(p_x, dim=1, keepdim=True)[0] 30 | if max_x.dim() == 2: 31 | return g_x <= max_x 32 | else: 33 | return (g_x <= max_x).all(keepdim=True, dim=2) 34 | 35 | 36 | -------------------------------------------------------------------------------- /graphdownstream/gat.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class GAT(torch.nn.Module): 14 | def __init__(self, config): 15 | super(GAT, self).__init__() 16 | 17 | # create networks 18 | # get_emb_dim returns fixed values: 128, 128 (128 is the config value) 19 | # g_net is an n-layer GCN network, g_dim=hidden_dim 20 | self.act=torch.nn.ReLU() 21 | self.g_net, self.g_dim = self.create_net( 22 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 23 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 24 | self.num_layers_num=config["gcn_graph_num_layers"] 25 | 26 | # create predict layers 27 | # these two if statements add, on top of the embedding network, the dimensions of the pattern and graph inputs to predict 28 | 29 | def create_net(self, name, input_dim, **kw): 30 | num_layers = kw.get("num_layers", 1) 31 | hidden_dim = kw.get("hidden_dim", 64) 32 | num_rels = kw.get("num_rels", 1) 33 | num_bases = kw.get("num_bases", 8) 34 | regularizer = kw.get("regularizer", "basis") 35 | dropout = kw.get("dropout", 0.5) 36 | 37 | 38 | self.convs = torch.nn.ModuleList() 39 | 40 | gat1=dgl.nn.pytorch.conv.GATConv(in_feats=input_dim, out_feats=hidden_dim,num_heads=4,allow_zero_in_degree=True) 41 | gat2=dgl.nn.pytorch.conv.GATConv(in_feats=4*hidden_dim,
out_feats=hidden_dim,num_heads=1,allow_zero_in_degree=True) 42 | 43 | self.convs.append(gat1) 44 | self.convs.append(gat2) 45 | 46 | return self.convs, hidden_dim 47 | 48 | 49 | #def forward(self, pattern, pattern_len, graph, graph_len): 50 | def forward(self, graph, graph_len): 51 | #bsz = pattern_len.size(0) 52 | # filter_gate selects a mask of the nodes in graph that are irrelevant to the isomorphism 53 | #gate = self.get_filter_gate(pattern, pattern_len, graph, graph_len) 54 | graph_output = graph.ndata["feature"] 55 | #xs = [] 56 | graph_output=F.relu(self.convs[0](graph,graph_output)) 57 | graph_output=graph_output.reshape(graph_output.size(0),graph_output.size(1)*graph_output.size(2)) 58 | graph_output=F.relu(self.convs[1](graph,graph_output)).squeeze(1) 59 | # for i in range(self.num_layers_num): 60 | # graph_output = F.relu(self.convs[i](graph,graph_output)) 61 | # xs.append(graph_output) 62 | #xpool= [] 63 | graph_embedding = split_and_batchify_graph_feats(graph_output, graph_len)[0] 64 | graph_embedding = torch.sum(graph_embedding, dim=1) 65 | return graph_embedding,graph_output 66 | # for x in xs: 67 | # graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 68 | # graph_embedding = torch.sum(graph_embedding, dim=1) 69 | # xpool.append(graph_embedding) 70 | # x = torch.cat(xpool, -1) 71 | # return x,torch.cat(xs, -1) 72 | -------------------------------------------------------------------------------- /graphdownstream/gcl.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from layers import AvgReadout, Discriminator, Discriminator2 4 | import pdb 5 | from gin_local import GIN 6 | from utils import split_and_batchify_graph_feats 7 | class GCL(nn.Module): 8 | def __init__(self, n_in, n_h,config): 9 | super(GCL, self).__init__() 10 | self.read = AvgReadout() 11 | self.sigm = nn.Sigmoid() 12 | self.disc = Discriminator(n_h*3) 13 | # self.disc2 = Discriminator2(n_h) 14 | self.gin = GIN(config) 15 | self.config
= config 16 | 17 | 18 | def forward(self, graph,graph_shuf, graph_1, graph_2, graph_len, graph_len_1, graph_len_2, sparse, msk, samp_bias1, samp_bias2, aug_type): 19 | 20 | c_0,h_0 = self.gin(graph, graph_len) 21 | 22 | if aug_type == 'edge': 23 | 24 | # h_1 = self.gcn(seq1, aug_adj1, sparse) 25 | # h_3 = self.gcn(seq1, aug_adj2, sparse) 26 | pass 27 | 28 | elif aug_type == 'mask': 29 | 30 | # h_1 = self.gcn(seq3, adj, sparse) 31 | # h_3 = self.gcn(seq4, adj, sparse) 32 | pass 33 | 34 | elif aug_type == 'node' or aug_type == 'subgraph': 35 | 36 | c_1,h_1 = self.gin(graph_1, graph_len_1) 37 | c_3,h_3 = self.gin(graph_2, graph_len_2) 38 | 39 | else: 40 | assert False 41 | 42 | 43 | c_2,h_2 = self.gin(graph_shuf,graph_len) 44 | h_0,h_1,h_2,h_3 = self.sigm(h_0),self.sigm(h_1),self.sigm(h_2),self.sigm(h_3) 45 | 46 | 47 | # len_1 = int(h_0.shape[1]) 48 | # len_2 = int(h_2.shape[1]) 49 | # ret1 = self.disc(c_1, h_0, h_2, samp_bias1, samp_bias2) 50 | # ret2 = self.disc(c_3, h_0, h_2, samp_bias1, samp_bias2) 51 | device = self.config["gpu_id"] 52 | # ret = ret1 + ret2 53 | # return ret,int(h_0.shape[1]),int(h_2.shape[1]) 54 | graph.add_self_loop() 55 | graph_1.add_self_loop() 56 | graph_2.add_self_loop() 57 | adj = graph.adjacency_matrix() 58 | adj_1 = graph_1.adjacency_matrix() 59 | adj_2 = graph_2.adjacency_matrix() 60 | adj = adj.to(device) 61 | adj_1 = adj_1.to(device) 62 | adj_2 = adj_2.to(device) 63 | # for count in range(self.config["pretrain_hop_num"]): 64 | # h_0 = torch.matmul(adj, h_0) 65 | # h_1 = torch.matmul(adj_1, h_1) 66 | # h_2 = torch.matmul(adj, h_2) 67 | # h_3 = torch.matmul(adj_2, h_3) 68 | return h_0,h_1,h_2,h_3 69 | 70 | # Detach the return variables 71 | def embed(self, graph, graph_len): 72 | return self.gin(graph,graph_len) 73 | 74 | 75 | 76 | 77 | -------------------------------------------------------------------------------- /graphdownstream/gcn.py: -------------------------------------------------------------------------------- 1 | import torch 
2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class GCN(torch.nn.Module): 14 | def __init__(self, config): 15 | super(GCN, self).__init__() 16 | 17 | # create networks 18 | # get_emb_dim returns fixed values: 128, 128 (128 is the config value) 19 | # g_net is an n-layer GCN network, g_dim=hidden_dim 20 | self.act=torch.nn.ReLU() 21 | self.g_net, g_dim = self.create_net( 22 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 23 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 24 | self.num_layers_num=config["gcn_graph_num_layers"] 25 | 26 | # create predict layers 27 | # these two if statements add, on top of the embedding network, the dimensions of the pattern and graph inputs to predict 28 | 29 | def create_net(self, name, input_dim, **kw): 30 | num_layers = kw.get("num_layers", 1) 31 | hidden_dim = kw.get("hidden_dim", 64) 32 | num_rels = kw.get("num_rels", 1) 33 | num_bases = kw.get("num_bases", 8) 34 | regularizer = kw.get("regularizer", "basis") 35 | dropout = kw.get("dropout", 0.5) 36 | 37 | 38 | self.convs = torch.nn.ModuleList() 39 | 40 | for i in range(num_layers): 41 | 42 | if i: 43 | conv = dgl.nn.pytorch.conv.GraphConv(in_feats=hidden_dim, out_feats=hidden_dim,allow_zero_in_degree=True) 44 | else: 45 | conv = dgl.nn.pytorch.conv.GraphConv(in_feats=input_dim, out_feats=hidden_dim,allow_zero_in_degree=True) 46 | 47 | self.convs.append(conv) 48 | 49 | return self.convs, hidden_dim 50 | 51 | 52 | #def forward(self, pattern, pattern_len, graph, graph_len): 53 | def forward(self, graph, graph_len): 54 | #bsz = pattern_len.size(0) 55 | # filter_gate selects a mask of the nodes in graph that are irrelevant to the isomorphism 56 | #gate = self.get_filter_gate(pattern, pattern_len, graph, graph_len) 57 | graph_output =
graph.ndata["feature"] 58 | xs = [] 59 | for i in range(self.num_layers_num): 60 | graph_output = F.relu(self.convs[i](graph,graph_output)) 61 | xs.append(graph_output) 62 | xpool= [] 63 | for x in xs: 64 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 65 | graph_embedding = torch.sum(graph_embedding, dim=1) 66 | xpool.append(graph_embedding) 67 | x = torch.cat(xpool, -1) 68 | return x,torch.cat(xs, -1) 69 | -------------------------------------------------------------------------------- /graphdownstream/gin_downstream.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | from graph_prompt_layer import graph_prompt_layer_mean,graph_prompt_layer_linear_mean,graph_prompt_layer_linear_sum,\ 12 | graph_prompt_layer_sum,graph_prompt_layer_feature_weighted_mean,graph_prompt_layer_feature_weighted_sum,node_prompt_layer_feature_weighted_sum 13 | 14 | class GIN_P(torch.nn.Module): 15 | def __init__(self, config): 16 | super(GIN_P, self).__init__() 17 | 18 | # create networks 19 | # get_emb_dim returns fixed values: 128, 128 (128 is the config value) 20 | # g_net is an n-layer GCN network, g_dim=hidden_dim 21 | self.act=torch.nn.ReLU() 22 | self.convs, self.bns, g_dim ,self.prompts = self.create_net( 23 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 24 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"],node_feature_dim = config["node_feature_dim"],) 25 | self.num_layers_num=config["gcn_graph_num_layers"] 26 | self.dropout=torch.nn.Dropout(p=config["dropout"]) 27 | 28 | # create predict layers 29 | #
these two if statements add, on top of the embedding network, the dimensions of the pattern and graph inputs to predict 30 | 31 | def create_net(self, name, input_dim, **kw): 32 | num_layers = kw.get("num_layers", 1) 33 | hidden_dim = kw.get("hidden_dim", 64) 34 | num_rels = kw.get("num_rels", 1) 35 | num_bases = kw.get("num_bases", 8) 36 | regularizer = kw.get("regularizer", "basis") 37 | dropout = kw.get("dropout", 0.5) 38 | feature_dim = kw.get("node_feature_dim",64) 39 | 40 | convs = torch.nn.ModuleList() 41 | bns = torch.nn.ModuleList() 42 | prompts = torch.nn.ModuleList() 43 | a = int((hidden_dim * num_layers)/3) 44 | prompt_1 = node_prompt_layer_feature_weighted_sum(feature_dim) 45 | prompt_2 = node_prompt_layer_feature_weighted_sum(a) 46 | prompt_3 = node_prompt_layer_feature_weighted_sum(a) 47 | prompt_4 = graph_prompt_layer_feature_weighted_sum(hidden_dim * num_layers) 48 | prompts.append(prompt_1) 49 | prompts.append(prompt_2) 50 | prompts.append(prompt_3) 51 | prompts.append(prompt_4) 52 | 53 | for i in range(num_layers): 54 | 55 | if i: 56 | nn = torch.nn.Sequential(torch.nn.Linear(hidden_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 57 | else: 58 | nn = torch.nn.Sequential(torch.nn.Linear(input_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 59 | conv = dgl.nn.pytorch.conv.GINConv(apply_func=nn,aggregator_type='sum') 60 | bn = torch.nn.BatchNorm1d(hidden_dim) 61 | 62 | convs.append(conv) 63 | bns.append(bn) 64 | 65 | return convs, bns, hidden_dim,prompts 66 | 67 | 68 | #def forward(self, pattern, pattern_len, graph, graph_len): 69 | def forward(self, graph, graph_len,prompt_id,scalar): 70 | graph_output = graph.ndata["feature"] 71 | xs = [] 72 | if prompt_id == 0: 73 | graph_output = self.prompts[0](graph_output,graph_len) 74 | for i in range(self.num_layers_num): 75 | graph_output = F.relu(self.convs[i](graph,graph_output)) 76 | graph_output = self.bns[i](graph_output) 77 | graph_output = self.dropout(graph_output) 78 | xs.append(graph_output) 79 | 80 | xpool= [] 81 | for
x in xs: 82 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 83 | graph_embedding = torch.sum(graph_embedding, dim=1) 84 | xpool.append(graph_embedding) 85 | x = torch.cat(xpool, -1) 86 | #x is graph level embedding; xs is node level embedding 87 | embedding = torch.cat(xs, -1) 88 | embedding =split_and_batchify_graph_feats(embedding, graph_len)[0] 89 | embedding =embedding.mean(dim=1) 90 | return x,embedding 91 | elif prompt_id ==1: 92 | for i in range(self.num_layers_num): 93 | graph_output = F.relu(self.convs[i](graph,graph_output)) 94 | graph_output = self.bns[i](graph_output) 95 | graph_output = self.dropout(graph_output) 96 | xs.append(graph_output) 97 | if i ==0: 98 | graph_output = self.prompts[1](graph_output,graph.number_of_nodes()) 99 | 100 | 101 | xpool= [] 102 | for x in xs: 103 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 104 | graph_embedding = torch.sum(graph_embedding, dim=1) 105 | xpool.append(graph_embedding) 106 | x = torch.cat(xpool, -1) 107 | #x is graph level embedding; xs is node level embedding 108 | embedding = torch.cat(xs, -1) 109 | embedding =split_and_batchify_graph_feats(embedding, graph_len)[0] 110 | embedding =embedding.mean(dim=1) 111 | return x,embedding 112 | elif prompt_id ==2: 113 | for i in range(self.num_layers_num): 114 | graph_output = F.relu(self.convs[i](graph,graph_output)) 115 | graph_output = self.bns[i](graph_output) 116 | graph_output = self.dropout(graph_output) 117 | xs.append(graph_output) 118 | if i ==1: 119 | graph_output = self.prompts[2](graph_output,graph.number_of_nodes()) 120 | 121 | 122 | xpool= [] 123 | for x in xs: 124 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 125 | graph_embedding = torch.sum(graph_embedding, dim=1) 126 | xpool.append(graph_embedding) 127 | x = torch.cat(xpool, -1) 128 | embedding = torch.cat(xs, -1) 129 | embedding =split_and_batchify_graph_feats(embedding, graph_len)[0] 130 | embedding =embedding.mean(dim=1) 131 | #x 
is graph level embedding; xs is node level embedding 132 | return x,embedding 133 | elif prompt_id == 3: 134 | for i in range(self.num_layers_num): 135 | graph_output = F.relu(self.convs[i](graph,graph_output)) 136 | graph_output = self.bns[i](graph_output) 137 | graph_output = self.dropout(graph_output) 138 | xs.append(graph_output) 139 | 140 | xpool= [] 141 | for x in xs: 142 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 143 | graph_embedding = torch.sum(graph_embedding, dim=1) 144 | xpool.append(graph_embedding) 145 | x = torch.cat(xpool, -1) 146 | embedding = torch.cat(xs, -1) 147 | embedding = self.prompts[3](embedding, graph_len)*scalar 148 | 149 | #x is graph level embedding; xs is node level embedding 150 | return x,embedding 151 | else: 152 | for i in range(self.num_layers_num): 153 | graph_output = F.relu(self.convs[i](graph,graph_output)) 154 | graph_output = self.bns[i](graph_output) 155 | graph_output = self.dropout(graph_output) 156 | xs.append(graph_output) 157 | 158 | 159 | 160 | xpool= [] 161 | for x in xs: 162 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 163 | graph_embedding = torch.sum(graph_embedding, dim=1) 164 | xpool.append(graph_embedding) 165 | x = torch.cat(xpool, -1) 166 | embedding = torch.cat(xs, -1) 167 | #x is graph level embedding; xs is node level embedding 168 | return x,embedding 169 | 170 | 171 | -------------------------------------------------------------------------------- /graphdownstream/gin_local.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class GIN(torch.nn.Module): 14 | def 
__init__(self, config): 15 | super(GIN, self).__init__() 16 | self.sigm = nn.Sigmoid() 17 | # create networks 18 | # get_emb_dim returns fixed values: 128, 128 (128 is the config value) 19 | # g_net is an n-layer GCN network, g_dim=hidden_dim 20 | self.act=torch.nn.ReLU() 21 | self.g_net, self.bns, g_dim = self.create_net( 22 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 23 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 24 | self.num_layers_num=config["gcn_graph_num_layers"] 25 | self.dropout=torch.nn.Dropout(p=config["dropout"]) 26 | 27 | # create predict layers 28 | # these two if statements add, on top of the embedding network, the dimensions of the pattern and graph inputs to predict 29 | 30 | def create_net(self, name, input_dim, **kw): 31 | num_layers = kw.get("num_layers", 1) 32 | hidden_dim = kw.get("hidden_dim", 64) 33 | num_rels = kw.get("num_rels", 1) 34 | num_bases = kw.get("num_bases", 8) 35 | regularizer = kw.get("regularizer", "basis") 36 | dropout = kw.get("dropout", 0.5) 37 | 38 | 39 | self.convs = torch.nn.ModuleList() 40 | self.bns = torch.nn.ModuleList() 41 | 42 | for i in range(num_layers): 43 | 44 | if i: 45 | nn = torch.nn.Sequential(torch.nn.Linear(hidden_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 46 | else: 47 | nn = torch.nn.Sequential(torch.nn.Linear(input_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 48 | conv = dgl.nn.pytorch.conv.GINConv(apply_func=nn,aggregator_type='sum') 49 | bn = torch.nn.BatchNorm1d(hidden_dim) 50 | 51 | self.convs.append(conv) 52 | self.bns.append(bn) 53 | 54 | return self.convs, self.bns, hidden_dim 55 | 56 | 57 | #def forward(self, pattern, pattern_len, graph, graph_len): 58 | def forward(self, graph, graph_len): 59 | graph_output = graph.ndata["feature"] 60 | xs = [] 61 | for i in range(self.num_layers_num): 62 | graph_output = F.relu(self.convs[i](graph,graph_output)) 63 | graph_output = self.bns[i](graph_output) 64 | graph_output = self.dropout(graph_output)
65 | xs.append(graph_output) 66 | xpool= [] 67 | for x in xs: 68 | x = self.sigm(x) 69 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 70 | graph_embedding = torch.sum(graph_embedding, dim=1) 71 | xpool.append(graph_embedding) 72 | x = torch.cat(xpool, -1) 73 | #x is graph level embedding; xs is node level embedding 74 | return x,torch.cat(xs, -1) -------------------------------------------------------------------------------- /graphdownstream/graph_finetuning_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | import math 11 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 12 | 13 | class graph_finetuning_layer(nn.Module): 14 | def __init__(self, input_dim,output_dim): 15 | super(graph_finetuning_layer, self).__init__() 16 | self.linear=torch.nn.Linear(input_dim,output_dim) 17 | #self.dropout=torch.nn.Dropout(0.2) 18 | 19 | 20 | def forward(self,graph_embedding, graph_len): 21 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 22 | graph_embedding=torch.sum(graph_embedding,dim=1) 23 | #not the follows problem 24 | graph_embedding=self.linear(graph_embedding) 25 | #graph_embedding=torch.nn.functional.normalize(graph_embedding,dim=1) 26 | #graph_embedding=F.leaky_relu(graph_embedding,0.2) 27 | result = F.log_softmax(graph_embedding, dim=1) 28 | return result 29 | 30 | 31 | '''class graph_finetuning_layer(nn.Module): 32 | def __init__(self, input_dim,output_dim): 33 | super(graph_finetuning_layer, self).__init__() 34 | self.linear=torch.nn.Linear(input_dim,output_dim) 35 | self.softmax=torch.nn.Softmax(dim=1) 36 | 37 | def forward(self,graph_embedding, graph_len): 38 | 
graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 39 | graph_embedding=torch.sum(graph_embedding,dim=1) 40 | graph_embedding=self.linear(graph_embedding) 41 | graph_embedding=F.leaky_relu(graph_embedding) 42 | graph_embedding=F.log_softmax(graph_embedding,dim=1) 43 | #graph_embedding=F.softmax(graph_embedding,dim=1) 44 | #graph_embedding=torch.argmax(graph_embedding,dim=1,keepdim=True).float() 45 | #result=self.softmax(F.leaky_relu(graph_embedding)) 46 | #index=result.permute(1,0)[0] 47 | #index=index.unsqueeze(dim=1) 48 | index=torch.argmax(graph_embedding,dim=1,keepdim=True).float() 49 | index.requires_grad_(True) 50 | print(index.requires_grad) 51 | #return result 52 | #return graph_embedding 53 | return index''' -------------------------------------------------------------------------------- /graphdownstream/graph_prompt_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | #use prompt to finish step 1 13 | class graph_prompt_layer_mean(nn.Module): 14 | def __init__(self): 15 | super(graph_prompt_layer_mean, self).__init__() 16 | #first assign a weight that does not need to be changed 17 | self.weight= torch.nn.Parameter(torch.Tensor(2, 2)) 18 | def forward(self, graph_embedding, graph_len): 19 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 20 | #prompt: mean 21 | graph_prompt_result=graph_embedding.mean(dim=1) 22 | return graph_prompt_result 23 | 24 | class graph_prompt_layer_linear_mean(nn.Module): 25 | def __init__(self,input_dim,output_dim): 26 | super(graph_prompt_layer_linear_mean, self).__init__() 27 | #first assign a weight that does not need to be changed 28 |
self.linear=torch.nn.Linear(input_dim,output_dim) 29 | 30 | def forward(self, graph_embedding, graph_len): 31 | graph_embedding=self.linear(graph_embedding) 32 | 33 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 34 | graph_prompt_result=graph_embedding.mean(dim=1) 35 | graph_prompt_result=torch.nn.functional.normalize(graph_prompt_result,dim=1) 36 | return graph_prompt_result 37 | 38 | class graph_prompt_layer_linear_sum(nn.Module): 39 | def __init__(self,input_dim,output_dim): 40 | super(graph_prompt_layer_linear_sum, self).__init__() 41 | #first assign a weight that does not need to be changed 42 | self.linear=torch.nn.Linear(input_dim,output_dim) 43 | 44 | def forward(self, graph_embedding, graph_len): 45 | graph_embedding=self.linear(graph_embedding) 46 | 47 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 48 | graph_prompt_result=graph_embedding.sum(dim=1) 49 | graph_prompt_result=torch.nn.functional.normalize(graph_prompt_result,dim=1) 50 | return graph_prompt_result 51 | 52 | 53 | 54 | #sum result is same as mean result 55 | class graph_prompt_layer_sum(nn.Module): 56 | def __init__(self): 57 | super(graph_prompt_layer_sum, self).__init__() 58 | #first assign a weight that does not need to be changed 59 | self.weight= torch.nn.Parameter(torch.Tensor(2, 2)) 60 | def forward(self, graph_embedding, graph_len): 61 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 62 | #prompt: sum 63 | graph_prompt_result=graph_embedding.sum(dim=1) 64 | return graph_prompt_result 65 | 66 | #7.26 erroneous version: the algorithm is wrong and cannot be justified, but its results may still be useful 67 | '''class graph_prompt_layer_weighted(nn.Module): 68 | def __init__(self,max_n_num): 69 | super(graph_prompt_layer_weighted, self).__init__() 70 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 71 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 72 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 73 | self.max_n_num=max_n_num 74 | self.reset_parameters() 75 | def
reset_parameters(self): 76 | torch.nn.init.xavier_uniform_(self.weight) 77 | def forward(self, graph_embedding, graph_len): 78 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 79 | weight = self.weight[0][0:graph_embedding.size(1)] 80 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 81 | temp1 = weight * temp1 82 | graph_embedding=torch.matmul(graph_embedding,temp1) 83 | #prompt: mean 84 | graph_prompt_result=graph_embedding.mean(dim=1) 85 | return graph_prompt_result''' 86 | 87 | 88 | class graph_prompt_layer_weighted(nn.Module): 89 | def __init__(self,max_n_num): 90 | super(graph_prompt_layer_weighted, self).__init__() 91 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 92 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 93 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 94 | self.max_n_num=max_n_num 95 | self.reset_parameters() 96 | def reset_parameters(self): 97 | torch.nn.init.xavier_uniform_(self.weight) 98 | def forward(self, graph_embedding, graph_len): 99 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 100 | weight = self.weight[0][0:graph_embedding.size(1)] 101 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 102 | temp1 = weight * temp1 103 | temp1 = temp1.permute(0, 2, 1) 104 | graph_embedding=graph_embedding*temp1 105 | #prompt: sum 106 | graph_prompt_result=graph_embedding.sum(dim=1) 107 | return graph_prompt_result 108 | 109 | class graph_prompt_layer_feature_weighted_mean(nn.Module): 110 | def __init__(self,input_dim): 111 | super(graph_prompt_layer_feature_weighted_mean, self).__init__() 112 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 113 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 114 | self.weight=
torch.nn.Parameter(torch.Tensor(1,input_dim)) 115 | self.max_n_num=input_dim 116 | self.reset_parameters() 117 | def reset_parameters(self): 118 | torch.nn.init.xavier_uniform_(self.weight) 119 | def forward(self, graph_embedding, graph_len): 120 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 121 | graph_embedding=graph_embedding*self.weight 122 | #prompt: mean 123 | graph_prompt_result=graph_embedding.mean(dim=1) 124 | return graph_prompt_result 125 | 126 | class graph_prompt_layer_feature_weighted_sum(nn.Module): 127 | def __init__(self,input_dim): 128 | super(graph_prompt_layer_feature_weighted_sum, self).__init__() 129 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 130 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 131 | self.weight= torch.nn.Parameter(torch.Tensor(1,input_dim)) 132 | self.max_n_num=input_dim 133 | self.reset_parameters() 134 | def reset_parameters(self): 135 | torch.nn.init.xavier_uniform_(self.weight) 136 | def forward(self, graph_embedding, graph_len): 137 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 138 | graph_embedding=graph_embedding*self.weight 139 | #prompt: sum 140 | graph_prompt_result=graph_embedding.sum(dim=1) 141 | return graph_prompt_result 142 | 143 | class graph_prompt_layer_weighted_matrix(nn.Module): 144 | def __init__(self,max_n_num,input_dim): 145 | super(graph_prompt_layer_weighted_matrix, self).__init__() 146 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 147 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 148 | self.weight= torch.nn.Parameter(torch.Tensor(input_dim,max_n_num)) 149 | self.max_n_num=max_n_num 150 | self.reset_parameters() 151 | def reset_parameters(self): 152 | torch.nn.init.xavier_uniform_(self.weight) 153 | def forward(self, graph_embedding, graph_len): 154 |
graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 155 | weight = self.weight.permute(1, 0)[0:graph_embedding.size(1)] 156 | weight = weight.expand(graph_embedding.size(0), weight.size(0), weight.size(1)) 157 | graph_embedding = graph_embedding * weight 158 | #prompt: sum 159 | graph_prompt_result=graph_embedding.sum(dim=1) 160 | return graph_prompt_result 161 | 162 | class graph_prompt_layer_weighted_linear(nn.Module): 163 | def __init__(self,max_n_num,input_dim,output_dim): 164 | super(graph_prompt_layer_weighted_linear, self).__init__() 165 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 166 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 167 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 168 | self.linear=nn.Linear(input_dim,output_dim) 169 | self.max_n_num=max_n_num 170 | self.reset_parameters() 171 | def reset_parameters(self): 172 | torch.nn.init.xavier_uniform_(self.weight) 173 | def forward(self, graph_embedding, graph_len): 174 | graph_embedding=self.linear(graph_embedding) 175 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 176 | weight = self.weight[0][0:graph_embedding.size(1)] 177 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 178 | temp1 = weight * temp1 179 | temp1 = temp1.permute(0, 2, 1) 180 | graph_embedding=graph_embedding*temp1 181 | #prompt: mean 182 | graph_prompt_result = graph_embedding.mean(dim=1) 183 | return graph_prompt_result 184 | 185 | class graph_prompt_layer_weighted_matrix_linear(nn.Module): 186 | def __init__(self,max_n_num,input_dim,output_dim): 187 | super(graph_prompt_layer_weighted_matrix_linear, self).__init__() 188 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 189 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 190 | self.weight=
torch.nn.Parameter(torch.Tensor(output_dim,max_n_num)) 191 | self.linear=nn.Linear(input_dim,output_dim) 192 | self.max_n_num=max_n_num 193 | self.reset_parameters() 194 | def reset_parameters(self): 195 | torch.nn.init.xavier_uniform_(self.weight) 196 | def forward(self, graph_embedding, graph_len): 197 | graph_embedding=self.linear(graph_embedding) 198 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 199 | weight = self.weight.permute(1, 0)[0:graph_embedding.size(1)] 200 | weight = weight.expand(graph_embedding.size(0), weight.size(0), weight.size(1)) 201 | graph_embedding = graph_embedding * weight 202 | #prompt: mean 203 | graph_prompt_result=graph_embedding.mean(dim=1) 204 | return graph_prompt_result 205 | class node_prompt_layer_feature_weighted_sum(nn.Module): 206 | def __init__(self,input_dim): 207 | super(node_prompt_layer_feature_weighted_sum, self).__init__() 208 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 209 | #max_n_num+1 so that the alignment in forward takes effect when the graph embedding and the weight are aligned 210 | self.weight= torch.nn.Parameter(torch.Tensor(1,input_dim)) 211 | self.max_n_num=input_dim 212 | self.reset_parameters() 213 | def reset_parameters(self): 214 | torch.nn.init.xavier_uniform_(self.weight) 215 | def forward(self, graph_embedding, graph_len): 216 | graph_embedding=graph_embedding*self.weight 217 | #prompt: element-wise feature weighting only, no pooling 218 | return graph_embedding -------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/aug.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import copy 3 | import random 4 | import pdb 5 | import scipy.sparse as sp 6 | import numpy as np 7 | 8 | def main(): 9 | pass 10 | 11 | 12 | def aug_random_mask(input_feature, drop_percent=0.2): 13 | 14 | node_num = input_feature.shape[1] 15 | mask_num = int(node_num * drop_percent) 16 | node_idx = [i for i in range(node_num)] 17
| mask_idx = random.sample(node_idx, mask_num) 18 | aug_feature = copy.deepcopy(input_feature) 19 | zeros = torch.zeros_like(aug_feature[0][0]) 20 | for j in mask_idx: 21 | aug_feature[0][j] = zeros 22 | return aug_feature 23 | 24 | 25 | def aug_random_edge(input_adj, drop_percent=0.2): 26 | 27 | percent = drop_percent / 2 28 | row_idx, col_idx = input_adj.nonzero() 29 | 30 | index_list = [] 31 | for i in range(len(row_idx)): 32 | index_list.append((row_idx[i], col_idx[i])) 33 | 34 | single_index_list = [] 35 | for i in list(index_list): 36 | single_index_list.append(i) 37 | index_list.remove((i[1], i[0])) 38 | 39 | 40 | edge_num = int(len(row_idx) / 2) # 9228 / 2 41 | add_drop_num = int(edge_num * percent / 2) 42 | aug_adj = copy.deepcopy(input_adj.todense().tolist()) 43 | 44 | edge_idx = [i for i in range(edge_num)] 45 | drop_idx = random.sample(edge_idx, add_drop_num) 46 | 47 | 48 | for i in drop_idx: 49 | aug_adj[single_index_list[i][0]][single_index_list[i][1]] = 0 50 | aug_adj[single_index_list[i][1]][single_index_list[i][0]] = 0 51 | 52 | ''' 53 | above finish drop edges 54 | ''' 55 | node_num = input_adj.shape[0] 56 | l = [(i, j) for i in range(node_num) for j in range(i)] 57 | add_list = random.sample(l, add_drop_num) 58 | 59 | for i in add_list: 60 | 61 | aug_adj[i[0]][i[1]] = 1 62 | aug_adj[i[1]][i[0]] = 1 63 | 64 | aug_adj = np.matrix(aug_adj) 65 | aug_adj = sp.csr_matrix(aug_adj) 66 | return aug_adj 67 | 68 | 69 | def aug_drop_node(input_fea, input_adj, drop_percent=0.2): 70 | 71 | input_adj = torch.tensor(input_adj.todense().tolist()) 72 | input_fea = input_fea.squeeze(0) 73 | 74 | node_num = input_fea.shape[0] 75 | drop_num = int(node_num * drop_percent) # number of drop nodes 76 | all_node_list = [i for i in range(node_num)] 77 | 78 | drop_node_list = sorted(random.sample(all_node_list, drop_num)) 79 | 80 | aug_input_fea = delete_row_col(input_fea, drop_node_list, only_row=True) 81 | aug_input_adj = delete_row_col(input_adj, drop_node_list) 82 | 83 
| aug_input_fea = aug_input_fea.unsqueeze(0) 84 | aug_input_adj = sp.csr_matrix(np.matrix(aug_input_adj)) 85 | 86 | return aug_input_fea, aug_input_adj 87 | 88 | 89 | def aug_subgraph(input_fea, input_adj, drop_percent=0.2): 90 | 91 | input_adj = torch.tensor(input_adj.todense().tolist()) # tensor (not a nested list), so torch.nonzero and delete_row_col work 92 | input_fea = input_fea.squeeze(0) 93 | node_num = input_fea.shape[0] 94 | 95 | all_node_list = [i for i in range(node_num)] 96 | s_node_num = int(node_num * (1 - drop_percent)) 97 | center_node_id = random.randint(0, node_num - 1) 98 | sub_node_id_list = [center_node_id] 99 | all_neighbor_list = [] 100 | 101 | for i in range(s_node_num - 1): 102 | 103 | all_neighbor_list += torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False).squeeze(1).tolist() 104 | 105 | all_neighbor_list = list(set(all_neighbor_list)) 106 | new_neighbor_list = [n for n in all_neighbor_list if not n in sub_node_id_list] 107 | if len(new_neighbor_list) != 0: 108 | new_node = random.sample(new_neighbor_list, 1)[0] 109 | sub_node_id_list.append(new_node) 110 | else: 111 | break 112 | 113 | 114 | drop_node_list = sorted([i for i in all_node_list if not i in sub_node_id_list]) 115 | 116 | aug_input_fea = delete_row_col(input_fea, drop_node_list, only_row=True) 117 | aug_input_adj = delete_row_col(input_adj, drop_node_list) 118 | 119 | aug_input_fea = aug_input_fea.unsqueeze(0) 120 | aug_input_adj = sp.csr_matrix(np.matrix(aug_input_adj)) 121 | 122 | return aug_input_fea, aug_input_adj 123 | 124 | 125 | 126 | 127 | 128 | def delete_row_col(input_matrix, drop_list, only_row=False): 129 | 130 | remain_list = [i for i in range(input_matrix.shape[0]) if i not in drop_list] 131 | out = input_matrix[remain_list, :] 132 | if only_row: 133 | return out 134 | out = out[:, remain_list] 135 | 136 | return out 137 | 138 | 139 | 140 | 141 | 142 | 143 | 144 | 145 | 146 | 147 | 148 | 149 | 150 | 151 | 152 | 153 | 154 | 155 | 156 | if __name__ == "__main__": 157 | main() 158 | 159 | 
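The edge perturbation in `aug_random_edge` above can be sketched on a plain nested-list adjacency matrix. This is a minimal illustration under stated assumptions, not the repository's implementation: the name `perturb_edges` and the `rng` parameter are hypothetical, the input is assumed to be a symmetric 0/1 matrix with a zero diagonal, and each undirected edge is collected once via `i > j` instead of the list-removal loop used in `aug.py`. The count `k` mirrors the original arithmetic, in which `drop_percent` is halved twice before being applied to the number of undirected edges.

```python
import random


def perturb_edges(adj, drop_percent=0.2, rng=None):
    """Drop k undirected edges and add k undirected edges in a dense 0/1 adjacency matrix.

    Hypothetical helper: `adj` is a symmetric list of lists with a zero diagonal.
    """
    rng = rng or random.Random()
    n = len(adj)
    # Collect each undirected edge once, as (i, j) with i > j.
    edges = [(i, j) for i in range(n) for j in range(i) if adj[i][j]]
    # Same arithmetic as aug_random_edge: drop_percent is halved twice.
    k = int(len(edges) * drop_percent / 2 / 2)
    out = [row[:] for row in adj]  # copy, so the input stays untouched
    for i, j in rng.sample(edges, k):
        out[i][j] = out[j][i] = 0
    # Added edges are sampled from all node pairs, so an existing edge may be re-selected.
    candidates = [(i, j) for i in range(n) for j in range(i)]
    for i, j in rng.sample(candidates, k):
        out[i][j] = out[j][i] = 1
    return out


# Usage: a 6-node ring has 6 undirected edges, so drop_percent=0.8 gives k = 1.
ring = [[1 if abs(i - j) in (1, 5) else 0 for j in range(6)] for i in range(6)]
augmented = perturb_edges(ring, drop_percent=0.8, rng=random.Random(0))
```

Sampling added edges from all node pairs follows the original code; a stricter variant could sample only from currently absent pairs so that exactly `k` new edges appear.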
-------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/dgi.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from layers import GCN, AvgReadout, Discriminator, Discriminator2 4 | import pdb 5 | 6 | class DGI(nn.Module): 7 | def __init__(self, n_in, n_h, activation): 8 | super(DGI, self).__init__() 9 | self.gcn = GCN(n_in, n_h, activation) 10 | self.read = AvgReadout() 11 | self.sigm = nn.Sigmoid() 12 | self.disc = Discriminator(n_h) 13 | self.disc2 = Discriminator2(n_h) 14 | 15 | def forward(self, seq1, seq2, seq3, seq4, adj, aug_adj1, aug_adj2, sparse, msk, samp_bias1, samp_bias2, aug_type): 16 | 17 | h_0 = self.gcn(seq1, adj, sparse) 18 | if aug_type == 'edge': 19 | 20 | h_1 = self.gcn(seq1, aug_adj1, sparse) 21 | h_3 = self.gcn(seq1, aug_adj2, sparse) 22 | 23 | elif aug_type == 'mask': 24 | 25 | h_1 = self.gcn(seq3, adj, sparse) 26 | h_3 = self.gcn(seq4, adj, sparse) 27 | 28 | elif aug_type == 'node' or aug_type == 'subgraph': 29 | 30 | h_1 = self.gcn(seq3, aug_adj1, sparse) 31 | h_3 = self.gcn(seq4, aug_adj2, sparse) 32 | 33 | else: 34 | assert False 35 | 36 | c_1 = self.read(h_1, msk) 37 | c_1= self.sigm(c_1) 38 | 39 | c_3 = self.read(h_3, msk) 40 | c_3= self.sigm(c_3) 41 | 42 | h_2 = self.gcn(seq2, adj, sparse) 43 | 44 | ret1 = self.disc(c_1, h_0, h_2, samp_bias1, samp_bias2) 45 | ret2 = self.disc(c_3, h_0, h_2, samp_bias1, samp_bias2) 46 | 47 | ret = ret1 + ret2 48 | return ret 49 | 50 | # Detach the return variables 51 | def embed(self, seq, adj, sparse, msk): 52 | h_1 = self.gcn(seq, adj, sparse) 53 | c = self.read(h_1, msk) 54 | 55 | return h_1.detach(), c.detach() 56 | 57 | -------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/execute.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import 
scipy.sparse as sp 3 | import torch 4 | import torch.nn as nn 5 | import random 6 | from models import DGI, LogReg 7 | from utils import process 8 | import pdb 9 | import aug 10 | import os 11 | import argparse 12 | 13 | 14 | 15 | parser = argparse.ArgumentParser("My DGI") 16 | 17 | parser.add_argument('--dataset', type=str, default="", help='data') 18 | parser.add_argument('--aug_type', type=str, default="", help='aug type: edge, node, subgraph or mask') 19 | parser.add_argument('--drop_percent', type=float, default=0.1, help='drop percent') 20 | parser.add_argument('--seed', type=int, default=39, help='seed') 21 | parser.add_argument('--gpu', type=int, default=0, help='gpu') 22 | parser.add_argument('--save_name', type=str, default='try.pkl', help='save ckpt name') 23 | 24 | args = parser.parse_args() 25 | 26 | print('-' * 100) 27 | print(args) 28 | print('-' * 100) 29 | 30 | dataset = args.dataset 31 | aug_type = args.aug_type 32 | drop_percent = args.drop_percent 33 | os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID" 34 | os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu) 35 | seed = args.seed 36 | random.seed(seed) 37 | np.random.seed(seed) 38 | torch.manual_seed(seed) 39 | torch.cuda.manual_seed(seed) 40 | 41 | # training params 42 | 43 | 44 | batch_size = 1 45 | nb_epochs = 10000 46 | patience = 20 47 | lr = 0.001 48 | l2_coef = 0.0 49 | drop_prob = 0.0 50 | hid_units = 512 51 | sparse = True 52 | 53 | 54 | nonlinearity = 'prelu' # special name to separate parameters 55 | adj, features, labels, idx_train, idx_val, idx_test = process.load_data(dataset) 56 | features, _ = process.preprocess_features(features) 57 | 58 | nb_nodes = features.shape[0] # node number 59 | ft_size = features.shape[1] # node features dim 60 | nb_classes = labels.shape[1] # classes = 6 61 | 62 | features = torch.FloatTensor(features[np.newaxis]) 63 | 64 | 65 | ''' 66 | ------------------------------------------------------------ 67 | edge node mask subgraph 68 | 
------------------------------------------------------------ 69 | ''' 70 | print("Begin Aug:[{}]".format(args.aug_type)) 71 | if args.aug_type == 'edge': 72 | 73 | aug_features1 = features 74 | aug_features2 = features 75 | 76 | aug_adj1 = aug.aug_random_edge(adj, drop_percent=drop_percent) # random drop edges 77 | aug_adj2 = aug.aug_random_edge(adj, drop_percent=drop_percent) # random drop edges 78 | 79 | elif args.aug_type == 'node': 80 | 81 | aug_features1, aug_adj1 = aug.aug_drop_node(features, adj, drop_percent=drop_percent) 82 | aug_features2, aug_adj2 = aug.aug_drop_node(features, adj, drop_percent=drop_percent) 83 | 84 | elif args.aug_type == 'subgraph': 85 | 86 | aug_features1, aug_adj1 = aug.aug_subgraph(features, adj, drop_percent=drop_percent) 87 | aug_features2, aug_adj2 = aug.aug_subgraph(features, adj, drop_percent=drop_percent) 88 | 89 | elif args.aug_type == 'mask': 90 | 91 | aug_features1 = aug.aug_random_mask(features, drop_percent=drop_percent) 92 | aug_features2 = aug.aug_random_mask(features, drop_percent=drop_percent) 93 | 94 | aug_adj1 = adj 95 | aug_adj2 = adj 96 | 97 | else: 98 | assert False 99 | 100 | 101 | 102 | ''' 103 | ------------------------------------------------------------ 104 | ''' 105 | 106 | adj = process.normalize_adj(adj + sp.eye(adj.shape[0])) 107 | aug_adj1 = process.normalize_adj(aug_adj1 + sp.eye(aug_adj1.shape[0])) 108 | aug_adj2 = process.normalize_adj(aug_adj2 + sp.eye(aug_adj2.shape[0])) 109 | 110 | if sparse: 111 | sp_adj = process.sparse_mx_to_torch_sparse_tensor(adj) 112 | sp_aug_adj1 = process.sparse_mx_to_torch_sparse_tensor(aug_adj1) 113 | sp_aug_adj2 = process.sparse_mx_to_torch_sparse_tensor(aug_adj2) 114 | 115 | else: 116 | adj = (adj + sp.eye(adj.shape[0])).todense() 117 | aug_adj1 = (aug_adj1 + sp.eye(aug_adj1.shape[0])).todense() 118 | aug_adj2 = (aug_adj2 + sp.eye(aug_adj2.shape[0])).todense() 119 | 120 | 121 | ''' 122 | ------------------------------------------------------------ 123 | mask 124 | 
------------------------------------------------------------ 125 | ''' 126 | 127 | ''' 128 | ------------------------------------------------------------ 129 | ''' 130 | if not sparse: 131 | adj = torch.FloatTensor(adj[np.newaxis]) 132 | aug_adj1 = torch.FloatTensor(aug_adj1[np.newaxis]) 133 | aug_adj2 = torch.FloatTensor(aug_adj2[np.newaxis]) 134 | 135 | 136 | labels = torch.FloatTensor(labels[np.newaxis]) 137 | idx_train = torch.LongTensor(idx_train) 138 | idx_val = torch.LongTensor(idx_val) 139 | idx_test = torch.LongTensor(idx_test) 140 | 141 | model = DGI(ft_size, hid_units, nonlinearity) 142 | optimiser = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=l2_coef) 143 | 144 | if torch.cuda.is_available(): 145 | print('Using CUDA') 146 | model.cuda() 147 | features = features.cuda() 148 | aug_features1 = aug_features1.cuda() 149 | aug_features2 = aug_features2.cuda() 150 | if sparse: 151 | sp_adj = sp_adj.cuda() 152 | sp_aug_adj1 = sp_aug_adj1.cuda() 153 | sp_aug_adj2 = sp_aug_adj2.cuda() 154 | else: 155 | adj = adj.cuda() 156 | aug_adj1 = aug_adj1.cuda() 157 | aug_adj2 = aug_adj2.cuda() 158 | 159 | labels = labels.cuda() 160 | idx_train = idx_train.cuda() 161 | idx_val = idx_val.cuda() 162 | idx_test = idx_test.cuda() 163 | 164 | b_xent = nn.BCEWithLogitsLoss() 165 | xent = nn.CrossEntropyLoss() 166 | cnt_wait = 0 167 | best = 1e9 168 | best_t = 0 169 | 170 | for epoch in range(nb_epochs): 171 | 172 | model.train() 173 | optimiser.zero_grad() 174 | 175 | idx = np.random.permutation(nb_nodes) 176 | shuf_fts = features[:, idx, :] 177 | 178 | lbl_1 = torch.ones(batch_size, nb_nodes) 179 | lbl_2 = torch.zeros(batch_size, nb_nodes) 180 | lbl = torch.cat((lbl_1, lbl_2), 1) 181 | 182 | if torch.cuda.is_available(): 183 | shuf_fts = shuf_fts.cuda() 184 | lbl = lbl.cuda() 185 | 186 | logits = model(features, shuf_fts, aug_features1, aug_features2, 187 | sp_adj if sparse else adj, 188 | sp_aug_adj1 if sparse else aug_adj1, 189 | sp_aug_adj2 if sparse else 
aug_adj2, 190 | sparse, None, None, None, aug_type=aug_type) 191 | 192 | loss = b_xent(logits, lbl) 193 | print('Loss:[{:.4f}]'.format(loss.item())) 194 | 195 | if loss < best: 196 | best = loss 197 | best_t = epoch 198 | cnt_wait = 0 199 | torch.save(model.state_dict(), args.save_name) 200 | else: 201 | cnt_wait += 1 202 | 203 | if cnt_wait == patience: 204 | print('Early stopping!') 205 | break 206 | 207 | loss.backward() 208 | optimiser.step() 209 | 210 | print('Loading {}th epoch'.format(best_t)) 211 | model.load_state_dict(torch.load(args.save_name)) 212 | 213 | embeds, _ = model.embed(features, sp_adj if sparse else adj, sparse, None) 214 | train_embs = embeds[0, idx_train] 215 | val_embs = embeds[0, idx_val] 216 | test_embs = embeds[0, idx_test] 217 | 218 | train_lbls = torch.argmax(labels[0, idx_train], dim=1) 219 | val_lbls = torch.argmax(labels[0, idx_val], dim=1) 220 | test_lbls = torch.argmax(labels[0, idx_test], dim=1) 221 | 222 | tot = torch.zeros(1) 223 | tot = tot.cuda() 224 | 225 | accs = [] 226 | 227 | for _ in range(50): 228 | log = LogReg(hid_units, nb_classes) 229 | opt = torch.optim.Adam(log.parameters(), lr=0.01, weight_decay=0.0) 230 | log.cuda() 231 | 232 | pat_steps = 0 233 | best_acc = torch.zeros(1) 234 | best_acc = best_acc.cuda() 235 | for _ in range(100): 236 | log.train() 237 | opt.zero_grad() 238 | 239 | logits = log(train_embs) 240 | loss = xent(logits, train_lbls) 241 | 242 | loss.backward() 243 | opt.step() 244 | 245 | logits = log(test_embs) 246 | preds = torch.argmax(logits, dim=1) 247 | acc = torch.sum(preds == test_lbls).float() / test_lbls.shape[0] 248 | accs.append(acc * 100) 249 | print('acc:[{:.4f}]'.format(acc)) 250 | tot += acc 251 | 252 | print('-' * 100) 253 | print('Average accuracy:[{:.4f}]'.format(tot.item() / 50)) 254 | accs = torch.stack(accs) 255 | print('Mean:[{:.4f}]'.format(accs.mean().item())) 256 | print('Std :[{:.4f}]'.format(accs.std().item())) 257 | print('-' * 100) 258 | 259 | 260 | 
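The early-stopping bookkeeping in the training loop above (`best`, `best_t`, `cnt_wait`, `patience`) can be isolated in a few lines. This is a minimal pure-Python sketch, not the repository's code: the function name `early_stop_epochs` and the example loss values are hypothetical, and plain floats stand in for the loss tensors. As in `execute.py`, a new best loss resets the wait counter, and training stops once the counter reaches `patience`.

```python
def early_stop_epochs(losses, patience=20):
    """Return (best_epoch, last_epoch_run) for a sequence of per-epoch losses.

    Hypothetical helper mirroring the best / best_t / cnt_wait logic of the loop above.
    """
    best = float("inf")
    best_t = 0
    cnt_wait = 0
    last_epoch = -1  # stays -1 if no epoch is run
    for epoch, loss in enumerate(losses):
        last_epoch = epoch
        if loss < best:
            # New best loss: remember the epoch (a checkpoint would be saved here).
            best = loss
            best_t = epoch
            cnt_wait = 0
        else:
            cnt_wait += 1
        if cnt_wait == patience:
            break  # no improvement for `patience` consecutive epochs
    return best_t, last_epoch


# Usage: the best loss occurs at epoch 4; three non-improving epochs follow.
example_losses = [1.0, 0.8, 0.9, 0.85, 0.7, 0.75, 0.72, 0.71]
best_epoch, stop_epoch = early_stop_epochs(example_losses, patience=3)
```

With `patience=3` this example stops at epoch 7 and reports epoch 4 as the best, which is the checkpoint the script would reload before computing embeddings.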
-------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/layers/__init__.py: -------------------------------------------------------------------------------- 1 | from .gcn import GCN 2 | from .readout import AvgReadout 3 | from .discriminator import Discriminator 4 | from .discriminator2 import Discriminator2 -------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/layers/discriminator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class Discriminator(nn.Module): 5 | def __init__(self, n_h): 6 | super(Discriminator, self).__init__() 7 | self.f_k = nn.Bilinear(n_h, n_h, 1) 8 | 9 | for m in self.modules(): 10 | self.weights_init(m) 11 | 12 | def weights_init(self, m): 13 | if isinstance(m, nn.Bilinear): 14 | torch.nn.init.xavier_uniform_(m.weight.data) 15 | if m.bias is not None: 16 | m.bias.data.fill_(0.0) 17 | 18 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 19 | c_x = torch.unsqueeze(c, 1) 20 | c_x = c_x.expand_as(h_pl) 21 | 22 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2) 23 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2) 24 | 25 | if s_bias1 is not None: 26 | sc_1 += s_bias1 27 | if s_bias2 is not None: 28 | sc_2 += s_bias2 29 | 30 | logits = torch.cat((sc_1, sc_2), 1) 31 | 32 | return logits 33 | 34 | -------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/layers/discriminator2.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class Discriminator2(nn.Module): 5 | def
__init__(self, n_h): 6 | super(Discriminator2, self).__init__() 7 | self.f_k = nn.Bilinear(n_h, n_h, 1) 8 | 9 | for m in self.modules(): 10 | self.weights_init(m) 11 | 12 | def weights_init(self, m): 13 | if isinstance(m, nn.Bilinear): 14 | torch.nn.init.xavier_uniform_(m.weight.data) 15 | if m.bias is not None: 16 | m.bias.data.fill_(0.0) 17 | 18 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 19 | # c_x = torch.unsqueeze(c, 1) 20 | # c_x = c_x.expand_as(h_pl) 21 | c_x = c 22 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2) 23 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2) 24 | 25 | if s_bias1 is not None: 26 | sc_1 += s_bias1 27 | if s_bias2 is not None: 28 | sc_2 += s_bias2 29 | 30 | logits = torch.cat((sc_1, sc_2), 1) 31 | 32 | return logits 33 | 34 | -------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/layers/gcn.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class GCN(nn.Module): 5 | def __init__(self, in_ft, out_ft, act, bias=True): 6 | super(GCN, self).__init__() 7 | self.fc = nn.Linear(in_ft, out_ft, bias=False) 8 | self.act = nn.PReLU() if act == 'prelu' else act 9 | 10 | if bias: 11 | self.bias = nn.Parameter(torch.FloatTensor(out_ft)) 12 | self.bias.data.fill_(0.0) 13 | else: 14 | self.register_parameter('bias', None) 15 | 16 | for m in self.modules(): 17 | self.weights_init(m) 18 | 19 | def weights_init(self, m): 20 | if isinstance(m, nn.Linear): 21 | torch.nn.init.xavier_uniform_(m.weight.data) 22 | if m.bias is not None: 23 | m.bias.data.fill_(0.0) 24 | 25 | # Shape of seq: (batch, nodes, features) 26 | def forward(self, seq, adj, sparse=False): 27 | seq_fts = self.fc(seq) 28 | if sparse: 29 | out = torch.unsqueeze(torch.spmm(adj, torch.squeeze(seq_fts, 0)), 0) 30 | else: 31 | out = torch.bmm(adj, seq_fts) 32 | if self.bias is not None: 33 | out += self.bias 34 | 35 | return self.act(out) 
36 | 37 | -------------------------------------------------------------------------------- /graphdownstream/graphcl-pretrain/layers/readout.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | # Applies an average on seq, of shape (batch, nodes, features) 5 | # While taking into account the masking of msk 6 | class AvgReadout(nn.Module): 7 | def __init__(self): 8 | super(AvgReadout, self).__init__() 9 | 10 | def forward(self, seq, msk): 11 | if msk is None: 12 | return torch.mean(seq, 1) 13 | else: 14 | msk = torch.unsqueeze(msk, -1) 15 | return torch.sum(seq * msk, 1) / torch.sum(msk) 16 | 17 | -------------------------------------------------------------------------------- /graphdownstream/graphsage.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class Graphsage(torch.nn.Module): 14 | def __init__(self, config): 15 | super(Graphsage, self).__init__() 16 | 17 | # create networks 18 | # get_emb_dim returns fixed values: 128, 128 (128 is the config value) 19 | # g_net is an n-layer GCN network, g_dim=hidden_dim 20 | self.act=torch.nn.ReLU() 21 | self.g_net, g_dim = self.create_net( 22 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 23 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 24 | self.num_layers_num=config["gcn_graph_num_layers"] 25 | 26 | # create predict layers 27 | # these two if statements add the pattern and graph input dimensions for predict on top of the embedding network 28 | 29 | def create_net(self, name, input_dim, **kw): 30 | num_layers = kw.get("num_layers", 
1) 31 | hidden_dim = kw.get("hidden_dim", 64) 32 | num_rels = kw.get("num_rels", 1) 33 | num_bases = kw.get("num_bases", 8) 34 | regularizer = kw.get("regularizer", "basis") 35 | dropout = kw.get("dropout", 0.5) 36 | 37 | 38 | self.convs = torch.nn.ModuleList() 39 | 40 | for i in range(num_layers): 41 | 42 | if i: 43 | conv = dgl.nn.pytorch.conv.SAGEConv(in_feats=hidden_dim, out_feats=hidden_dim,aggregator_type="gcn") 44 | else: 45 | conv = dgl.nn.pytorch.conv.SAGEConv(in_feats=input_dim, out_feats=hidden_dim,aggregator_type="gcn") 46 | 47 | self.convs.append(conv) 48 | 49 | return self.convs, hidden_dim 50 | 51 | 52 | #def forward(self, pattern, pattern_len, graph, graph_len): 53 | def forward(self, graph, graph_len): 54 | #bsz = pattern_len.size(0) 55 | # filter_gate selects the mask of nodes in the graph that are irrelevant to the isomorphism 56 | #gate = self.get_filter_gate(pattern, pattern_len, graph, graph_len) 57 | graph_output = graph.ndata["feature"] 58 | xs = [] 59 | for i in range(self.num_layers_num): 60 | graph_output = F.relu(self.convs[i](graph,graph_output)) 61 | xs.append(graph_output) 62 | xpool= [] 63 | for x in xs: 64 | graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 65 | graph_embedding = torch.sum(graph_embedding, dim=1) 66 | xpool.append(graph_embedding) 67 | x = torch.cat(xpool, -1) 68 | return x,torch.cat(xs, -1) 69 | -------------------------------------------------------------------------------- /graphdownstream/layers/__init__.py: -------------------------------------------------------------------------------- 1 | from .gcn import GCN 2 | from .readout import AvgReadout 3 | from .discriminator import Discriminator 4 | from .discriminator2 import Discriminator2 -------------------------------------------------------------------------------- /graphdownstream/layers/discriminator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class
Discriminator(nn.Module): 5 | def __init__(self, n_h): 6 | super(Discriminator, self).__init__() 7 | self.f_k = nn.Bilinear(n_h, n_h, 1) 8 | 9 | for m in self.modules(): 10 | self.weights_init(m) 11 | 12 | def weights_init(self, m): 13 | if isinstance(m, nn.Bilinear): 14 | torch.nn.init.xavier_uniform_(m.weight.data) 15 | if m.bias is not None: 16 | m.bias.data.fill_(0.0) 17 | 18 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 19 | c_x = torch.unsqueeze(c, 1) 20 | c_x = c_x.expand_as(h_pl) 21 | 22 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2) 23 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2) 24 | 25 | if s_bias1 is not None: 26 | sc_1 += s_bias1 27 | if s_bias2 is not None: 28 | sc_2 += s_bias2 29 | 30 | logits = torch.cat((sc_1, sc_2), 1) 31 | 32 | return logits 33 | 34 | -------------------------------------------------------------------------------- /graphdownstream/layers/discriminator2.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class Discriminator2(nn.Module): 5 | def __init__(self, n_h): 6 | super(Discriminator2, self).__init__() 7 | self.f_k = nn.Bilinear(n_h, n_h, 1) 8 | 9 | for m in self.modules(): 10 | self.weights_init(m) 11 | 12 | def weights_init(self, m): 13 | if isinstance(m, nn.Bilinear): 14 | torch.nn.init.xavier_uniform_(m.weight.data) 15 | if m.bias is not None: 16 | m.bias.data.fill_(0.0) 17 | 18 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 19 | # c_x = torch.unsqueeze(c, 1) 20 | # c_x = c_x.expand_as(h_pl) 21 | c_x = c 22 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2) 23 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2) 24 | 25 | if s_bias1 is not None: 26 | sc_1 += s_bias1 27 | if s_bias2 is not None: 28 | sc_2 += s_bias2 29 | 30 | logits = torch.cat((sc_1, sc_2), 1) 31 | 32 | return logits 33 | 34 | -------------------------------------------------------------------------------- /graphdownstream/layers/gcn.py: 
-------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class GCN(nn.Module): 5 | def __init__(self, in_ft, out_ft, act, bias=True): 6 | super(GCN, self).__init__() 7 | self.fc = nn.Linear(in_ft, out_ft, bias=False) 8 | self.act = nn.PReLU() if act == 'prelu' else act 9 | 10 | if bias: 11 | self.bias = nn.Parameter(torch.FloatTensor(out_ft)) 12 | self.bias.data.fill_(0.0) 13 | else: 14 | self.register_parameter('bias', None) 15 | 16 | for m in self.modules(): 17 | self.weights_init(m) 18 | 19 | def weights_init(self, m): 20 | if isinstance(m, nn.Linear): 21 | torch.nn.init.xavier_uniform_(m.weight.data) 22 | if m.bias is not None: 23 | m.bias.data.fill_(0.0) 24 | 25 | # Shape of seq: (batch, nodes, features) 26 | def forward(self, seq, adj, sparse=False): 27 | seq_fts = self.fc(seq) 28 | if sparse: 29 | out = torch.unsqueeze(torch.spmm(adj, torch.squeeze(seq_fts, 0)), 0) 30 | else: 31 | out = torch.bmm(adj, seq_fts) 32 | if self.bias is not None: 33 | out += self.bias 34 | 35 | return self.act(out) 36 | 37 | -------------------------------------------------------------------------------- /graphdownstream/layers/readout.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | # Applies an average on seq, of shape (batch, nodes, features) 5 | # While taking into account the masking of msk 6 | class AvgReadout(nn.Module): 7 | def __init__(self): 8 | super(AvgReadout, self).__init__() 9 | 10 | def forward(self, seq, msk): 11 | if msk is None: 12 | return torch.mean(seq, 1) 13 | else: 14 | msk = torch.unsqueeze(msk, -1) 15 | return torch.sum(seq * msk, 1) / torch.sum(msk) 16 | 17 | -------------------------------------------------------------------------------- /graphdownstream/model_weight.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 
| 4 | class model_weight(nn.Module): 5 | def __init__(self): 6 | super(model_weight, self).__init__() 7 | 8 | 9 | # self.weight_1 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 10 | # self.weight_2 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 11 | # self.weight_3 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 12 | # self.weight_4 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 13 | self.temp = torch.nn.Parameter(torch.Tensor(4,1),requires_grad=True) 14 | 15 | 16 | self.reset_parameters() 17 | 18 | def reset_parameters(self): 19 | 20 | # torch.nn.init.uniform_(self.weight_1, a=0.0, b=5.0) 21 | # torch.nn.init.uniform_(self.weight_2, a=0.0, b=5.0) 22 | # torch.nn.init.uniform_(self.weight_3, a=0.0, b=5.0) 23 | # torch.nn.init.uniform_(self.weight_4, a=0.0, b=5.0) 24 | torch.nn.init.uniform_(self.temp, a=0.0, b=0.1) 25 | def forward(self, graph_adj,weight_id): 26 | # temp = torch.Tensor([self.weight_1,self.weight_2,self.weight_3,self.weight_4]) 27 | temp = nn.functional.softmax(self.temp,dim =0) 28 | 29 | # size_ = graph_adj.size(0) 30 | # p = [i for i in range(size_)] 31 | # x = torch.tensor([p,p]) 32 | # q = [self.weight for i in range(size_)] 33 | # tt = torch.sparse_coo_tensor(x,q,(size_,size_)).to(graph_adj.device) 34 | # graph_adj = (graph_adj + tt) 35 | graph_adj = graph_adj.to(self.temp.device) 36 | if weight_id == 0: 37 | graph_adj = (graph_adj*(temp[0]+0.25)) 38 | elif weight_id == 1: 39 | graph_adj = (graph_adj*(temp[1]+0.25)) 40 | elif weight_id == 2: 41 | graph_adj = (graph_adj*(temp[2]+0.25)) 42 | else: 43 | graph_adj = (graph_adj*(temp[3]+0.25)) 44 | 45 | return graph_adj 46 | -------------------------------------------------------------------------------- /nodedownstream/ENZYMES2ONE_Graph.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import torch.optim as optim 5 | import math 6 | 
import numpy as np 7 | import re 8 | import os 9 | import sys 10 | import json 11 | from torch.optim.lr_scheduler import LambdaLR 12 | from collections import OrderedDict 13 | from multiprocessing import Pool 14 | from tqdm import tqdm 15 | from sklearn.metrics import accuracy_score,f1_score,precision_score,recall_score 16 | import random 17 | from sklearn.metrics import precision_recall_fscore_support 18 | import functools 19 | import dgl 20 | 21 | 22 | def igraph_node_feature2dgl_node_feature(label,input): 23 | a=input 24 | b = a.split('[')[1] 25 | c = b.split(']')[0] 26 | d = c.split('\\n\', ') 27 | fix=d[len(d)-1] 28 | fix=fix.split('\\')[0] 29 | d[len(d)-1]=fix 30 | nodefeaturestring = [] 31 | for nodefeature in d: 32 | temp = nodefeature.split('\'')[1] 33 | temp = temp.split(',') 34 | nodefeaturestring.append(temp) 35 | 36 | numbers_float = [] # convert to floats 37 | for num in nodefeaturestring: 38 | temp = [] 39 | for data in num: 40 | temp.append(float(data)) 41 | numbers_float.append(temp) 42 | return label,numbers_float 43 | 44 | def FUCK_U_IGraphLoad(path,graph_attr_num): 45 | with open(path, "r") as f: 46 | data=f.readlines() 47 | count=0 48 | for line in data: 49 | gattr=line.split() 50 | if gattr[0]=="label": 51 | label=int(gattr[1]) 52 | count+=1 53 | if gattr[0]=="feature": 54 | feature=line.split("feature")[1] 55 | count+=1 56 | if count==graph_attr_num: 57 | return label, feature 58 | 59 | def FUCK_IGraphLoad(path,graph_attr_num): 60 | label,feature=FUCK_U_IGraphLoad(path,graph_attr_num) 61 | return igraph_node_feature2dgl_node_feature(label,feature) 62 | 63 | def ReSetNodeId(startid,edgelist): 64 | count=0 65 | for edge in edgelist: 66 | src,dst=edge 67 | src+=startid 68 | dst+=startid 69 | edgelist[count]=(src,dst) 70 | count+=1 71 | return edgelist 72 | 73 | 74 | def _read_graphs_from_dir(dirpath): 75 | import igraph as ig 76 | graph = ig.Graph() 77 | count=0 78 | for filename in os.listdir(dirpath): 79 | if not os.path.isdir(os.path.join(dirpath, 
filename)): 80 | names = os.path.splitext(os.path.basename(filename)) 81 | if names[1] != ".gml": 82 | continue 83 | try: 84 | if count==0: 85 | _graph = ig.read(os.path.join(dirpath, filename)) 86 | label,feature=FUCK_IGraphLoad(os.path.join(dirpath, filename),2) 87 | _graph.vs["label"] = [int(x) for x in _graph.vs["label"]] 88 | _graph.es["label"] = [int(x) for x in _graph.es["label"]] 89 | _graph.es["key"] = [int(x) for x in _graph.es["key"]] 90 | _graph["feature"]=feature 91 | graph=_graph 92 | count+=1 93 | else: 94 | _graph = ig.read(os.path.join(dirpath, filename)) 95 | label,feature=FUCK_IGraphLoad(os.path.join(dirpath, filename),2) 96 | _graph.vs["label"] = [int(x) for x in _graph.vs["label"]] 97 | _graph.es["label"] = [int(x) for x in _graph.es["label"]] 98 | _graph.es["key"] = [int(x) for x in _graph.es["key"]] 99 | _graph["feature"]=feature 100 | _graph_nodelabel=_graph.vs["label"] 101 | graph_nodelabel=graph.vs["label"] 102 | new_nodelabel=graph_nodelabel+_graph_nodelabel 103 | _graph_edgelabel=_graph.es["label"] 104 | graph_edgelabel=graph.es["label"] 105 | new_edgelabel=graph_edgelabel+_graph_edgelabel 106 | _graph_edgekey=_graph.es["key"] 107 | graph_edgekey=graph.es["key"] 108 | new_edgekey=graph_edgekey+_graph_edgekey 109 | 110 | graph_nodenum=graph.vcount() 111 | _graph_nodenum=_graph.vcount() 112 | graph.add_vertices(_graph_nodenum) 113 | _graphedge=_graph.get_edgelist() 114 | _graphedge=ReSetNodeId(graph_nodenum,_graphedge) 115 | graph.add_edges(_graphedge) 116 | graph.vs["label"]=new_nodelabel 117 | graph.es["label"]=new_edgelabel 118 | graph.es["key"]=new_edgekey 119 | graph["feature"]=graph["feature"]+_graph["feature"] 120 | 121 | except BaseException as e: 122 | print(e) 123 | break 124 | return graph 125 | 126 | def graph2dglgraph(graph): 127 | dglgraph = dgl.DGLGraph(multigraph=True) 128 | dglgraph.add_nodes(graph.vcount()) 129 | edges = graph.get_edgelist() 130 | dglgraph.add_edges([e[0] for e in edges], [e[1] for e in edges]) 131 | 
dglgraph.readonly(True) 132 | return dglgraph 133 | 134 | def dglpreprocess(x): 135 | graph = x 136 | graph_dglgraph = graph2dglgraph(graph) 137 | '''graph_dglgraph.ndata["indeg"] = np.array(graph.indegree(), dtype=np.float32) 138 | graph_dglgraph.ndata["label"] = np.array(graph.vs["label"], dtype=np.int64) 139 | graph_dglgraph.ndata["id"] = np.arange(0, graph.vcount(), dtype=np.int64)''' 140 | # graph_dglgraph.edata["label"] = np.array(graph.es["label"], dtype=np.int64) 141 | graph_dglgraph.ndata["indeg"] = torch.tensor(np.array(graph.indegree(), dtype=np.float32)) 142 | graph_dglgraph.ndata["label"] = torch.tensor(np.array(graph.vs["label"], dtype=np.int64)) 143 | graph_dglgraph.ndata["id"] = torch.tensor(np.arange(0, graph.vcount(), dtype=np.int64)) 144 | nodefeature=graph["feature"] 145 | graph_dglgraph.ndata["feature"]=torch.tensor(np.array(nodefeature, dtype=np.float32)) 146 | return graph_dglgraph 147 | 148 | def read_graphs_from_dir(dirpath): 149 | import igraph as ig 150 | ret=[] 151 | for filename in os.listdir(dirpath): 152 | if not os.path.isdir(os.path.join(dirpath, filename)): 153 | names = os.path.splitext(os.path.basename(filename)) 154 | if names[1] != ".gml": 155 | continue 156 | try: 157 | graph = ig.read(os.path.join(dirpath, filename)) 158 | label,feature=FUCK_IGraphLoad(os.path.join(dirpath, filename),2) 159 | graph.vs["label"] = [int(x) for x in graph.vs["label"]] 160 | graph.es["label"] = [int(x) for x in graph.es["label"]] 161 | graph.es["key"] = [int(x) for x in graph.es["key"]] 162 | graph["label"]=label 163 | graph["feature"]=feature 164 | ret.append(graph) 165 | except BaseException as e: 166 | print(e) 167 | break 168 | return ret 169 | 170 | 171 | 172 | if __name__ == "__main__": 173 | assert len(sys.argv) == 2 174 | nci1_data_path = sys.argv[1] 175 | #single graph 176 | #save_path="../data/ENZYMES/nodetaskinput" 177 | save_path="../data/ENZYMES/test_allinone" 178 | graph=_read_graphs_from_dir(nci1_data_path) 179 | 
dglgraph=dglpreprocess(graph) 180 | dgl.data.utils.save_graphs(os.path.join(save_path,"graph"),dglgraph) 181 | g=dgl.load_graphs(os.path.join(save_path,"graph"))[0][0] 182 | print(g) 183 | # Note: the [0][0] element of the load_graphs result is the fully processed DGL graph 184 | print(g.number_of_nodes()) 185 | 186 | def Raw2OneGraph(raw_data,save_data): 187 | nci1_data_path = raw_data 188 | #single graph 189 | #save_path="../data/ENZYMES/nodetaskinput" 190 | save_path=save_data 191 | graphs=read_graphs_from_dir(nci1_data_path) 192 | count=0 193 | for graph in graphs: 194 | print("process graph ",count) 195 | dglgraph=dglpreprocess(graph) 196 | if countlabelnum(dglgraph)!=1: 197 | dgl.data.utils.save_graphs(os.path.join(save_path,str(count)),dglgraph) 198 | count+=1 199 | return count 200 | 201 | def countlabelnum(graph): 202 | count=torch.zeros(3) 203 | for i in graph.ndata["label"]: 204 | count[i]=1 205 | return count.count_nonzero() -------------------------------------------------------------------------------- /nodedownstream/aug.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import copy 3 | import random 4 | import pdb 5 | import scipy.sparse as sp 6 | import numpy as np 7 | 8 | def main(): 9 | pass 10 | 11 | 12 | def aug_random_mask(input_feature, drop_percent=0.2): 13 | 14 | node_num = input_feature.shape[1] 15 | mask_num = int(node_num * drop_percent) 16 | node_idx = [i for i in range(node_num)] 17 | mask_idx = random.sample(node_idx, mask_num) 18 | aug_feature = copy.deepcopy(input_feature) 19 | zeros = torch.zeros_like(aug_feature[0][0]) 20 | for j in mask_idx: 21 | aug_feature[0][j] = zeros 22 | return aug_feature 23 | 24 | 25 | def aug_random_edge(input_adj, drop_percent=0.2): 26 | 27 | percent = drop_percent / 2 28 | row_idx, col_idx = input_adj.nonzero() 29 | 30 | index_list = [] 31 | for i in range(len(row_idx)): 32 | index_list.append((row_idx[i], col_idx[i])) 33 | 34 | single_index_list = [] 35 | for i in list(index_list): 36 | 
single_index_list.append(i) 37 | index_list.remove((i[1], i[0])) 38 | 39 | 40 | edge_num = int(len(row_idx) / 2) # 9228 / 2 41 | add_drop_num = int(edge_num * percent / 2) 42 | aug_adj = copy.deepcopy(input_adj.todense().tolist()) 43 | 44 | edge_idx = [i for i in range(edge_num)] 45 | drop_idx = random.sample(edge_idx, add_drop_num) 46 | 47 | 48 | for i in drop_idx: 49 | aug_adj[single_index_list[i][0]][single_index_list[i][1]] = 0 50 | aug_adj[single_index_list[i][1]][single_index_list[i][0]] = 0 51 | 52 | ''' 53 | above finish drop edges 54 | ''' 55 | node_num = input_adj.shape[0] 56 | l = [(i, j) for i in range(node_num) for j in range(i)] 57 | add_list = random.sample(l, add_drop_num) 58 | 59 | for i in add_list: 60 | 61 | aug_adj[i[0]][i[1]] = 1 62 | aug_adj[i[1]][i[0]] = 1 63 | 64 | aug_adj = np.matrix(aug_adj) 65 | aug_adj = sp.csr_matrix(aug_adj) 66 | return aug_adj 67 | 68 | 69 | def aug_drop_node(input_fea, input_adj, drop_percent=0.2): 70 | 71 | input_adj = torch.tensor(input_adj.todense().tolist()) 72 | input_fea = input_fea.squeeze(0) 73 | 74 | node_num = input_fea.shape[0] 75 | drop_num = int(node_num * drop_percent) # number of drop nodes 76 | all_node_list = [i for i in range(node_num)] 77 | 78 | drop_node_list = sorted(random.sample(all_node_list, drop_num)) 79 | 80 | aug_input_fea = delete_row_col(input_fea, drop_node_list, only_row=True) 81 | aug_input_adj = delete_row_col(input_adj, drop_node_list) 82 | 83 | aug_input_fea = aug_input_fea.unsqueeze(0) 84 | aug_input_adj = sp.csr_matrix(np.matrix(aug_input_adj)) 85 | 86 | return aug_input_fea, aug_input_adj 87 | 88 | def aug_subgraph_CL(graph, drop_percent=0.2): 89 | # input_adj = graph.adjacency_matrix().to_dense() 90 | # input_fea = input_fea.squeeze(0) 91 | edge_num = graph.batch_num_edges().tolist() 92 | # all_edge_list = [i for i in range(edge_num)] 93 | # s_node_num = int(edge_num * (1 - drop_percent)) 94 | # center_node_id = random.randint(0, node_num - 1) 95 | # sub_node_id_list = 
[center_node_id] 96 | # all_neighbor_list = [] 97 | # for i in range(s_node_num - 1): 98 | 99 | # all_neighbor_list += torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False).squeeze(1).tolist() 100 | # # print(torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False)) 101 | # all_neighbor_list = list(set(all_neighbor_list)) 102 | # new_neighbor_list = [n for n in all_neighbor_list if not n in sub_node_id_list] 103 | # if len(new_neighbor_list) != 0: 104 | # new_node = random.sample(new_neighbor_list, 1)[0] 105 | # sub_node_id_list.append(new_node) 106 | # else: 107 | # break 108 | 109 | 110 | # print("hhhhhhh") 111 | # print(drop_node_list) 112 | # a = graph_len.squeeze(1).tolist() 113 | sub_edge_id_list = [] 114 | tag = 0 115 | for i in range(len(edge_num)): 116 | s_edge_num = int(edge_num[i] * drop_percent) 117 | temp = random.sample(range(0,edge_num[i]),s_edge_num) 118 | sub_edge_id_list += [(x+tag) for x in temp] 119 | tag+=edge_num[i] 120 | # edge_num[i] = edge_num[i]-s_node_num 121 | 122 | drop_edge_list = sub_edge_id_list 123 | 124 | # a = torch.IntTensor(a).unsqueeze(1) 125 | 126 | 127 | 128 | graph.remove_edges(drop_edge_list) 129 | 130 | # return graph.subgraph(sub_node_id_list),a 131 | return graph 132 | 133 | 134 | def aug_subgraph(graph,graph_len, drop_percent=0.2): 135 | 136 | # input_adj = graph.adjacency_matrix().to_dense() 137 | # input_fea = input_fea.squeeze(0) 138 | node_num = graph.ndata['feature'].shape[0] 139 | # all_node_list = [i for i in range(node_num)] 140 | # s_node_num = int(node_num * (1 - drop_percent)) 141 | # center_node_id = random.randint(0, node_num - 1) 142 | # sub_node_id_list = [center_node_id] 143 | # all_neighbor_list = [] 144 | # for i in range(s_node_num - 1): 145 | 146 | # all_neighbor_list += torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False).squeeze(1).tolist() 147 | # # print(torch.nonzero(input_adj[sub_node_id_list[i]], as_tuple=False)) 148 | # all_neighbor_list = 
list(set(all_neighbor_list)) 149 | # new_neighbor_list = [n for n in all_neighbor_list if not n in sub_node_id_list] 150 | # if len(new_neighbor_list) != 0: 151 | # new_node = random.sample(new_neighbor_list, 1)[0] 152 | # sub_node_id_list.append(new_node) 153 | # else: 154 | # break 155 | 156 | 157 | # print("hhhhhhh") 158 | # print(drop_node_list) 159 | a = graph_len.squeeze(1).tolist() 160 | sub_node_id_list = [] 161 | tag = 0 162 | for i in range(len(a)): 163 | s_node_num = int(a[i] * drop_percent) 164 | 165 | 166 | temp = random.sample(range(0,a[i]),s_node_num) 167 | sub_node_id_list += [(x+tag) for x in temp] 168 | tag+=a[i] 169 | a[i] = a[i]-s_node_num 170 | 171 | drop_node_list = sub_node_id_list 172 | 173 | a = torch.IntTensor(a).unsqueeze(1) 174 | 175 | 176 | 177 | graph.remove_edges(drop_node_list) 178 | 179 | # return graph.subgraph(sub_node_id_list),a 180 | return graph,a 181 | 182 | # Drops nodes for the Flickr dataset under the DGI method 183 | 184 | def aug_subgraph_F(graph, drop_percent=0.2): 185 | 186 | node_num = graph.num_nodes() 187 | # print("hhhhhhh") 188 | # print(drop_node_list) 189 | 190 | 191 | 192 | s_node_num = int(node_num * drop_percent) 193 | 194 | 195 | drop_node_list = random.sample(range(0,node_num),s_node_num) 196 | 197 | 198 | 199 | node_num = node_num - s_node_num 200 | 201 | try: 202 | graph.remove_nodes(drop_node_list) 203 | except: 204 | pass 205 | 206 | # return graph.subgraph(sub_node_id_list),a 207 | return graph,node_num 208 | 209 | 210 | 211 | 212 | 213 | def delete_row_col(input_matrix, drop_list, only_row=False): 214 | 215 | remain_list = [i for i in range(input_matrix.shape[0]) if i not in drop_list] 216 | out = input_matrix[remain_list, :] 217 | if only_row: 218 | return out 219 | out = out[:, remain_list] 220 | 221 | return out 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | 230 | 231 | 232 | 233 | 234 | 235 | 236 | 237 | 238 | 239 | 240 | 241 | if __name__ == "__main__": 242 | main() 243 | 244 | 
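The per-graph sampling loops in `aug_subgraph_CL` and `aug_subgraph` above pick local ids inside each graph of a batch and shift them by a running offset (`tag`) to obtain batch-global ids. A minimal standalone sketch of that offset logic follows, assuming plain Python lists instead of a DGL batched graph; `sample_batched_ids` and its `seed` parameter are hypothetical and not part of this repository.

```python
import random


def sample_batched_ids(counts, drop_percent=0.2, seed=None):
    """Sample ids to drop from each graph in a batch.

    counts: number of edges (or nodes) in each graph of the batch.
    Returns batch-global ids: int(count * drop_percent) ids per graph,
    each shifted by the total count of the graphs that precede it.
    """
    rng = random.Random(seed)  # hypothetical: a local RNG for reproducibility
    drop_ids = []
    offset = 0  # plays the role of `tag` in aug.py
    for count in counts:
        k = int(count * drop_percent)
        local_ids = rng.sample(range(count), k)  # ids local to this graph
        drop_ids.extend(offset + i for i in local_ids)
        offset += count
    return drop_ids


if __name__ == "__main__":
    # Three graphs with 10, 5 and 20 edges: 2 + 1 + 4 ids are selected.
    print(sample_batched_ids([10, 5, 20], drop_percent=0.2, seed=0))
```

The resulting list could then be handed to a removal call on the batched graph, as `aug_subgraph_CL` does with `graph.remove_edges(drop_edge_list)`.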
-------------------------------------------------------------------------------- /nodedownstream/datasetInfo.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import torch.optim as optim 5 | import math 6 | import numpy as np 7 | import re 8 | import os 9 | import sys 10 | import json 11 | from torch.optim.lr_scheduler import LambdaLR 12 | from collections import OrderedDict 13 | from multiprocessing import Pool 14 | from tqdm import tqdm 15 | from sklearn.metrics import accuracy_score,f1_score,precision_score,recall_score 16 | import random 17 | from tqdm import trange 18 | from sklearn.metrics import precision_recall_fscore_support 19 | import functools 20 | import dgl 21 | 22 | #drop==True means drop nodes of class drop when split train,val, test;but can only drop the biggest class(ex 0,1,2 can only drop label 2) 23 | def few_shot_split_nodelevel(graph,tasknum,trainshot,valshot,labelnum,seed=0, drop=False): 24 | train=[] 25 | val=[] 26 | test=[] 27 | if drop: 28 | labelnum=labelnum-1 29 | nodenum=graph.number_of_nodes() 30 | random.seed(seed) 31 | for count in range(tasknum): 32 | index = random.sample(range(0, nodenum), nodenum) 33 | trainindex=[] 34 | valindex=[] 35 | testindex=[] 36 | traincount = torch.zeros(labelnum) 37 | valcount = torch.zeros(labelnum) 38 | for i in index: 39 | label=graph.ndata["label"][i] 40 | if drop: 41 | if label==labelnum: 42 | continue 43 | if traincount[label] 0 and (len(batches[batch_idx]) == self.batch_size or not self.drop_last): 67 | yield batches[batch_idx] 68 | 69 | def __len__(self): 70 | if self.drop_last: 71 | return math.floor(self.data_size / self.batch_size) 72 | else: 73 | return math.ceil(self.data_size / self.batch_size) 74 | 75 | 76 | ############################################## 77 | ############# EdgeSeq Data Part ############## 78 | ############################################## 79 | class 
EdgeSeq: 80 | def __init__(self, code): 81 | self.u = code[:, 0] 82 | self.v = code[:, 1] 83 | self.ul = code[:, 2] 84 | self.el = code[:, 3] 85 | self.vl = code[:, 4] 86 | 87 | def __len__(self): 88 | if len(self.u.shape) == 2: # single code 89 | return self.u.shape[0] 90 | else: # batch code 91 | return self.u.shape[0] * self.u.shape[1] 92 | 93 | @staticmethod 94 | def batch(data): 95 | b = EdgeSeq(torch.empty((0, 5), dtype=torch.long)) 96 | b.u = batch_convert_tensor_to_tensor([x.u for x in data]) 97 | b.v = batch_convert_tensor_to_tensor([x.v for x in data]) 98 | b.ul = batch_convert_tensor_to_tensor([x.ul for x in data]) 99 | b.el = batch_convert_tensor_to_tensor([x.el for x in data]) 100 | b.vl = batch_convert_tensor_to_tensor([x.vl for x in data]) 101 | return b 102 | 103 | def to(self, device): 104 | self.u = self.u.to(device) 105 | self.v = self.v.to(device) 106 | self.ul = self.ul.to(device) 107 | self.el = self.el.to(device) 108 | self.vl = self.vl.to(device) 109 | 110 | 111 | ############################################## 112 | ############# EdgeSeq Data Part ############## 113 | ############################################## 114 | class EdgeSeqDataset(data.Dataset): 115 | def __init__(self, data=None): 116 | super(EdgeSeqDataset, self).__init__() 117 | 118 | if data: 119 | self.data = EdgeSeqDataset.preprocess_batch(data, use_tqdm=True) 120 | else: 121 | self.data = list() 122 | self._to_tensor() 123 | 124 | def _to_tensor(self): 125 | for x in self.data: 126 | for k in ["pattern", "graph", "subisomorphisms"]: 127 | if isinstance(x[k], np.ndarray): 128 | x[k] = torch.from_numpy(x[k]) 129 | 130 | def __len__(self): 131 | return len(self.data) 132 | 133 | def __getitem__(self, idx): 134 | return self.data[idx] 135 | 136 | def save(self, filename): 137 | cache = defaultdict(list) 138 | for x in self.data: 139 | for k in list(x.keys()): 140 | if k.startswith("_"): 141 | cache[k].append(x.pop(k)) 142 | with open(filename, "wb") as f: 143 | 
torch.save(self.data, f, pickle_protocol=pickle.HIGHEST_PROTOCOL) 144 | if len(cache) > 0: 145 | keys = cache.keys() 146 | for i in range(len(self.data)): 147 | for k in keys: 148 | self.data[i][k] = cache[k][i] 149 | 150 | def load(self): 151 | self.data = FlickrDataset()[0] 152 | save_path="../data/Flickr/allinone/graph" 153 | if os.path.exists(save_path)==False: 154 | dgl.data.utils.save_graphs(os.path.join(save_path,"graph"),self.data) 155 | return self 156 | 157 | @staticmethod 158 | def graph2edgeseq(graph): 159 | labels = graph.vs["label"] 160 | graph_code = list() 161 | 162 | for edge in graph.es: 163 | v, u = edge.tuple 164 | graph_code.append((v, u, labels[v], edge["label"], labels[u])) 165 | graph_code = np.array(graph_code, dtype=np.int64) 166 | graph_code.view( 167 | [("v", "int64"), ("u", "int64"), ("vl", "int64"), ("el", "int64"), ("ul", "int64")]).sort( 168 | axis=0, order=["v", "u", "el"]) 169 | return graph_code 170 | 171 | @staticmethod 172 | def preprocess(x): 173 | pattern_code = EdgeSeqDataset.graph2edgeseq(x["pattern"]) 174 | graph_code = EdgeSeqDataset.graph2edgeseq(x["graph"]) 175 | subisomorphisms = np.array(x["subisomorphisms"], dtype=np.int32).reshape(-1, x["pattern"].vcount()) 176 | 177 | x = { 178 | "id": x["id"], 179 | "pattern": pattern_code, 180 | "graph": graph_code, 181 | "counts": x["counts"], 182 | "subisomorphisms": subisomorphisms} 183 | return x 184 | 185 | @staticmethod 186 | def preprocess_batch(data, use_tqdm=False): 187 | d = list() 188 | if use_tqdm: 189 | data = tqdm(data) 190 | for x in data: 191 | d.append(EdgeSeqDataset.preprocess(x)) 192 | return d 193 | 194 | @staticmethod 195 | def batchify(batch): 196 | _id = [x["id"] for x in batch] 197 | pattern = EdgeSeq.batch([EdgeSeq(x["pattern"]) for x in batch]) 198 | pattern_len = torch.tensor([x["pattern"].shape[0] for x in batch], dtype=torch.int32).view(-1, 1) 199 | graph = EdgeSeq.batch([EdgeSeq(x["graph"]) for x in batch]) 200 | graph_len = 
torch.tensor([x["graph"].shape[0] for x in batch], dtype=torch.int32).view(-1, 1) 201 | counts = torch.tensor([x["counts"] for x in batch], dtype=torch.float32).view(-1, 1) 202 | return _id, pattern, pattern_len, graph, graph_len, counts 203 | 204 | 205 | ############################################## 206 | ######### GraphAdj Data Part ########### 207 | ############################################## 208 | class GraphAdjDataset_DGL_Input(data.Dataset): 209 | def __init__(self, data=None): 210 | super(GraphAdjDataset_DGL_Input, self).__init__() 211 | 212 | self.data = GraphAdjDataset_DGL_Input.preprocess_batch(data, use_tqdm=True) 213 | # self._to_tensor() 214 | 215 | def _to_tensor(self): 216 | for x in self.data: 217 | for k in ["graph"]: 218 | y = x[k] 219 | for k, v in y.ndata.items(): 220 | if isinstance(v, np.ndarray): 221 | y.ndata[k] = torch.from_numpy(v) 222 | for k, v in y.edata.items(): 223 | if isinstance(v, np.ndarray): 224 | y.edata[k] = torch.from_numpy(v) 225 | if isinstance(x["subisomorphisms"], np.ndarray): 226 | x["subisomorphisms"] = torch.from_numpy(x["subisomorphisms"]) 227 | 228 | def __len__(self): 229 | return len(self.data) 230 | 231 | def __getitem__(self, idx): 232 | return self.data[idx] 233 | 234 | def save(self, filename): 235 | cache = defaultdict(list) 236 | for x in self.data: 237 | for k in list(x.keys()): 238 | if k.startswith("_"): 239 | cache[k].append(x.pop(k)) 240 | with open(filename, "wb") as f: 241 | torch.save(self.data, f, pickle_protocol=pickle.HIGHEST_PROTOCOL) 242 | if len(cache) > 0: 243 | keys = cache.keys() 244 | for i in range(len(self.data)): 245 | for k in keys: 246 | self.data[i][k] = cache[k][i] 247 | 248 | def load(self): 249 | self.data = FlickrDataset()[0] 250 | print(self.data) 251 | save_path="../data/Flickr/allinone/graph" 252 | if os.path.exists(save_path)==False: 253 | dgl.data.utils.save_graphs(os.path.join(save_path,"graph"),self.data) 254 | return self 255 | 256 | @staticmethod 257 | def 
comp_indeg_norm(graph): 258 | import igraph as ig 259 | if isinstance(graph, ig.Graph): 260 | # 10x faster 261 | in_deg = np.array(graph.indegree(), dtype=np.float32) 262 | elif isinstance(graph, dgl.DGLGraph): 263 | in_deg = graph.in_degrees(range(graph.number_of_nodes())).float().numpy() 264 | else: 265 | raise NotImplementedError 266 | norm = 1.0 / in_deg 267 | norm[np.isinf(norm)] = 0 268 | return norm 269 | 270 | @staticmethod 271 | def graph2dglgraph(graph): 272 | dglgraph = dgl.DGLGraph(multigraph=True) 273 | dglgraph.add_nodes(graph.vcount()) 274 | edges = graph.get_edgelist() 275 | dglgraph.add_edges([e[0] for e in edges], [e[1] for e in edges]) 276 | dglgraph.readonly(True) 277 | return dglgraph 278 | 279 | @staticmethod 280 | # Shuffle the order in which nodes are scanned for a non-adjacent node, instead of scanning in order as the code before 8.5 did, 281 | # which should improve the pre-trained model. 282 | # Increasing the number of negative samples found here could also be considered and should further improve pre-training. 283 | def find_no_connection_node(graph, node): 284 | numnode = graph.number_of_nodes() 285 | rand = list(range(numnode)) 286 | random.shuffle(rand) 287 | for i in range(numnode): 288 | if graph.has_edges_between(node, rand[i]): 289 | continue 290 | else: 291 | return rand[i] # return the sampled node id, not its position in the shuffled list 292 | 293 | @staticmethod 294 | def findsample(graph): 295 | nodenum = graph.number_of_nodes() 296 | result = torch.ones(nodenum, 3) 297 | adj = graph.adjacency_matrix() 298 | src = adj._indices()[1].tolist() 299 | dst = adj._indices()[0].tolist() 300 | # ----------------------------------------------------------------------------------------- 301 | # This handling assumes that every node has a neighbor and is the source of at least one edge, and that some node is not connected to it; typical data is a bidirectional graph with no isolated nodes. 302 | # It does not apply to other kinds of data, where nodes with no neighbors or with no non-adjacent node need separate handling. 303 | # One option is to build the graph only from the nodes that meet these requirements. 304 | for i in range(nodenum): 305 | result[i, 0] = i 306 | # NCI1 contains some isolated nodes; here the positive sample of an isolated node is set to the node itself 307 | if i not in src: 308 | result[i, 1] = i 309 | else: 310 | index_i = src.index(i) 311 | i_point_to = dst[index_i] 312 | result[i, 1] = i_point_to 313 | result[i, 2] = GraphAdjDataset.find_no_connection_node(graph, i) 314 | # 
------------------------------------------------------------------------------------------- 315 | return torch.tensor(result, dtype=int) 316 | 317 | @staticmethod 318 | def preprocess(x): 319 | graph = x["graph"] 320 | '''graph_dglgraph = GraphAdjDataset.graph2dglgraph(graph) 321 | graph_dglgraph.ndata["indeg"] = torch.tensor(np.array(graph.indegree(), dtype=np.float32)) 322 | graph_dglgraph.ndata["label"] = torch.tensor(np.array(graph.vs["label"], dtype=np.int64)) 323 | graph_dglgraph.ndata["id"] = torch.tensor(np.arange(0, graph.vcount(), dtype=np.int64)) 324 | graph_dglgraph.ndata["sample"] = GraphAdjDataset.findsample(graph_dglgraph)''' 325 | x = { 326 | "id": x["id"], 327 | "graph": graph, 328 | "label": x["label"]} 329 | return x 330 | 331 | @staticmethod 332 | def preprocess_batch(data, use_tqdm=False): 333 | d = list() 334 | if use_tqdm: 335 | data = tqdm(data) 336 | for x in data: 337 | d.append(GraphAdjDataset_DGL_Input.preprocess(x)) 338 | return d 339 | 340 | @staticmethod 341 | def batchify(batch): 342 | _id = [x["id"] for x in batch] 343 | graph_label = torch.tensor([x["label"] for x in batch], dtype=torch.float64).view(-1, 1) 344 | graph = dgl.batch([x["graph"] for x in batch]) 345 | graph_len = torch.tensor([x["graph"].number_of_nodes() for x in batch], dtype=torch.int32).view(-1, 1) 346 | return _id, graph_label, graph, graph_len 347 | 348 | 349 | class GraphAdjDataset(data.Dataset): 350 | def __init__(self, data=None): 351 | super(GraphAdjDataset, self).__init__() 352 | 353 | if data: 354 | self.data = GraphAdjDataset.preprocess_batch(data, use_tqdm=True) 355 | else: 356 | self.data = list() 357 | # self._to_tensor() 358 | 359 | def _to_tensor(self): 360 | for x in self.data: 361 | for k in ["graph"]: 362 | y = x[k] 363 | for k, v in y.ndata.items(): 364 | if isinstance(v, np.ndarray): 365 | y.ndata[k] = torch.from_numpy(v) 366 | for k, v in y.edata.items(): 367 | if isinstance(v, np.ndarray): 368 | y.edata[k] = torch.from_numpy(v) 369 | if 
isinstance(x["subisomorphisms"], np.ndarray): 370 | x["subisomorphisms"] = torch.from_numpy(x["subisomorphisms"]) 371 | 372 | def __len__(self): 373 | # return len(self.data) 374 | return 1 375 | 376 | def __getitem__(self, idx): 377 | return self.data[idx] 378 | 379 | def save(self, filename): 380 | cache = defaultdict(list) 381 | for x in self.data: 382 | for k in list(x.keys()): 383 | if k.startswith("_"): 384 | cache[k].append(x.pop(k)) 385 | with open(filename, "wb") as f: 386 | torch.save(self.data, f, pickle_protocol=pickle.HIGHEST_PROTOCOL) 387 | if len(cache) > 0: 388 | keys = cache.keys() 389 | for i in range(len(self.data)): 390 | for k in keys: 391 | self.data[i][k] = cache[k][i] 392 | 393 | def load(self): 394 | self.data = FlickrDataset()[0] 395 | print(self.data) 396 | save_path="../data/Flickr/allinone/graph" 397 | if os.path.exists(save_path)==False: 398 | dgl.data.utils.save_graphs(os.path.join(save_path,"graph"),self.data) 399 | return self 400 | 401 | @staticmethod 402 | def comp_indeg_norm(graph): 403 | import igraph as ig 404 | if isinstance(graph, ig.Graph): 405 | # 10x faster 406 | in_deg = np.array(graph.indegree(), dtype=np.float32) 407 | elif isinstance(graph, dgl.DGLGraph): 408 | in_deg = graph.in_degrees(range(graph.number_of_nodes())).float().numpy() 409 | else: 410 | raise NotImplementedError 411 | norm = 1.0 / in_deg 412 | norm[np.isinf(norm)] = 0 413 | return norm 414 | 415 | @staticmethod 416 | def graph2dglgraph(graph): 417 | dglgraph = dgl.DGLGraph(multigraph=True) 418 | dglgraph.add_nodes(graph.vcount()) 419 | edges = graph.get_edgelist() 420 | dglgraph.add_edges([e[0] for e in edges], [e[1] for e in edges]) 421 | dglgraph.readonly(True) 422 | return dglgraph 423 | 424 | @staticmethod 425 | # Shuffle the order in which nodes are scanned for a non-adjacent node, instead of scanning in order as the code before 8.5 did, 426 | # which should improve the pre-trained model. 427 | # Increasing the number of negative samples found here could also be considered and should further improve pre-training. 428 | def find_no_connection_node(graph, node): 429 | numnode = graph.number_of_nodes() 430 | rand = list(range(numnode)) 431 | 
random.shuffle(rand) 432 | for i in range(numnode): 433 | if graph.has_edges_between(node, rand[i]): 434 | continue 435 | else: 436 | return rand[i] # return the sampled node id, not its position in the shuffled list 437 | 438 | @staticmethod 439 | def findsample(graph): 440 | nodenum = graph.number_of_nodes() 441 | result = torch.ones(nodenum, 3) 442 | adj = graph.adjacency_matrix() 443 | # In the current DGL version, the adj indices put src and dst in the intuitive order 444 | '''src = adj._indices()[1].tolist() 445 | dst = adj._indices()[0].tolist()''' 446 | src = adj._indices()[0].tolist() 447 | dst = adj._indices()[1].tolist() 448 | 449 | # ----------------------------------------------------------------------------------------- 450 | # This handling assumes that every node has a neighbor and is the source of at least one edge, and that some node is not connected to it; typical data is a bidirectional graph with no isolated nodes. 451 | # It does not apply to other kinds of data, where nodes with no neighbors or with no non-adjacent node need separate handling. 452 | # One option is to build the graph only from the nodes that meet these requirements. 453 | for i in range(nodenum): 454 | result[i, 0] = i 455 | # NCI1 contains some isolated nodes; here the positive sample of an isolated node is set to the node itself 456 | if i not in src: 457 | result[i, 1] = i 458 | else: 459 | index_i = src.index(i) 460 | i_point_to = dst[index_i] 461 | result[i, 1] = i_point_to 462 | result[i, 2] = GraphAdjDataset.find_no_connection_node(graph, i) 463 | # ------------------------------------------------------------------------------------------- 464 | return torch.tensor(result, dtype=int) 465 | 466 | # @staticmethod 467 | # def igraph_node_feature2dgl_node_feature(input): 468 | # a = input 469 | # b = a.split('[')[1] 470 | # c = b.split(']')[0] 471 | # d = c.split('\', ') 472 | # nodefeaturestring = [] 473 | # for nodefeature in d: 474 | # temp = nodefeature.split('\'')[1] 475 | # temp = temp.split(',') 476 | # nodefeaturestring.append(temp) 477 | # 478 | # numbers_float = [] # convert to floats 479 | # for num in nodefeaturestring: 480 | # temp = [] 481 | # for data in num: 482 | # temp.append(float(data)) 483 | # numbers_float.append(temp) 484 | # return numbers_float 485 | 486 | @staticmethod 487 | def preprocess(input): 488 | x=input.to_homogeneous() 489 | x.ndata["feature"]=input.ndata["feat"] 490 | x = { 491 | "id": "0", 492 | "graph": x, 493 | 
"label": 0} 494 | return x 495 | 496 | @staticmethod 497 | def preprocess_batch(data, use_tqdm=False): 498 | d = list() 499 | if use_tqdm: 500 | data = tqdm(data) 501 | for x in data: 502 | d.append(GraphAdjDataset.preprocess(x)) 503 | return d 504 | 505 | @staticmethod 506 | def batchify(batch): 507 | _id = [x["id"] for x in batch] 508 | graph_label = torch.tensor([x["label"] for x in batch], dtype=torch.float64).view(-1, 1) 509 | graph = dgl.batch([x["graph"] for x in batch]) 510 | graph_len = torch.tensor([x["graph"].number_of_nodes() for x in batch], dtype=torch.int32).view(-1, 1) 511 | return _id, graph_label, graph, graph_len 512 | -------------------------------------------------------------------------------- /nodedownstream/dgi.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from layers import AvgReadout, Discriminator, Discriminator2 4 | import pdb 5 | from gin_local import GIN 6 | from utils import split_and_batchify_graph_feats 7 | class DGI(nn.Module): 8 | def __init__(self, n_in, n_h,config): 9 | super(DGI, self).__init__() 10 | self.read = AvgReadout() 11 | self.sigm = nn.Sigmoid() 12 | self.disc = Discriminator(n_h*3) 13 | # self.disc2 = Discriminator2(n_h) 14 | self.gin = GIN(config) 15 | self.config = config 16 | 17 | 18 | def forward(self, graph,graph_shuf, graph_1, graph_2, graph_len, graph_len_1, graph_len_2, sparse, msk, samp_bias1, samp_bias2, aug_type): 19 | 20 | c_0,h_0 = self.gin(graph, graph_len) 21 | 22 | if aug_type == 'edge': 23 | 24 | # h_1 = self.gcn(seq1, aug_adj1, sparse) 25 | # h_3 = self.gcn(seq1, aug_adj2, sparse) 26 | pass 27 | 28 | elif aug_type == 'mask': 29 | 30 | # h_1 = self.gcn(seq3, adj, sparse) 31 | # h_3 = self.gcn(seq4, adj, sparse) 32 | pass 33 | 34 | elif aug_type == 'node' or aug_type == 'subgraph': 35 | 36 | c_1,h_1 = self.gin(graph_1, graph_len_1) 37 | c_3,h_3 = self.gin(graph_2, graph_len_2) 38 | 39 | else: 40 | assert False 41 | 
42 | 43 | c_2,h_2 = self.gin(graph_shuf,graph_len) 44 | h_0,h_2 = self.sigm(h_0),self.sigm(h_2) 45 | 46 | 47 | # len_1 = int(h_0.shape[1]) 48 | # len_2 = int(h_2.shape[1]) 49 | # ret1 = self.disc(c_1, h_0, h_2, samp_bias1, samp_bias2) 50 | # ret2 = self.disc(c_3, h_0, h_2, samp_bias1, samp_bias2) 51 | device = self.config["gpu_id"] 52 | # ret = ret1 + ret2 53 | # return ret,int(h_0.shape[1]),int(h_2.shape[1]) 54 | graph.add_self_loop() 55 | graph_1.add_self_loop() 56 | graph_2.add_self_loop() 57 | adj = graph.adjacency_matrix() 58 | adj_1 = graph_1.adjacency_matrix() 59 | adj_2 = graph_2.adjacency_matrix() 60 | adj = adj.to(device) 61 | adj_1 = adj_1.to(device) 62 | adj_2 = adj_2.to(device) 63 | for count in range(self.config["pretrain_hop_num"]): 64 | h_0 = torch.matmul(adj, h_0) 65 | # h_1 = torch.matmul(adj_1, h_1) 66 | h_2 = torch.matmul(adj, h_2) 67 | # h_3 = torch.matmul(adj_2, h_3) 68 | return h_0,c_1,h_2,c_3 69 | 70 | # Detach the return variables 71 | def embed(self, graph, graph_len): 72 | return self.gin(graph,graph_len) 73 | 74 | 75 | 76 | 77 | -------------------------------------------------------------------------------- /nodedownstream/dgi_f.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from layers import AvgReadout, Discriminator, Discriminator2 4 | import pdb 5 | from gin_flickr_DGI import GIN 6 | from utils import split_and_batchify_graph_feats 7 | class DGI(nn.Module): 8 | def __init__(self, n_in, n_h,config): 9 | super(DGI, self).__init__() 10 | self.read = AvgReadout() 11 | self.sigm = nn.Sigmoid() 12 | self.disc = Discriminator(n_h*3) 13 | # self.disc2 = Discriminator2(n_h) 14 | self.gin = GIN(config) 15 | self.config = config 16 | 17 | 18 | def forward(self, graph,graph_shuf, graph_1, graph_2, graph_len, graph_len_1, graph_len_2, sparse, msk, samp_bias1, samp_bias2, aug_type): 19 | 20 | c_0,h_0 = self.gin(graph, graph_len) 21 | 22 | if aug_type == 'edge': 
23 | 24 | # h_1 = self.gcn(seq1, aug_adj1, sparse) 25 | # h_3 = self.gcn(seq1, aug_adj2, sparse) 26 | pass 27 | 28 | elif aug_type == 'mask': 29 | 30 | # h_1 = self.gcn(seq3, adj, sparse) 31 | # h_3 = self.gcn(seq4, adj, sparse) 32 | pass 33 | 34 | elif aug_type == 'node' or aug_type == 'subgraph': 35 | 36 | c_1,h_1 = self.gin(graph_1, graph_len_1) 37 | c_3,h_3 = self.gin(graph_2, graph_len_2) 38 | 39 | else: 40 | assert False 41 | 42 | 43 | c_2,h_2 = self.gin(graph_shuf,graph_len) 44 | h_0,h_2 = self.sigm(h_0),self.sigm(h_2) 45 | 46 | 47 | # len_1 = int(h_0.shape[1]) 48 | # len_2 = int(h_2.shape[1]) 49 | # ret1 = self.disc(c_1, h_0, h_2, samp_bias1, samp_bias2) 50 | # ret2 = self.disc(c_3, h_0, h_2, samp_bias1, samp_bias2) 51 | device = self.config["gpu_id"] 52 | # ret = ret1 + ret2 53 | # return ret,int(h_0.shape[1]),int(h_2.shape[1]) 54 | graph.add_self_loop() 55 | graph_1.add_self_loop() 56 | graph_2.add_self_loop() 57 | adj = graph.adjacency_matrix() 58 | adj_1 = graph_1.adjacency_matrix() 59 | adj_2 = graph_2.adjacency_matrix() 60 | adj = adj.to(device) 61 | adj_1 = adj_1.to(device) 62 | adj_2 = adj_2.to(device) 63 | # for count in range(self.config["pretrain_hop_num"]): 64 | # h_0 = torch.matmul(adj, h_0) 65 | # # h_1 = torch.matmul(adj_1, h_1) 66 | # h_2 = torch.matmul(adj, h_2) 67 | # # h_3 = torch.matmul(adj_2, h_3) 68 | return h_0,c_1,h_2,c_3 69 | 70 | # Detach the return variables 71 | def embed(self, graph, graph_len): 72 | return self.gin(graph,graph_len) 73 | 74 | 75 | 76 | 77 | -------------------------------------------------------------------------------- /nodedownstream/embedding.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | from utils import extend_dimensions 5 | 6 | 7 | class NormalEmbedding(nn.Module): 8 | def __init__(self, input_dim, emb_dim): 9 | super(NormalEmbedding, self).__init__() 10 | self.input_dim = input_dim 11 
| self.emb_dim = emb_dim 12 | self.emb_layer = nn.Linear(input_dim, emb_dim, bias=False) 13 | 14 | # init 15 | nn.init.normal_(self.emb_layer.weight, 0.0, 1.0) 16 | 17 | def increase_input_size(self, new_input_dim): 18 | assert new_input_dim >= self.input_dim 19 | if new_input_dim != self.input_dim: 20 | new_emb_layer = extend_dimensions(self.emb_layer, new_input_dim=new_input_dim, upper=False) 21 | del self.emb_layer 22 | self.emb_layer = new_emb_layer 23 | self.input_dim = new_input_dim 24 | 25 | def forward(self, x): 26 | emb = self.emb_layer(x) 27 | return emb 28 | 29 | class OrthogonalEmbedding(nn.Module): 30 | def __init__(self, input_dim, emb_dim): 31 | super(OrthogonalEmbedding, self).__init__() 32 | self.input_dim = input_dim 33 | self.emb_dim = emb_dim 34 | self.emb_layer = nn.Linear(input_dim, emb_dim, bias=False) 35 | 36 | # init 37 | nn.init.orthogonal_(self.emb_layer.weight) 38 | 39 | def increase_input_size(self, new_input_dim): 40 | assert new_input_dim >= self.input_dim 41 | if new_input_dim != self.input_dim: 42 | new_emb_layer = extend_dimensions(self.emb_layer, new_input_dim=new_input_dim, upper=False) 43 | del self.emb_layer 44 | self.emb_layer = new_emb_layer 45 | self.input_dim = new_input_dim 46 | 47 | def forward(self, x): 48 | emb = self.emb_layer(x) 49 | return emb 50 | 51 | class EquivariantEmbedding(nn.Module): 52 | def __init__(self, input_dim, emb_dim): 53 | super(EquivariantEmbedding, self).__init__() 54 | self.input_dim = input_dim 55 | self.emb_dim = emb_dim 56 | self.emb_layer = nn.Linear(input_dim, emb_dim, bias=False) 57 | 58 | # init 59 | nn.init.normal_(self.emb_layer.weight[:,0], 0.0, 1.0) 60 | emb_column = self.emb_layer.weight[:,0] 61 | with torch.no_grad(): 62 | for i in range(1, self.input_dim): 63 | self.emb_layer.weight[:,i].data.copy_(torch.roll(emb_column, i, 0)) 64 | 65 | def increase_input_size(self, new_input_dim): 66 | assert new_input_dim >= self.input_dim 67 | if new_input_dim != self.input_dim: 68 | 
new_emb_layer = extend_dimensions(self.emb_layer, new_input_dim=new_input_dim, upper=False) 69 | del self.emb_layer 70 | self.emb_layer = new_emb_layer 71 | self.input_dim = new_input_dim 72 | 73 | def forward(self, x): 74 | emb = self.emb_layer(x) 75 | return emb -------------------------------------------------------------------------------- /nodedownstream/filternet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | 5 | # class MaxGatedFilterNet(nn.Module): 6 | # def __init__(self, pattern_dim, graph_dim): 7 | # super(MaxGatedFilterNet, self).__init__() 8 | # self.g_layer = nn.Linear(graph_dim, pattern_dim) 9 | # self.f_layer = nn.Linear(pattern_dim, 1) 10 | 11 | # # init 12 | # scale = (1/pattern_dim)**0.5 13 | # nn.init.normal_(self.g_layer.weight, 0.0, scale) 14 | # nn.init.zeros_(self.g_layer.bias) 15 | # nn.init.normal_(self.f_layer.weight, 0.0, scale) 16 | # nn.init.ones_(self.f_layer.bias) 17 | 18 | # def forward(self, p_x, g_x): 19 | # max_x = torch.max(p_x, dim=1, keepdim=True)[0].float() 20 | # g_x = self.g_layer(g_x.float()) 21 | # f = self.f_layer(g_x * max_x) 22 | # return F.sigmoid(f) 23 | 24 | class MaxGatedFilterNet(nn.Module): 25 | def __init__(self): 26 | super(MaxGatedFilterNet, self).__init__() 27 | 28 | def forward(self, p_x, g_x): 29 | max_x = torch.max(p_x, dim=1, keepdim=True)[0] 30 | if max_x.dim() == 2: 31 | return g_x <= max_x 32 | else: 33 | return (g_x <= max_x).all(keepdim=True, dim=2) 34 | 35 | 36 | -------------------------------------------------------------------------------- /nodedownstream/flikcrtaskchoose.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import numpy as np 4 | 5 | 6 | 7 | train_config = { 8 | "max_npv": 8, # max_number_pattern_vertices: 8, 16, 32 9 | "max_npe": 8, # max_number_pattern_edges: 8, 16, 32 10 | "max_npvl": 8, # 
max_number_pattern_vertex_labels: 8, 16, 32 11 | "max_npel": 8, # max_number_pattern_edge_labels: 8, 16, 32 12 | 13 | "max_ngv": 64, # max_number_graph_vertices: 64, 512,4096 14 | "max_nge": 256, # max_number_graph_edges: 256, 2048, 16384 15 | "max_ngvl": 16, # max_number_graph_vertex_labels: 16, 64, 256 16 | "max_ngel": 16, # max_number_graph_edge_labels: 16, 64, 256 17 | 18 | "base": 2, 19 | 20 | "gpu_id": -1, 21 | "num_workers": 12, 22 | 23 | "epochs": 200, 24 | "batch_size": 512, 25 | "update_every": 1, # actual batch_sizer = batch_size * update_every 26 | "print_every": 100, 27 | "init_emb": "Equivariant", # None, Orthogonal, Normal, Equivariant 28 | "share_emb": True, # sharing embedding requires the same vector length 29 | "share_arch": True, # sharing architectures 30 | "dropout": 0.2, 31 | "dropatt": 0.2, 32 | 33 | "reg_loss": "MSE", # MAE, MSEl 34 | "bp_loss": "MSE", # MAE, MSE 35 | "bp_loss_slp": "anneal_cosine$1.0$0.01", # 0, 0.01, logistic$1.0$0.01, linear$1.0$0.01, cosine$1.0$0.01, 36 | # cyclical_logistic$1.0$0.01, cyclical_linear$1.0$0.01, cyclical_cosine$1.0$0.01 37 | # anneal_logistic$1.0$0.01, anneal_linear$1.0$0.01, anneal_cosine$1.0$0.01 38 | "lr": 0.001, 39 | "weight_decay": 0.00001, 40 | "max_grad_norm": 8, 41 | 42 | "pretrain_model": "GCN", 43 | 44 | "emb_dim": 128, 45 | "activation_function": "leaky_relu", # sigmoid, softmax, tanh, relu, leaky_relu, prelu, gelu 46 | 47 | "filter_net": "MaxGatedFilterNet", # None, MaxGatedFilterNet 48 | "predict_net": "SumPredictNet", # MeanPredictNet, SumPredictNet, MaxPredictNet, 49 | "predict_net_add_enc": True, 50 | "predict_net_add_degree": True, 51 | 52 | # MeanAttnPredictNet, SumAttnPredictNet, MaxAttnPredictNet, 53 | # MeanMemAttnPredictNet, SumMemAttnPredictNet, MaxMemAttnPredictNet, 54 | # DIAMNet 55 | # "predict_net_add_enc": True, 56 | # "predict_net_add_degree": True, 57 | "txl_graph_num_layers": 3, 58 | "txl_pattern_num_layers": 3, 59 | "txl_d_model": 128, 60 | "txl_d_inner": 128, 61 | 
"txl_n_head": 4, 62 | "txl_d_head": 4, 63 | "txl_pre_lnorm": True, 64 | "txl_tgt_len": 64, 65 | "txl_ext_len": 0, # useless in current settings 66 | "txl_mem_len": 64, 67 | "txl_clamp_len": -1, # max positional embedding index 68 | "txl_attn_type": 0, # 0 for Dai et al, 1 for Shaw et al, 2 for Vaswani et al, 3 for Al Rfou et al. 69 | "txl_same_len": False, 70 | 71 | "gcn_num_bases": 8, 72 | "gcn_regularizer": "bdd", # basis, bdd 73 | "gcn_graph_num_layers": 3, 74 | "gcn_hidden_dim": 128, 75 | "gcn_ignore_norm": False, # ignorm=True -> RGCN-SUM 76 | 77 | "graph_dir": "../data/debug/graphs", 78 | "save_data_dir": "../data/debug", 79 | "save_model_dir": "../dumps/debug", 80 | "save_pretrain_model_dir": "../dumps/MUTAGPreTrain/GCN", 81 | "graphslabel_dir":"../data/debug/graphs", 82 | "downstream_graph_dir": "../data/debug/graphs", 83 | "downstream_save_data_dir": "../data/debug", 84 | "downstream_save_model_dir": "../dumps/debug", 85 | "downstream_graphslabel_dir":"../data/debug/graphs", 86 | "temperature": 0.01, 87 | "graph_finetuning_input_dim": 8, 88 | "graph_finetuning_output_dim": 2, 89 | "graph_label_num":2, 90 | "seed": 0, 91 | "update_pretrain": False, 92 | "dropout": 0.5, 93 | "gcn_output_dim": 8, 94 | 95 | "prompt": "SUM", 96 | "prompt_output_dim": 2, 97 | "scalar": 1e3, 98 | 99 | "dataset_seed": 0, 100 | "train_shotnum": 50, 101 | "val_shotnum": 50, 102 | "few_shot_tasknum": 10, 103 | 104 | "save_fewshot_dir": "../data/FlickrPreTrainNodeClassification/fewshot", 105 | "select_fewshot_dir": ".../data/FlickrPreTrainNodeClassification/select", 106 | "None": True, 107 | 108 | "downstream_dropout": 0, 109 | "node_feature_dim": 18, 110 | "train_label_num": 6, 111 | "val_label_num": 6, 112 | "test_label_num": 6, 113 | "nhop_neighbour": 1 114 | } 115 | 116 | 117 | fewshot_dir = os.path.join(train_config["save_fewshot_dir"], "%s_trainshot_%s_valshot_%s_tasks" % 118 | (train_config["train_shotnum"], train_config["val_shotnum"], 119 | train_config["few_shot_tasknum"])) 
 120 | print(os.path.exists(fewshot_dir)) 121 | print("Load Few Shot") 122 | trainset = np.load(os.path.join(fewshot_dir, "train_dgl_dataset.npy"), allow_pickle=True).tolist() 123 | valset = np.load(os.path.join(fewshot_dir, "val_dgl_dataset.npy"), allow_pickle=True).tolist() 124 | testset = np.load(os.path.join(fewshot_dir, "test_dgl_dataset.npy"), allow_pickle=True).tolist() 125 | save=[0,1,3,8] 126 | rettrain=[] 127 | retval=[] 128 | rettest=[] 129 | for i in save: 130 | rettrain.append(trainset[i]) 131 | retval.append(valset[i]) # index with i so the val task matches the selected train task 132 | rettest.append(testset[i]) # index with i so the test task matches the selected train task 133 | 134 | selectdir = os.path.join(train_config["save_fewshot_dir"], "%s_trainshot_%s_valshot_%s_tasks" % 135 | (train_config["train_shotnum"], train_config["val_shotnum"], 136 | train_config["few_shot_tasknum"])) 137 | 138 | if train_config["None"]: 139 | rettrain = np.array(rettrain) 140 | retval = np.array(retval) 141 | rettest = np.array(rettest) 142 | 143 | else: 144 | trainset = np.load(os.path.join(selectdir, "train_dgl_dataset.npy"), allow_pickle=True).tolist() 145 | valset = np.load(os.path.join(selectdir, "val_dgl_dataset.npy"), allow_pickle=True).tolist() 146 | testset = np.load(os.path.join(selectdir, "test_dgl_dataset.npy"), allow_pickle=True).tolist() 147 | rettrain.append(trainset) # list.append returns None, so do not reassign 148 | retval.append(valset) 149 | rettest.append(testset) 150 | rettrain = np.array(rettrain) 151 | retval = np.array(retval) 152 | rettest = np.array(rettest) 153 | 154 | np.save(os.path.join(fewshot_dir, "train_dgl_dataset"), rettrain) 155 | np.save(os.path.join(fewshot_dir, "val_dgl_dataset"), retval) 156 | np.save(os.path.join(fewshot_dir, "test_dgl_dataset"), rettest) 157 | 158 | -------------------------------------------------------------------------------- /nodedownstream/gat.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function
as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class GAT(torch.nn.Module): 14 | def __init__(self, config): 15 | super(GAT, self).__init__() 16 | 17 | # create networks 18 | # get_emb_dim returns fixed values: 128, 128 (128 is the config value) 19 | # g_net is an n-layer GCN network; g_dim = hidden_dim 20 | self.act=torch.nn.ReLU() 21 | self.g_net, self.g_dim = self.create_net( 22 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 23 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 24 | self.num_layers_num=config["gcn_graph_num_layers"] 25 | 26 | # create predict layers 27 | # these two if statements add, on top of the embedding network, the number of dimensions of the pattern and graph inputs to predict 28 | 29 | def create_net(self, name, input_dim, **kw): 30 | num_layers = kw.get("num_layers", 1) 31 | hidden_dim = kw.get("hidden_dim", 64) 32 | num_rels = kw.get("num_rels", 1) 33 | num_bases = kw.get("num_bases", 8) 34 | regularizer = kw.get("regularizer", "basis") 35 | dropout = kw.get("dropout", 0.5) 36 | 37 | 38 | self.convs = torch.nn.ModuleList() 39 | 40 | gat1=dgl.nn.pytorch.conv.GATConv(in_feats=input_dim, out_feats=hidden_dim,num_heads=4,allow_zero_in_degree=True) 41 | gat2=dgl.nn.pytorch.conv.GATConv(in_feats=4*hidden_dim, out_feats=hidden_dim,num_heads=1,allow_zero_in_degree=True) 42 | 43 | self.convs.append(gat1) 44 | self.convs.append(gat2) 45 | 46 | return self.convs, hidden_dim 47 | 48 | 49 | #def forward(self, pattern, pattern_len, graph, graph_len): 50 | def forward(self, graph, graph_len): 51 | #bsz = pattern_len.size(0) 52 | # filter_gate selects a mask of the graph nodes that are irrelevant to the isomorphism 53 | #gate = self.get_filter_gate(pattern, pattern_len, graph, graph_len) 54 | graph_output = graph.ndata["feature"] 55 | #xs = [] 56 | graph_output=F.relu(self.convs[0](graph,graph_output)) 57 | 
graph_output=graph_output.resize(graph_output.size(0),1,graph_output.size(1)*graph_output.size(2)).squeeze() 58 | graph_output=F.relu(self.convs[1](graph,graph_output)).squeeze() 59 | # for i in range(self.num_layers_num): 60 | # graph_output = F.relu(self.convs[i](graph,graph_output)) 61 | # xs.append(graph_output) 62 | #xpool= [] 63 | graph_embedding = graph_output 64 | graph_embedding = torch.sum(graph_embedding, dim=1) 65 | return graph_embedding,graph_output 66 | # for x in xs: 67 | # graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 68 | # graph_embedding = torch.sum(graph_embedding, dim=1) 69 | # xpool.append(graph_embedding) 70 | # x = torch.cat(xpool, -1) 71 | # return x,torch.cat(xs, -1) 72 | -------------------------------------------------------------------------------- /nodedownstream/gcl.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | from layers import AvgReadout, Discriminator, Discriminator2 4 | import pdb 5 | from gin_local import GIN 6 | from utils import split_and_batchify_graph_feats 7 | class GCL(nn.Module): 8 | def __init__(self, n_in, n_h,config): 9 | super(GCL, self).__init__() 10 | self.read = AvgReadout() 11 | self.sigm = nn.Sigmoid() 12 | self.disc = Discriminator(n_h*3) 13 | # self.disc2 = Discriminator2(n_h) 14 | self.gin = GIN(config) 15 | self.config = config 16 | 17 | 18 | def forward(self, graph,graph_shuf, graph_1, graph_2, graph_len, graph_len_1, graph_len_2, sparse, msk, samp_bias1, samp_bias2, aug_type): 19 | 20 | c_0,h_0 = self.gin(graph, graph_len) 21 | 22 | if aug_type == 'edge': 23 | 24 | # h_1 = self.gcn(seq1, aug_adj1, sparse) 25 | # h_3 = self.gcn(seq1, aug_adj2, sparse) 26 | pass 27 | 28 | elif aug_type == 'mask': 29 | 30 | # h_1 = self.gcn(seq3, adj, sparse) 31 | # h_3 = self.gcn(seq4, adj, sparse) 32 | pass 33 | 34 | elif aug_type == 'node' or aug_type == 'subgraph': 35 | 36 | c_1,h_1 = self.gin(graph_1, graph_len_1) 37 
| c_3,h_3 = self.gin(graph_2, graph_len_2) 38 | 39 | else: 40 | assert False 41 | 42 | 43 | c_2,h_2 = self.gin(graph_shuf,graph_len) 44 | h_0,h_1,h_2,h_3 = self.sigm(h_0),self.sigm(h_1),self.sigm(h_2),self.sigm(h_3) 45 | 46 | 47 | # len_1 = int(h_0.shape[1]) 48 | # len_2 = int(h_2.shape[1]) 49 | # ret1 = self.disc(c_1, h_0, h_2, samp_bias1, samp_bias2) 50 | # ret2 = self.disc(c_3, h_0, h_2, samp_bias1, samp_bias2) 51 | device = self.config["gpu_id"] 52 | # ret = ret1 + ret2 53 | # return ret,int(h_0.shape[1]),int(h_2.shape[1]) 54 | graph.add_self_loop() 55 | graph_1.add_self_loop() 56 | graph_2.add_self_loop() 57 | adj = graph.adjacency_matrix() 58 | adj_1 = graph_1.adjacency_matrix() 59 | adj_2 = graph_2.adjacency_matrix() 60 | adj = adj.to(device) 61 | adj_1 = adj_1.to(device) 62 | adj_2 = adj_2.to(device) 63 | # for count in range(self.config["pretrain_hop_num"]): 64 | # h_0 = torch.matmul(adj, h_0) 65 | # h_1 = torch.matmul(adj_1, h_1) 66 | # h_2 = torch.matmul(adj, h_2) 67 | # h_3 = torch.matmul(adj_2, h_3) 68 | return h_0,h_1,h_2,h_3 69 | 70 | # Detach the return variables 71 | def embed(self, graph, graph_len): 72 | return self.gin(graph,graph_len) 73 | 74 | 75 | 76 | 77 | -------------------------------------------------------------------------------- /nodedownstream/gcn.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class GCN(torch.nn.Module): 14 | def __init__(self, config): 15 | super(GCN, self).__init__() 16 | 17 | # create networks 18 | # get_emb_dim 返回固定值:128,128(128为config值) 19 | # g_net为n层gcn网络,g_dim=hidden_dim 20 | self.act=torch.nn.ReLU() 21 | 
self.g_net, g_dim = self.create_net( 22 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 23 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 24 | self.num_layers_num=config["gcn_graph_num_layers"] 25 | 26 | # create predict layers 27 | # these two if statements add, on top of the embedding network, the number of dimensions of the pattern and graph inputs to predict 28 | 29 | def create_net(self, name, input_dim, **kw): 30 | num_layers = kw.get("num_layers", 1) 31 | hidden_dim = kw.get("hidden_dim", 64) 32 | num_rels = kw.get("num_rels", 1) 33 | num_bases = kw.get("num_bases", 8) 34 | regularizer = kw.get("regularizer", "basis") 35 | dropout = kw.get("dropout", 0.5) 36 | 37 | 38 | self.convs = torch.nn.ModuleList() 39 | 40 | for i in range(num_layers): 41 | 42 | if i: 43 | conv = dgl.nn.pytorch.conv.GraphConv(in_feats=hidden_dim, out_feats=hidden_dim,allow_zero_in_degree=True) 44 | else: 45 | conv = dgl.nn.pytorch.conv.GraphConv(in_feats=input_dim, out_feats=hidden_dim,allow_zero_in_degree=True) 46 | 47 | self.convs.append(conv) 48 | 49 | return self.convs, hidden_dim 50 | 51 | 52 | #def forward(self, pattern, pattern_len, graph, graph_len): 53 | def forward(self, graph, graph_len): 54 | #bsz = pattern_len.size(0) 55 | # filter_gate selects a mask of the graph nodes that are irrelevant to the isomorphism 56 | #gate = self.get_filter_gate(pattern, pattern_len, graph, graph_len) 57 | graph_output = graph.ndata["feature"] 58 | xs = [] 59 | for i in range(self.num_layers_num): 60 | graph_output = F.relu(self.convs[i](graph,graph_output)) 61 | xs.append(graph_output) 62 | xpool= [] 63 | for x in xs: 64 | graph_embedding = x 65 | graph_embedding = torch.sum(graph_embedding, dim=1) 66 | xpool.append(graph_embedding) 67 | x = torch.cat(xpool, -1) 68 | return x,torch.cat(xs, -1) 69 | -------------------------------------------------------------------------------- /nodedownstream/gin_downstream.py: -------------------------------------------------------------------------------- 1 | import
torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | from node_prompt_layer import node_prompt_layer_linear_mean,node_prompt_layer_linear_sum,\ 12 | node_prompt_layer_feature_weighted_mean,node_prompt_layer_feature_weighted_sum,node_prompt_layer_sum 13 | 14 | class GIN_P(torch.nn.Module): 15 | def __init__(self, config): 16 | super(GIN_P, self).__init__() 17 | 18 | # create networks 19 | # get_emb_dim returns fixed values: 128, 128 (128 is the config value) 20 | # g_net is an n-layer GCN network; g_dim = hidden_dim 21 | self.act=torch.nn.ReLU() 22 | self.convs, self.bns, g_dim ,self.prompts = self.create_net( 23 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 24 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"],node_feature_dim = config["node_feature_dim"],) 25 | self.num_layers_num=config["gcn_graph_num_layers"] 26 | self.dropout=torch.nn.Dropout(p=config["dropout"]) 27 | 28 | # create predict layers 29 | # these two if statements add, on top of the embedding network, the number of dimensions of the pattern and graph inputs to predict 30 | 31 | def create_net(self, name, input_dim, **kw): 32 | num_layers = kw.get("num_layers", 1) 33 | hidden_dim = kw.get("hidden_dim", 64) 34 | num_rels = kw.get("num_rels", 1) 35 | num_bases = kw.get("num_bases", 8) 36 | regularizer = kw.get("regularizer", "basis") 37 | dropout = kw.get("dropout", 0.5) 38 | feature_dim = kw.get("node_feature_dim",64) 39 | 40 | convs = torch.nn.ModuleList() 41 | bns = torch.nn.ModuleList() 42 | prompts = torch.nn.ModuleList() 43 | a = int((hidden_dim * num_layers)/3) 44 | prompt_1 = node_prompt_layer_feature_weighted_sum(feature_dim) 45 | prompt_2 = node_prompt_layer_feature_weighted_sum(a) 46 | prompt_3 =
node_prompt_layer_feature_weighted_sum(a) 47 | prompt_4 = node_prompt_layer_feature_weighted_sum(hidden_dim * num_layers) 48 | prompts.append(prompt_1) 49 | prompts.append(prompt_2) 50 | prompts.append(prompt_3) 51 | prompts.append(prompt_4) 52 | 53 | for i in range(num_layers): 54 | 55 | if i: 56 | nn = torch.nn.Sequential(torch.nn.Linear(hidden_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 57 | else: 58 | nn = torch.nn.Sequential(torch.nn.Linear(input_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 59 | conv = dgl.nn.pytorch.conv.GINConv(apply_func=nn,aggregator_type='sum') 60 | bn = torch.nn.BatchNorm1d(hidden_dim) 61 | 62 | convs.append(conv) 63 | bns.append(bn) 64 | 65 | return convs, bns, hidden_dim,prompts 66 | 67 | 68 | #def forward(self, pattern, pattern_len, graph, graph_len): 69 | def forward(self, graph, graph_len,prompt_id,scalar): 70 | graph_output = graph.ndata["feature"] 71 | xs = [] 72 | if prompt_id == 0: 73 | graph_output = self.prompts[0](graph_output,graph_len) 74 | for i in range(self.num_layers_num): 75 | graph_output = F.relu(self.convs[i](graph,graph_output)) 76 | graph_output = self.bns[i](graph_output) 77 | graph_output = self.dropout(graph_output) 78 | xs.append(graph_output) 79 | 80 | xpool= [] 81 | for x in xs: 82 | graph_embedding = x 83 | graph_embedding = torch.sum(graph_embedding, dim=1) 84 | xpool.append(graph_embedding) 85 | x = torch.cat(xpool, -1) 86 | #x is graph level embedding; xs is node level embedding 87 | embedding = torch.cat(xs, -1) 88 | # embedding =split_and_batchify_graph_feats(embedding, graph_len)[0] 89 | # embedding =embedding.mean(dim=1) 90 | return x,embedding 91 | elif prompt_id ==1: 92 | for i in range(self.num_layers_num): 93 | graph_output = F.relu(self.convs[i](graph,graph_output)) 94 | graph_output = self.bns[i](graph_output) 95 | graph_output = self.dropout(graph_output) 96 | xs.append(graph_output) 97 | if i ==0: 98 | graph_output = 
self.prompts[1](graph_output,graph.number_of_nodes()) 99 | 100 | 101 | xpool= [] 102 | for x in xs: 103 | graph_embedding = x 104 | graph_embedding = torch.sum(graph_embedding, dim=1) 105 | xpool.append(graph_embedding) 106 | x = torch.cat(xpool, -1) 107 | #x is graph level embedding; xs is node level embedding 108 | embedding = torch.cat(xs, -1) 109 | # embedding =split_and_batchify_graph_feats(embedding, graph_len)[0] 110 | # embedding =embedding.mean(dim=1) 111 | return x,embedding 112 | elif prompt_id ==2: 113 | for i in range(self.num_layers_num): 114 | graph_output = F.relu(self.convs[i](graph,graph_output)) 115 | graph_output = self.bns[i](graph_output) 116 | graph_output = self.dropout(graph_output) 117 | xs.append(graph_output) 118 | if i ==1: 119 | graph_output = self.prompts[2](graph_output,graph.number_of_nodes()) 120 | 121 | 122 | xpool= [] 123 | for x in xs: 124 | graph_embedding = x 125 | graph_embedding = torch.sum(graph_embedding, dim=1) 126 | xpool.append(graph_embedding) 127 | x = torch.cat(xpool, -1) 128 | embedding = torch.cat(xs, -1) 129 | # embedding =split_and_batchify_graph_feats(embedding, graph_len)[0] 130 | # embedding =embedding.mean(dim=1) 131 | #x is graph level embedding; xs is node level embedding 132 | return x,embedding 133 | else: 134 | for i in range(self.num_layers_num): 135 | graph_output = F.relu(self.convs[i](graph,graph_output)) 136 | graph_output = self.bns[i](graph_output) 137 | graph_output = self.dropout(graph_output) 138 | xs.append(graph_output) 139 | 140 | 141 | 142 | xpool= [] 143 | for x in xs: 144 | graph_embedding = x 145 | graph_embedding = torch.sum(graph_embedding, dim=1) 146 | xpool.append(graph_embedding) 147 | x = torch.cat(xpool, -1) 148 | embedding = torch.cat(xs, -1) 149 | embedding = self.prompts[3](embedding,0) 150 | 151 | #x is graph level embedding; xs is node level embedding 152 | return x,embedding 153 | 154 | -------------------------------------------------------------------------------- 
/nodedownstream/gin_flickr_DGI.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class GIN(torch.nn.Module): 14 | def __init__(self, config): 15 | super(GIN, self).__init__() 16 | 17 | self.act=torch.nn.ReLU() 18 | self.g_net, self.bns, g_dim = self.create_net( 19 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 20 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 21 | self.num_layers_num=config["gcn_graph_num_layers"] 22 | self.dropout=torch.nn.Dropout(p=config["dropout"]) 23 | 24 | 25 | def create_net(self, name, input_dim, **kw): 26 | num_layers = kw.get("num_layers", 1) 27 | hidden_dim = kw.get("hidden_dim", 64) 28 | num_rels = kw.get("num_rels", 1) 29 | num_bases = kw.get("num_bases", 8) 30 | regularizer = kw.get("regularizer", "basis") 31 | dropout = kw.get("dropout", 0.5) 32 | 33 | 34 | self.convs = torch.nn.ModuleList() 35 | self.bns = torch.nn.ModuleList() 36 | 37 | for i in range(num_layers): 38 | 39 | if i: 40 | nn = torch.nn.Sequential(torch.nn.Linear(hidden_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 41 | else: 42 | nn = torch.nn.Sequential(torch.nn.Linear(input_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 43 | conv = dgl.nn.pytorch.conv.GINConv(apply_func=nn,aggregator_type='sum') 44 | bn = torch.nn.BatchNorm1d(hidden_dim) 45 | 46 | self.convs.append(conv) 47 | self.bns.append(bn) 48 | 49 | return self.convs, self.bns, hidden_dim 50 | 51 | 52 | #def forward(self, pattern, pattern_len, graph, 
graph_len): 53 | def forward(self, graph, graph_len): 54 | graph_output = graph.ndata["feature"] 55 | xs = [] 56 | for i in range(self.num_layers_num): 57 | graph_output = F.relu(self.convs[i](graph,graph_output)) 58 | graph_output = self.bns[i](graph_output) 59 | graph_output = self.dropout(graph_output) 60 | xs.append(graph_output) 61 | xpool= [] 62 | for x in xs: 63 | 64 | # graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 65 | # else: 66 | graph_embedding=x 67 | # print(graph_embedding.size()) 68 | graph_embedding = torch.sum(graph_embedding, dim=0) 69 | 70 | # print(graph_embedding.size()) 71 | xpool.append(graph_embedding) 72 | x = torch.cat(xpool, -1) 73 | #x is graph level embedding; xs is node level embedding 74 | return x,torch.cat(xs, -1) 75 | -------------------------------------------------------------------------------- /nodedownstream/gin_local.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class GIN(torch.nn.Module): 14 | def __init__(self, config): 15 | super(GIN, self).__init__() 16 | 17 | self.act=torch.nn.ReLU() 18 | self.g_net, self.bns, g_dim = self.create_net( 19 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 20 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 21 | self.num_layers_num=config["gcn_graph_num_layers"] 22 | self.dropout=torch.nn.Dropout(p=config["dropout"]) 23 | 24 | 25 | def create_net(self, name, input_dim, **kw): 26 | num_layers = kw.get("num_layers", 1) 27 | hidden_dim = kw.get("hidden_dim", 64) 28 
| num_rels = kw.get("num_rels", 1) 29 | num_bases = kw.get("num_bases", 8) 30 | regularizer = kw.get("regularizer", "basis") 31 | dropout = kw.get("dropout", 0.5) 32 | 33 | 34 | self.convs = torch.nn.ModuleList() 35 | self.bns = torch.nn.ModuleList() 36 | 37 | for i in range(num_layers): 38 | 39 | if i: 40 | nn = torch.nn.Sequential(torch.nn.Linear(hidden_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 41 | else: 42 | nn = torch.nn.Sequential(torch.nn.Linear(input_dim, hidden_dim), self.act, torch.nn.Linear(hidden_dim, hidden_dim)) 43 | conv = dgl.nn.pytorch.conv.GINConv(apply_func=nn,aggregator_type='sum') 44 | bn = torch.nn.BatchNorm1d(hidden_dim) 45 | 46 | self.convs.append(conv) 47 | self.bns.append(bn) 48 | 49 | return self.convs, self.bns, hidden_dim 50 | 51 | 52 | #def forward(self, pattern, pattern_len, graph, graph_len): 53 | def forward(self, graph, graph_len): 54 | graph_output = graph.ndata["feature"] 55 | xs = [] 56 | for i in range(self.num_layers_num): 57 | graph_output = F.relu(self.convs[i](graph,graph_output)) 58 | graph_output = self.bns[i](graph_output) 59 | graph_output = self.dropout(graph_output) 60 | xs.append(graph_output) 61 | xpool= [] 62 | for x in xs: 63 | 64 | # graph_embedding = split_and_batchify_graph_feats(x, graph_len)[0] 65 | # else: 66 | graph_embedding=x 67 | # print(graph_embedding.size()) 68 | graph_embedding = torch.sum(graph_embedding, dim=1) 69 | # print(graph_embedding.size()) 70 | xpool.append(graph_embedding) 71 | x = torch.cat(xpool, -1) 72 | #x is graph level embedding; xs is node level embedding 73 | return x,torch.cat(xs, -1) 74 | -------------------------------------------------------------------------------- /nodedownstream/graph_finetuning_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import 
partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | import math 11 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 12 | 13 | class graph_finetuning_layer(nn.Module): 14 | def __init__(self, input_dim,output_dim): 15 | super(graph_finetuning_layer, self).__init__() 16 | self.linear=torch.nn.Linear(input_dim,output_dim) 17 | #self.dropout=torch.nn.Dropout(0.2) 18 | 19 | 20 | def forward(self,graph_embedding, graph_len): 21 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 22 | graph_embedding=torch.sum(graph_embedding,dim=1) 23 | #not the follows problem 24 | graph_embedding=self.linear(graph_embedding) 25 | #graph_embedding=torch.nn.functional.normalize(graph_embedding,dim=1) 26 | #graph_embedding=F.leaky_relu(graph_embedding,0.2) 27 | result = F.log_softmax(graph_embedding, dim=1) 28 | return result 29 | 30 | 31 | '''class graph_finetuning_layer(nn.Module): 32 | def __init__(self, input_dim,output_dim): 33 | super(graph_finetuning_layer, self).__init__() 34 | self.linear=torch.nn.Linear(input_dim,output_dim) 35 | self.softmax=torch.nn.Softmax(dim=1) 36 | 37 | def forward(self,graph_embedding, graph_len): 38 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 39 | graph_embedding=torch.sum(graph_embedding,dim=1) 40 | graph_embedding=self.linear(graph_embedding) 41 | graph_embedding=F.leaky_relu(graph_embedding) 42 | graph_embedding=F.log_softmax(graph_embedding,dim=1) 43 | #graph_embedding=F.softmax(graph_embedding,dim=1) 44 | #graph_embedding=torch.argmax(graph_embedding,dim=1,keepdim=True).float() 45 | #result=self.softmax(F.leaky_relu(graph_embedding)) 46 | #index=result.permute(1,0)[0] 47 | #index=index.unsqueeze(dim=1) 48 | index=torch.argmax(graph_embedding,dim=1,keepdim=True).float() 49 | index.requires_grad_(True) 50 | print(index.requires_grad) 51 | #return result 52 | #return graph_embedding 53 | return index''' 
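The fine-tuning head in `graph_finetuning_layer.py` sum-pools the node embeddings of each graph, applies one linear layer, and returns log-probabilities. The following is a dependency-free sketch of that data flow, not the repository's implementation. It assumes that `split_and_batchify_graph_feats` (defined in `utils`, which is not shown here) groups consecutive node rows according to `graph_len`. The helper names `sum_pool_per_graph`, `linear`, `log_softmax`, and `finetune_head` are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def sum_pool_per_graph(node_feats, graph_len):
    """Sum node feature rows graph by graph.

    node_feats: list of per-node feature lists, graphs stored back to back.
    graph_len:  number of nodes in each graph, in the same order.
    """
    if sum(graph_len) != len(node_feats):
        raise ValueError("graph_len must account for every node row")
    pooled, start = [], 0
    for n in graph_len:
        rows = node_feats[start:start + n]
        # zip(*rows) walks the feature columns of this graph's nodes
        pooled.append([sum(col) for col in zip(*rows)])
        start += n
    return pooled


def linear(x, weight, bias):
    """Compute W x + b for one vector; weight is shaped [out_dim][in_dim]."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def log_softmax(logits):
    """Numerically stable log-softmax over one vector."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def finetune_head(node_feats, graph_len, weight, bias):
    """Pool each graph, project it to class logits, return log-probabilities."""
    pooled = sum_pool_per_graph(node_feats, graph_len)
    return [log_softmax(linear(g, weight, bias)) for g in pooled]


if __name__ == "__main__":
    # Two graphs: the first has two nodes, the second has one.
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    for row in finetune_head(feats, [2, 1], identity, [0.0, 0.0]):
        print(row)
```

Returning log-probabilities rather than probabilities matches the `F.log_softmax` call in the layer above, which is the form a negative log-likelihood loss expects.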
-------------------------------------------------------------------------------- /nodedownstream/graph_prompt_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | #use prompt to finish step 1 13 | class graph_prompt_layer_mean(nn.Module): 14 | def __init__(self): 15 | super(graph_prompt_layer_mean, self).__init__() 16 | #assign a placeholder weight that does not need to be changed for now 17 | self.weight= torch.nn.Parameter(torch.Tensor(2, 2)) 18 | def forward(self, graph_embedding, graph_len): 19 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 20 | #prompt: mean 21 | graph_prompt_result=graph_embedding.mean(dim=1) 22 | return graph_prompt_result 23 | 24 | class node_prompt_layer_linear_mean(nn.Module): 25 | def __init__(self,input_dim,output_dim): 26 | super(node_prompt_layer_linear_mean, self).__init__() 27 | #assign a placeholder weight that does not need to be changed for now 28 | self.linear=torch.nn.Linear(input_dim,output_dim) 29 | 30 | def forward(self, graph_embedding, graph_len): 31 | graph_embedding=self.linear(graph_embedding) 32 | #graph_prompt_result=graph_embedding.mean(dim=1) 33 | #graph_prompt_result=torch.nn.functional.normalize(graph_prompt_result,dim=1) 34 | return graph_embedding 35 | 36 | class node_prompt_layer_linear_sum(nn.Module): 37 | def __init__(self,input_dim,output_dim): 38 | super(node_prompt_layer_linear_sum, self).__init__() 39 | #assign a placeholder weight that does not need to be changed for now 40 | self.linear=torch.nn.Linear(input_dim,output_dim) 41 | 42 | def forward(self, graph_embedding, graph_len): 43 | graph_embedding=self.linear(graph_embedding) 44 | #graph_prompt_result=torch.nn.functional.normalize(graph_embedding,dim=1) 45 | return graph_embedding 46 | 47 | 
48 | 49 | #sum result is same as mean result 50 | class graph_prompt_layer_sum(nn.Module): 51 | def __init__(self): 52 | super(graph_prompt_layer_sum, self).__init__() 53 | #first assign a weight that does not need to be changed 54 | self.weight= torch.nn.Parameter(torch.Tensor(2, 2)) 55 | def forward(self, graph_embedding, graph_len): 56 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 57 | #prompt: sum 58 | graph_prompt_result=graph_embedding.sum(dim=1) 59 | return graph_prompt_result 60 | 61 | #7.26 error version: the algorithm is wrong and cannot be explained, but its result may still be useful 62 | '''class graph_prompt_layer_weighted(nn.Module): 63 | def __init__(self,max_n_num): 64 | super(graph_prompt_layer_weighted, self).__init__() 65 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 66 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 67 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 68 | self.max_n_num=max_n_num 69 | self.reset_parameters() 70 | def reset_parameters(self): 71 | torch.nn.init.xavier_uniform_(self.weight) 72 | def forward(self, graph_embedding, graph_len): 73 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 74 | weight = self.weight[0][0:graph_embedding.size(1)] 75 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 76 | temp1 = weight * temp1 77 | graph_embedding=torch.matmul(graph_embedding,temp1) 78 | #prompt: mean 79 | graph_prompt_result=graph_embedding.mean(dim=1) 80 | return graph_prompt_result''' 81 | 82 | 83 | class graph_prompt_layer_weighted(nn.Module): 84 | def __init__(self,max_n_num): 85 | super(graph_prompt_layer_weighted, self).__init__() 86 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 87 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 88 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 89 | self.max_n_num=max_n_num 90 | self.reset_parameters() 91 |
def reset_parameters(self): 92 | torch.nn.init.xavier_uniform_(self.weight) 93 | def forward(self, graph_embedding, graph_len): 94 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 95 | weight = self.weight[0][0:graph_embedding.size(1)] 96 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 97 | temp1 = weight * temp1 98 | temp1 = temp1.permute(0, 2, 1) 99 | graph_embedding=graph_embedding*temp1 100 | #prompt: sum 101 | graph_prompt_result=graph_embedding.sum(dim=1) 102 | return graph_prompt_result 103 | 104 | class node_prompt_layer_feature_weighted_mean(nn.Module): 105 | def __init__(self,input_dim): 106 | super(node_prompt_layer_feature_weighted_mean, self).__init__() 107 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 108 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 109 | self.weight= torch.nn.Parameter(torch.Tensor(1,input_dim)) 110 | self.max_n_num=input_dim 111 | self.reset_parameters() 112 | def reset_parameters(self): 113 | torch.nn.init.xavier_uniform_(self.weight) 114 | def forward(self, graph_embedding, graph_len): 115 | graph_embedding=graph_embedding*self.weight 116 | #prompt: mean 117 | return graph_embedding 118 | 119 | class node_prompt_layer_feature_weighted_sum(nn.Module): 120 | def __init__(self,input_dim): 121 | super(node_prompt_layer_feature_weighted_sum, self).__init__() 122 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 123 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 124 | self.weight= torch.nn.Parameter(torch.Tensor(1,input_dim)) 125 | self.max_n_num=input_dim 126 | self.reset_parameters() 127 | def reset_parameters(self): 128 | torch.nn.init.xavier_uniform_(self.weight) 129 | def forward(self, graph_embedding, graph_len): 130 | graph_embedding=graph_embedding*self.weight 131 | #prompt: mean 132 | return
graph_embedding 133 | 134 | class graph_prompt_layer_weighted_matrix(nn.Module): 135 | def __init__(self,max_n_num,input_dim): 136 | super(graph_prompt_layer_weighted_matrix, self).__init__() 137 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 138 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 139 | self.weight= torch.nn.Parameter(torch.Tensor(input_dim,max_n_num)) 140 | self.max_n_num=max_n_num 141 | self.reset_parameters() 142 | def reset_parameters(self): 143 | torch.nn.init.xavier_uniform_(self.weight) 144 | def forward(self, graph_embedding, graph_len): 145 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 146 | weight = self.weight.permute(1, 0)[0:graph_embedding.size(1)] 147 | weight = weight.expand(graph_embedding.size(0), weight.size(0), weight.size(1)) 148 | graph_embedding = graph_embedding * weight 149 | #prompt: sum 150 | graph_prompt_result=graph_embedding.sum(dim=1) 151 | return graph_prompt_result 152 | 153 | class graph_prompt_layer_weighted_linear(nn.Module): 154 | def __init__(self,max_n_num,input_dim,output_dim): 155 | super(graph_prompt_layer_weighted_linear, self).__init__() 156 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 157 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 158 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 159 | self.linear=nn.Linear(input_dim,output_dim) 160 | self.max_n_num=max_n_num 161 | self.reset_parameters() 162 | def reset_parameters(self): 163 | torch.nn.init.xavier_uniform_(self.weight) 164 | def forward(self, graph_embedding, graph_len): 165 | graph_embedding=self.linear(graph_embedding) 166 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 167 | weight = self.weight[0][0:graph_embedding.size(1)] 168 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2),
graph_embedding.size(1)).to(graph_embedding.device) 169 | temp1 = weight * temp1 170 | temp1 = temp1.permute(0, 2, 1) 171 | graph_embedding=graph_embedding*temp1 172 | #prompt: mean 173 | graph_prompt_result = graph_embedding.mean(dim=1) 174 | return graph_prompt_result 175 | 176 | class graph_prompt_layer_weighted_matrix_linear(nn.Module): 177 | def __init__(self,max_n_num,input_dim,output_dim): 178 | super(graph_prompt_layer_weighted_matrix_linear, self).__init__() 179 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 180 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 181 | self.weight= torch.nn.Parameter(torch.Tensor(output_dim,max_n_num)) 182 | self.linear=nn.Linear(input_dim,output_dim) 183 | self.max_n_num=max_n_num 184 | self.reset_parameters() 185 | def reset_parameters(self): 186 | torch.nn.init.xavier_uniform_(self.weight) 187 | def forward(self, graph_embedding, graph_len): 188 | graph_embedding=self.linear(graph_embedding) 189 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 190 | weight = self.weight.permute(1, 0)[0:graph_embedding.size(1)] 191 | weight = weight.expand(graph_embedding.size(0), weight.size(0), weight.size(1)) 192 | graph_embedding = graph_embedding * weight 193 | #prompt: mean 194 | graph_prompt_result=graph_embedding.mean(dim=1) 195 | return graph_prompt_result 196 | -------------------------------------------------------------------------------- /nodedownstream/graphsage.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | 13 | class
Graphsage(torch.nn.Module): 14 | def __init__(self, config): 15 | super(Graphsage, self).__init__() 16 | 17 | # create networks 18 | # get_emb_dim returns fixed values: 128,128 (128 is the config value) 19 | # g_net is an n-layer gcn network, g_dim=hidden_dim 20 | self.act=torch.nn.ReLU() 21 | self.g_net, g_dim = self.create_net( 22 | name="graph", input_dim=config["node_feature_dim"], hidden_dim=config["gcn_hidden_dim"], 23 | num_layers=config["gcn_graph_num_layers"], num_bases=config["gcn_num_bases"], regularizer=config["gcn_regularizer"]) 24 | self.num_layers_num=config["gcn_graph_num_layers"] 25 | 26 | # create predict layers 27 | # these two if statements add the pattern and graph input dimensions for predict on top of the embedding network 28 | 29 | def create_net(self, name, input_dim, **kw): 30 | num_layers = kw.get("num_layers", 1) 31 | hidden_dim = kw.get("hidden_dim", 64) 32 | num_rels = kw.get("num_rels", 1) 33 | num_bases = kw.get("num_bases", 8) 34 | regularizer = kw.get("regularizer", "basis") 35 | dropout = kw.get("dropout", 0.5) 36 | 37 | 38 | self.convs = torch.nn.ModuleList() 39 | 40 | for i in range(num_layers): 41 | 42 | if i: 43 | conv = dgl.nn.pytorch.conv.SAGEConv(in_feats=hidden_dim, out_feats=hidden_dim,aggregator_type="gcn") 44 | else: 45 | conv = dgl.nn.pytorch.conv.SAGEConv(in_feats=input_dim, out_feats=hidden_dim,aggregator_type="gcn") 46 | 47 | self.convs.append(conv) 48 | 49 | return self.convs, hidden_dim 50 | 51 | 52 | #def forward(self, pattern, pattern_len, graph, graph_len): 53 | def forward(self, graph, graph_len): 54 | #bsz = pattern_len.size(0) 55 | # filter_gate selects the mask of the nodes in the graph that are irrelevant to the isomorphism 56 | #gate = self.get_filter_gate(pattern, pattern_len, graph, graph_len) 57 | graph_output = graph.ndata["feature"] 58 | xs = [] 59 | for i in range(self.num_layers_num): 60 | graph_output = F.relu(self.convs[i](graph,graph_output)) 61 | xs.append(graph_output) 62 | xpool= [] 63 | for x in xs: 64 | graph_embedding = x 65 | graph_embedding = torch.sum(graph_embedding, dim=1) 66 | xpool.append(graph_embedding) 67 | x = torch.cat(xpool, -1) 68 |
return x,torch.cat(xs, -1) 69 | -------------------------------------------------------------------------------- /nodedownstream/layers/__init__.py: -------------------------------------------------------------------------------- 1 | from .gcn import GCN 2 | from .readout import AvgReadout 3 | from .discriminator import Discriminator 4 | from .discriminator2 import Discriminator2
-------------------------------------------------------------------------------- /nodedownstream/layers/discriminator.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class Discriminator(nn.Module): 5 | def __init__(self, n_h): 6 | super(Discriminator, self).__init__() 7 | self.f_k = nn.Bilinear(n_h, n_h, 1) 8 | 9 | for m in self.modules(): 10 | self.weights_init(m) 11 | 12 | def weights_init(self, m): 13 | if isinstance(m, nn.Bilinear): 14 | torch.nn.init.xavier_uniform_(m.weight.data) 15 | if m.bias is not None: 16 | m.bias.data.fill_(0.0) 17 | 18 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 19 | c_x = torch.unsqueeze(c, 1) 20 | c_x = c_x.expand_as(h_pl) 21 | 22 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2) 23 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2) 24 | 25 | if s_bias1 is not None: 26 | sc_1 += s_bias1 27 | if s_bias2 is not None: 28 | sc_2 += s_bias2 29 | 30 | logits = torch.cat((sc_1, sc_2), 1) 31 | 32 | return logits 33 | 34 | -------------------------------------------------------------------------------- /nodedownstream/layers/discriminator2.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class Discriminator2(nn.Module): 5 | def __init__(self, n_h): 6 | super(Discriminator2, self).__init__() 7 | self.f_k = nn.Bilinear(n_h, n_h, 1) 8 | 9 | for m in self.modules(): 10 | self.weights_init(m) 11 | 12 | def weights_init(self, m): 13 | if isinstance(m, nn.Bilinear): 14 |
torch.nn.init.xavier_uniform_(m.weight.data) 15 | if m.bias is not None: 16 | m.bias.data.fill_(0.0) 17 | 18 | def forward(self, c, h_pl, h_mi, s_bias1=None, s_bias2=None): 19 | # c_x = torch.unsqueeze(c, 1) 20 | # c_x = c_x.expand_as(h_pl) 21 | c_x = c 22 | sc_1 = torch.squeeze(self.f_k(h_pl, c_x), 2) 23 | sc_2 = torch.squeeze(self.f_k(h_mi, c_x), 2) 24 | 25 | if s_bias1 is not None: 26 | sc_1 += s_bias1 27 | if s_bias2 is not None: 28 | sc_2 += s_bias2 29 | 30 | logits = torch.cat((sc_1, sc_2), 1) 31 | 32 | return logits 33 | 34 | -------------------------------------------------------------------------------- /nodedownstream/layers/gcn.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class GCN(nn.Module): 5 | def __init__(self, in_ft, out_ft, act, bias=True): 6 | super(GCN, self).__init__() 7 | self.fc = nn.Linear(in_ft, out_ft, bias=False) 8 | self.act = nn.PReLU() if act == 'prelu' else act 9 | 10 | if bias: 11 | self.bias = nn.Parameter(torch.FloatTensor(out_ft)) 12 | self.bias.data.fill_(0.0) 13 | else: 14 | self.register_parameter('bias', None) 15 | 16 | for m in self.modules(): 17 | self.weights_init(m) 18 | 19 | def weights_init(self, m): 20 | if isinstance(m, nn.Linear): 21 | torch.nn.init.xavier_uniform_(m.weight.data) 22 | if m.bias is not None: 23 | m.bias.data.fill_(0.0) 24 | 25 | # Shape of seq: (batch, nodes, features) 26 | def forward(self, seq, adj, sparse=False): 27 | seq_fts = self.fc(seq) 28 | if sparse: 29 | out = torch.unsqueeze(torch.spmm(adj, torch.squeeze(seq_fts, 0)), 0) 30 | else: 31 | out = torch.bmm(adj, seq_fts) 32 | if self.bias is not None: 33 | out += self.bias 34 | 35 | return self.act(out) 36 | 37 | -------------------------------------------------------------------------------- /nodedownstream/layers/readout.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | 
# Applies an average on seq, of shape (batch, nodes, features) 5 | # While taking into account the masking of msk 6 | class AvgReadout(nn.Module): 7 | def __init__(self): 8 | super(AvgReadout, self).__init__() 9 | 10 | def forward(self, seq, msk): 11 | if msk is None: 12 | return torch.mean(seq, 1) 13 | else: 14 | msk = torch.unsqueeze(msk, -1) 15 | return torch.sum(seq * msk, 1) / torch.sum(msk) 16 | 17 | -------------------------------------------------------------------------------- /nodedownstream/model_weight.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class model_weight(nn.Module): 5 | def __init__(self): 6 | super(model_weight, self).__init__() 7 | 8 | 9 | # self.weight_1 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 10 | # self.weight_2 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 11 | # self.weight_3 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 12 | # self.weight_4 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 13 | self.temp = torch.nn.Parameter(torch.Tensor(4,1),requires_grad=True) 14 | 15 | 16 | self.reset_parameters() 17 | 18 | def reset_parameters(self): 19 | 20 | # torch.nn.init.uniform_(self.weight_1, a=0.0, b=5.0) 21 | # torch.nn.init.uniform_(self.weight_2, a=0.0, b=5.0) 22 | # torch.nn.init.uniform_(self.weight_3, a=0.0, b=5.0) 23 | # torch.nn.init.uniform_(self.weight_4, a=0.0, b=5.0) 24 | torch.nn.init.uniform_(self.temp, a=0.0, b=3) 25 | def forward(self, graph_adj,weight_id): 26 | # temp = torch.Tensor([self.weight_1,self.weight_2,self.weight_3,self.weight_4]) 27 | temp = nn.functional.softmax(self.temp,dim =0) 28 | 29 | # size_ = graph_adj.size(0) 30 | # p = [i for i in range(size_)] 31 | # x = torch.tensor([p,p]) 32 | # q = [self.weight for i in range(size_)] 33 | # tt = torch.sparse_coo_tensor(x,q,(size_,size_)).to(graph_adj.device) 34 | # graph_adj = (graph_adj + tt) 35 | graph_adj = 
graph_adj.to(self.temp.device) 36 | # if weight_id == 0: 37 | # graph_adj = (graph_adj*(1.0+temp[0])) 38 | # elif weight_id == 1: 39 | # graph_adj = (graph_adj*temp[1]) 40 | # elif weight_id == 2: 41 | # graph_adj = (graph_adj*temp[2]) 42 | # else: 43 | # graph_adj = (graph_adj*temp[3]) 44 | 45 | 46 | 47 | return graph_adj 48 | -------------------------------------------------------------------------------- /nodedownstream/model_weight_fix.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class model_weight(nn.Module): 5 | def __init__(self): 6 | super(model_weight, self).__init__() 7 | 8 | 9 | # self.weight_1 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 10 | # self.weight_2 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 11 | # self.weight_3 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 12 | # self.weight_4 = torch.nn.Parameter(torch.Tensor(1,1),requires_grad=True) 13 | self.temp = torch.nn.Parameter(torch.Tensor(4,1),requires_grad=True) 14 | 15 | 16 | self.reset_parameters() 17 | 18 | def reset_parameters(self): 19 | 20 | # torch.nn.init.uniform_(self.weight_1, a=0.0, b=5.0) 21 | # torch.nn.init.uniform_(self.weight_2, a=0.0, b=5.0) 22 | # torch.nn.init.uniform_(self.weight_3, a=0.0, b=5.0) 23 | # torch.nn.init.uniform_(self.weight_4, a=0.0, b=5.0) 24 | torch.nn.init.uniform_(self.temp, a=0.0, b=0.1) 25 | def forward(self, graph_adj,weight_id): 26 | # temp = torch.Tensor([self.weight_1,self.weight_2,self.weight_3,self.weight_4]) 27 | temp = nn.functional.softmax(self.temp,dim =0) 28 | 29 | # size_ = graph_adj.size(0) 30 | # p = [i for i in range(size_)] 31 | # x = torch.tensor([p,p]) 32 | # q = [self.weight for i in range(size_)] 33 | # tt = torch.sparse_coo_tensor(x,q,(size_,size_)).to(graph_adj.device) 34 | # graph_adj = (graph_adj + tt) 35 | graph_adj = graph_adj.to(self.temp.device) 36 | if weight_id == 0: 37 | graph_adj = graph_adj 38 | 
# elif weight_id == 1: 39 | # graph_adj = (graph_adj*temp[1]) 40 | # elif weight_id == 2: 41 | # graph_adj = (graph_adj*temp[2]) 42 | # else: 43 | # graph_adj = (graph_adj*temp[3]) 44 | else: 45 | graph_adj = graph_adj*0 46 | 47 | return graph_adj 48 | -------------------------------------------------------------------------------- /nodedownstream/node_finetuning_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | import math 11 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 12 | 13 | class node_finetuning_layer(nn.Module): 14 | def __init__(self, input_dim,output_dim): 15 | super(node_finetuning_layer, self).__init__() 16 | self.linear=torch.nn.Linear(input_dim,output_dim) 17 | #self.dropout=torch.nn.Dropout(0.2) 18 | 19 | 20 | def forward(self,graph_embedding, graph_len): 21 | #not the follows problem 22 | graph_embedding=self.linear(graph_embedding) 23 | #graph_embedding=torch.nn.functional.normalize(graph_embedding,dim=1) 24 | #graph_embedding=F.leaky_relu(graph_embedding,0.2) 25 | result = F.log_softmax(graph_embedding, dim=1) 26 | return result 27 | 28 | 29 | '''class graph_finetuning_layer(nn.Module): 30 | def __init__(self, input_dim,output_dim): 31 | super(graph_finetuning_layer, self).__init__() 32 | self.linear=torch.nn.Linear(input_dim,output_dim) 33 | self.softmax=torch.nn.Softmax(dim=1) 34 | 35 | def forward(self,graph_embedding, graph_len): 36 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 37 | graph_embedding=torch.sum(graph_embedding,dim=1) 38 | graph_embedding=self.linear(graph_embedding) 39 | graph_embedding=F.leaky_relu(graph_embedding) 40 | 
graph_embedding=F.log_softmax(graph_embedding,dim=1) 41 | #graph_embedding=F.softmax(graph_embedding,dim=1) 42 | #graph_embedding=torch.argmax(graph_embedding,dim=1,keepdim=True).float() 43 | #result=self.softmax(F.leaky_relu(graph_embedding)) 44 | #index=result.permute(1,0)[0] 45 | #index=index.unsqueeze(dim=1) 46 | index=torch.argmax(graph_embedding,dim=1,keepdim=True).float() 47 | index.requires_grad_(True) 48 | print(index.requires_grad) 49 | #return result 50 | #return graph_embedding 51 | return index''' -------------------------------------------------------------------------------- /nodedownstream/node_prompt_layer.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | import torch.nn.functional as F 4 | import dgl 5 | import dgl.function as fn 6 | import copy 7 | from functools import partial 8 | from dgl.nn.pytorch.conv import RelGraphConv 9 | from basemodel import GraphAdjModel 10 | from utils import map_activation_str_to_layer, split_and_batchify_graph_feats,GetAdj 11 | 12 | #use prompt to finish step 1 13 | class graph_prompt_layer_mean(nn.Module): 14 | def __init__(self): 15 | super(graph_prompt_layer_mean, self).__init__() 16 | #first assign a weight that does not need to be changed 17 | self.weight= torch.nn.Parameter(torch.Tensor(2, 2)) 18 | def forward(self, graph_embedding, graph_len): 19 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 20 | #prompt: mean 21 | graph_prompt_result=graph_embedding.mean(dim=1) 22 | return graph_prompt_result 23 | 24 | class node_prompt_layer_linear_mean(nn.Module): 25 | def __init__(self,input_dim,output_dim): 26 | super(node_prompt_layer_linear_mean, self).__init__() 27 | #first assign a weight that does not need to be changed 28 | self.linear=torch.nn.Linear(input_dim,output_dim) 29 | 30 | def forward(self, graph_embedding, graph_len): 31 | graph_embedding=self.linear(graph_embedding) 32 | #graph_prompt_result=graph_embedding.mean(dim=1) 33 |
#graph_prompt_result=torch.nn.functional.normalize(graph_prompt_result,dim=1) 34 | return graph_embedding 35 | 36 | class node_prompt_layer_linear_sum(nn.Module): 37 | def __init__(self,input_dim,output_dim): 38 | super(node_prompt_layer_linear_sum, self).__init__() 39 | #first assign a weight that does not need to be changed 40 | self.linear=torch.nn.Linear(input_dim,output_dim) 41 | 42 | def forward(self, graph_embedding, graph_len): 43 | graph_embedding=self.linear(graph_embedding) 44 | #graph_prompt_result=torch.nn.functional.normalize(graph_embedding,dim=1) 45 | return graph_embedding 46 | 47 | 48 | 49 | #sum result is same as mean result 50 | class node_prompt_layer_sum(nn.Module): 51 | def __init__(self): 52 | super(node_prompt_layer_sum, self).__init__() 53 | #first assign a weight that does not need to be changed 54 | self.weight= torch.nn.Parameter(torch.Tensor(2, 2)) 55 | def forward(self, graph_embedding, graph_len): 56 | return graph_embedding 57 | 58 | #7.26 error version: the algorithm is wrong and cannot be explained, but its result may still be useful 59 | '''class graph_prompt_layer_weighted(nn.Module): 60 | def __init__(self,max_n_num): 61 | super(graph_prompt_layer_weighted, self).__init__() 62 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 63 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 64 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 65 | self.max_n_num=max_n_num 66 | self.reset_parameters() 67 | def reset_parameters(self): 68 | torch.nn.init.xavier_uniform_(self.weight) 69 | def forward(self, graph_embedding, graph_len): 70 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 71 | weight = self.weight[0][0:graph_embedding.size(1)] 72 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 73 | temp1 = weight * temp1 74 | graph_embedding=torch.matmul(graph_embedding,temp1) 75 | #prompt: mean 76 | graph_prompt_result=graph_embedding.mean(dim=1) 77 | return graph_prompt_result''' 78 | 79 | 80 | class
graph_prompt_layer_weighted(nn.Module): 81 | def __init__(self,max_n_num): 82 | super(graph_prompt_layer_weighted, self).__init__() 83 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 84 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 85 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 86 | self.max_n_num=max_n_num 87 | self.reset_parameters() 88 | def reset_parameters(self): 89 | torch.nn.init.xavier_uniform_(self.weight) 90 | def forward(self, graph_embedding, graph_len): 91 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 92 | weight = self.weight[0][0:graph_embedding.size(1)] 93 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 94 | temp1 = weight * temp1 95 | temp1 = temp1.permute(0, 2, 1) 96 | graph_embedding=graph_embedding*temp1 97 | #prompt: sum 98 | graph_prompt_result=graph_embedding.sum(dim=1) 99 | return graph_prompt_result 100 | 101 | class node_prompt_layer_feature_weighted_mean(nn.Module): 102 | def __init__(self,input_dim): 103 | super(node_prompt_layer_feature_weighted_mean, self).__init__() 104 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 105 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 106 | self.weight= torch.nn.Parameter(torch.Tensor(1,input_dim)) 107 | self.max_n_num=input_dim 108 | self.reset_parameters() 109 | def reset_parameters(self): 110 | torch.nn.init.xavier_uniform_(self.weight) 111 | def forward(self, graph_embedding, graph_len): 112 | graph_embedding=graph_embedding*self.weight 113 | #prompt: mean 114 | return graph_embedding 115 | 116 | class node_prompt_layer_feature_weighted_sum(nn.Module): 117 | def __init__(self,input_dim): 118 | super(node_prompt_layer_feature_weighted_sum, self).__init__() 119 | #assign a weight for each node while aggregating the nodes' embedding to get graph
embedding 120 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 121 | self.weight= torch.nn.Parameter(torch.Tensor(1,input_dim)) 122 | self.max_n_num=input_dim 123 | self.reset_parameters() 124 | def reset_parameters(self): 125 | torch.nn.init.xavier_uniform_(self.weight) 126 | def forward(self, graph_embedding, graph_len): 127 | graph_embedding=graph_embedding*self.weight 128 | #prompt: mean 129 | return graph_embedding 130 | 131 | class graph_prompt_layer_weighted_matrix(nn.Module): 132 | def __init__(self,max_n_num,input_dim): 133 | super(graph_prompt_layer_weighted_matrix, self).__init__() 134 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 135 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 136 | self.weight= torch.nn.Parameter(torch.Tensor(input_dim,max_n_num)) 137 | self.max_n_num=max_n_num 138 | self.reset_parameters() 139 | def reset_parameters(self): 140 | torch.nn.init.xavier_uniform_(self.weight) 141 | def forward(self, graph_embedding, graph_len): 142 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 143 | weight = self.weight.permute(1, 0)[0:graph_embedding.size(1)] 144 | weight = weight.expand(graph_embedding.size(0), weight.size(0), weight.size(1)) 145 | graph_embedding = graph_embedding * weight 146 | #prompt: sum 147 | graph_prompt_result=graph_embedding.sum(dim=1) 148 | return graph_prompt_result 149 | 150 | class graph_prompt_layer_weighted_linear(nn.Module): 151 | def __init__(self,max_n_num,input_dim,output_dim): 152 | super(graph_prompt_layer_weighted_linear, self).__init__() 153 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 154 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 155 | self.weight= torch.nn.Parameter(torch.Tensor(1,max_n_num)) 156 | self.linear=nn.Linear(input_dim,output_dim) 157 | self.max_n_num=max_n_num 158 | self.reset_parameters() 159 | def
reset_parameters(self): 160 | torch.nn.init.xavier_uniform_(self.weight) 161 | def forward(self, graph_embedding, graph_len): 162 | graph_embedding=self.linear(graph_embedding) 163 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 164 | weight = self.weight[0][0:graph_embedding.size(1)] 165 | temp1 = torch.ones(graph_embedding.size(0), graph_embedding.size(2), graph_embedding.size(1)).to(graph_embedding.device) 166 | temp1 = weight * temp1 167 | temp1 = temp1.permute(0, 2, 1) 168 | graph_embedding=graph_embedding*temp1 169 | #prompt: mean 170 | graph_prompt_result = graph_embedding.mean(dim=1) 171 | return graph_prompt_result 172 | 173 | class graph_prompt_layer_weighted_matrix_linear(nn.Module): 174 | def __init__(self,max_n_num,input_dim,output_dim): 175 | super(graph_prompt_layer_weighted_matrix_linear, self).__init__() 176 | #assign a weight for each node while aggregating the nodes' embedding to get graph embedding 177 | #max_n_num+1 so that the align step in forward works when the graph embedding and the weight are aligned 178 | self.weight= torch.nn.Parameter(torch.Tensor(output_dim,max_n_num)) 179 | self.linear=nn.Linear(input_dim,output_dim) 180 | self.max_n_num=max_n_num 181 | self.reset_parameters() 182 | def reset_parameters(self): 183 | torch.nn.init.xavier_uniform_(self.weight) 184 | def forward(self, graph_embedding, graph_len): 185 | graph_embedding=self.linear(graph_embedding) 186 | graph_embedding=split_and_batchify_graph_feats(graph_embedding, graph_len)[0] 187 | weight = self.weight.permute(1, 0)[0:graph_embedding.size(1)] 188 | weight = weight.expand(graph_embedding.size(0), weight.size(0), weight.size(1)) 189 | graph_embedding = graph_embedding * weight 190 | #prompt: mean 191 | graph_prompt_result=graph_embedding.mean(dim=1) 192 | return graph_prompt_result 193 | -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 |
dgl-cuda11.3==0.9.1 2 | dgllife==0.3.2 3 | fvcore==0.1.5.post20221221 4 | PyYAML==6.0 5 | thop==0.1.1.post2209072238 6 | torch==1.10.1 7 | torchaudio==0.10.1 8 | torchvision==0.11.2 9 | 10 | --------------------------------------------------------------------------------