├── README.md ├── attack_maker ├── generate_metattack.py └── generate_ptb.py ├── clustering.py ├── create_env.sh ├── embedder.py ├── encoder ├── __init__.py └── gnn.py ├── figs ├── overall_architecure.jpg └── overall_architecure.pdf ├── main.py ├── models ├── SPAGCL_link.py ├── SPAGCL_node.py └── __init__.py ├── sh ├── clustering.sh ├── hetero_node.sh ├── link.sh ├── metattack_maker.sh ├── node.sh └── save_emb.sh └── utils ├── __init__.py ├── create_lp_data.py ├── data.py ├── transforms.py └── utils.py /README.md: -------------------------------------------------------------------------------- 1 | # Similarity-Preserving Adversarial Graph Contrastive Learning (SP-AGCL) 2 | 3 |

4 | 5 | 6 | 7 | 8 | 9 |

10 | 
11 | The official source code for [**Similarity-Preserving Adversarial Graph Contrastive Learning**](https://arxiv.org/abs/2306.13854) at KDD 2023.
12 | 
13 | Yeonjun In*, [Kanghoon Yoon*](https://kanghoonyoon.github.io/), and [Chanyoung Park](http://dsail.kaist.ac.kr/professor/)
14 | 
15 | ## Abstract
16 | Adversarial attacks on a graph refer to imperceptible perturbations of the graph structure and node features, and it is well known that GNN models are vulnerable to such attacks. Among various GNN models, graph contrastive learning (GCL) based methods suffer from adversarial attacks in particular, owing to their inherent design, which depends heavily on self-supervision signals derived from the original graph; when the graph is attacked, those signals already contain noise. Existing adversarial GCL methods apply adversarial training (AT) to the GCL framework to address adversarial attacks on graphs. By treating the attacked graph as an augmentation under the GCL framework, they achieve adversarial robustness against graph structural attacks. However, we find that existing adversarially trained GCL methods achieve robustness at the expense of no longer preserving node similarity in terms of the node features, which is an unexpected consequence of applying AT to GCL models. In this paper, we propose a similarity-preserving adversarial graph contrastive learning (SP-AGCL) framework that contrasts the clean graph with two auxiliary views of different properties (i.e., the node similarity-preserving view and the adversarial view). Extensive experiments demonstrate that SP-AGCL achieves competitive performance on several downstream tasks and is effective in various scenarios, e.g., networks with adversarial attacks, noisy labels, and heterophilous neighbors.
17 | 
18 | ## Overall Architecture
19 | 
20 | ![Overall architecture of SP-AGCL](figs/overall_architecure.jpg)
21 | 
22 | 
23 | 
24 | ### Requirements
25 | * Python version: 3.7.11
26 | * PyTorch version: 1.10.2
27 | * torch-geometric version: 2.0.3
28 | * deeprobust
29 | 
30 | ### How to Run
31 | * To run node classification (reproduces Table 1 in the paper and Tables 2 and 3 in the appendix)
32 | 
33 | ```
34 | sh sh/node.sh
35 | ```
36 | * To run link prediction (reproduces Figure 3(b) in the paper)
37 | 
38 | ```
39 | sh sh/link.sh
40 | ```
41 | * To run node clustering (reproduces Figure 3(c) in the paper)
42 | * Run node classification before node clustering, since clustering reuses the embeddings learned there.
43 | 
44 | ```
45 | sh sh/save_emb.sh # save node embeddings from the best node classification model
46 | sh sh/clustering.sh
47 | ```
48 | * To run node classification on heterophilous networks (reproduces Table 2 in the paper)
49 | 
50 | ```
51 | sh sh/hetero_node.sh
52 | ```
53 | 
54 | ### Cite (Bibtex)
55 | - If you find ``SP-AGCL`` useful in your research, please cite the following paper:
56 | - Yeonjun In, Kanghoon Yoon, and Chanyoung Park. "Similarity Preserving Adversarial Graph Contrastive Learning." KDD 2023.
57 | - Bibtex
58 | ```
59 | @article{in2023similarity,
60 |   title={Similarity Preserving Adversarial Graph Contrastive Learning},
61 |   author={In, Yeonjun and Yoon, Kanghoon and Park, Chanyoung},
62 |   journal={arXiv preprint arXiv:2306.13854},
63 |   year={2023}
64 | }
65 | ```
66 | 
67 | 
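For orientation before the source files: `models/SPAGCL_node.py` trains by contrasting two stochastic augmentations of the clean (sub)graph, then adding one contrastive term against the adversarial view and one against the kNN similarity view. A minimal self-contained sketch of that objective follows; the InfoNCE helper mirrors `modeler.semi_loss` in the model files below, the projection head is omitted, and the toy tensors are stand-ins for encoder outputs.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.4):
    # symmetric InfoNCE, mirroring modeler.semi_loss / modeler.loss below
    def semi(a, b):
        sim = lambda u, v: F.normalize(u) @ F.normalize(v).t()
        refl = torch.exp(sim(a, a) / tau)
        betw = torch.exp(sim(a, b) / tau)
        return -torch.log(betw.diag() / (refl.sum(1) + betw.sum(1) - refl.diag()))
    return 0.5 * (semi(z1, z2) + semi(z2, z1)).mean()

# toy stand-ins for the encoder outputs of the four views
n, d = 8, 16
z1, z2 = torch.randn(n, d), torch.randn(n, d)        # two augmentations of the clean graph
z_adv, z_knn = torch.randn(n, d), torch.randn(n, d)  # adversarial view, kNN similarity view

lambda_1, lambda_2 = 2.0, 2.0                        # --lambda_1 / --lambda_2 defaults in main.py
loss = info_nce(z1, z2) * 0.5                        # clean vs. clean augmentation
loss = loss + info_nce(z1, z_adv) * lambda_1 * 0.5   # adversarial (robustness) term
loss = loss + info_nce(z1, z_knn) * lambda_2 * 0.5   # similarity-preserving term
print(loss.item())
```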
-------------------------------------------------------------------------------- /attack_maker/generate_metattack.py: --------------------------------------------------------------------------------
1 | import torch
2 | import os
3 | import sys
4 | import numpy as np
5 | import torch.nn.functional as F
6 | import torch.optim as optim
7 | import scipy.sparse as sp
8 | from deeprobust.graph.defense import GCN
9 | from deeprobust.graph.global_attack import MetaApprox, Metattack
10 | from deeprobust.graph.utils import *
11 | from deeprobust.graph.data import Dataset
12 | import argparse
13 | from utils import get_data, set_everything, set_cuda_device
14 | from torch_geometric.utils import to_dense_adj
15 | 
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('--device', default=6, type=int)
18 | parser.add_argument('--sub_size', default=3000, type=int)
19 | parser.add_argument('--seed', type=int, default=0, help='Random seed.')
20 | parser.add_argument('--lr', type=float, default=0.01,
21 |                     help='Initial learning rate.')
22 | parser.add_argument('--weight_decay', type=float, default=5e-4,
23 |                     help='Weight decay (L2 loss on parameters).')
24 | parser.add_argument('--hidden', type=int, default=16,
25 |                     help='Number of hidden units.')
26 | parser.add_argument('--dropout', type=float, default=0.5,
27 |                     help='Dropout rate (1 - keep probability).')
28 | parser.add_argument('--dataset', type=str, default='cs', choices=['cora', 'citeseer', 'photo', 'computers', 'cs', 'physics'], help='dataset')
29 | parser.add_argument('--ptb_rate', type=float, default=0.05, help='perturbation rate', choices=[0.05, 0.1, 0.15, 0.2, 0.25])
30 | parser.add_argument('--ptb_n', type=int, default=200)
31 | parser.add_argument('--model', type=str, default='Meta-Self', choices=['A-Meta-Self', 'Meta-Self', 'Meta-Train','A-Meta-Train'], help='model variant')
32 | 
33 | args = parser.parse_args()
34 | set_everything(args.seed)
35 | 
36 | save_dict = {}
37 | 
38 | set_cuda_device(args.device)
39 | device = f'cuda:{args.device}'
40 | data_home = f'./dataset/'
41 | data = get_data(data_home, args.dataset, 'meta', 0.0)[0]
42 | adj_np = to_dense_adj(data.edge_index)[0].numpy().astype(np.float32)
43 | adj = sp.csr_matrix(adj_np)
44 | 
45 | features = data.x.numpy().astype(np.float32)
46 | features = sp.csr_matrix(features)
47 | labels = data.y.numpy()
48 | idx_train = data.train_mask[0, :].nonzero().flatten().numpy()
49 | idx_val = data.val_mask[0, :].nonzero().flatten().numpy()
50 | idx_test = data.test_mask[0, :].nonzero().flatten().numpy()
51 | idx_unlabeled = np.union1d(idx_val, idx_test)
52 | 
53 | # for seed in range(int(args.ptb_rate*100)):
54 | 
55 | nodes = torch.randperm(data.x.size(0))[:args.sub_size].sort()[0].numpy()
56 | save_dict['nodes'] = nodes
57 | sub_adj = adj[nodes, :][:, nodes]
58 | sub_x = features[nodes, :]
59 | sub_y = labels[nodes]
60 | 
61 | sub_idx_train = np.sort(np.in1d(nodes, idx_train).nonzero()[0])
62 | sub_idx_val = np.sort(np.in1d(nodes, idx_val).nonzero()[0])
63 | sub_idx_test = np.sort(np.in1d(nodes, idx_test).nonzero()[0])
64 | sub_idx_unlabeled = np.sort(np.in1d(nodes, idx_unlabeled).nonzero()[0])
65 | 
66 | perturbations = args.ptb_n ##int(0.01 * (sub_adj.sum()//2))
67 | sub_adj, sub_x, sub_y = preprocess(sub_adj, sub_x, sub_y, preprocess_adj=False)
68 | 
69 | save_dict['clean'] = sub_adj
70 | # Setup Surrogate Model
71 | surrogate = GCN(nfeat=sub_x.shape[1], nclass=sub_y.max().item()+1, nhid=16,
72 |                 dropout=0.5, with_relu=False, with_bias=True, weight_decay=5e-4, device=device)
73 | 
74 | surrogate = surrogate.to(device)
75 | surrogate.fit(sub_x, sub_adj, sub_y, sub_idx_train)
76 | 
77 | # Setup Attack Model
78 | if 'Self' in args.model:
79 |     lambda_ = 0
80 | if 'Train' in args.model:
81 |     lambda_ = 1
82 | if 'Both' in args.model:
83 |     lambda_ = 0.5
84 | 
85 | if 'A' in args.model:
86 |     model = MetaApprox(model=surrogate, nnodes=sub_adj.shape[0], feature_shape=sub_x.shape, attack_structure=True, attack_features=False, device=device, lambda_=lambda_)
87 | 
88 | else:
89 |     model = Metattack(model=surrogate, nnodes=sub_adj.shape[0], feature_shape=sub_x.shape, attack_structure=True, attack_features=False, device=device, lambda_=lambda_)
90 | 
91 | model = model.to(device)
92 | 
93 | def test(adj):
94 |     ''' test on GCN '''
95 | 
96 |     # adj = normalize_adj_tensor(adj)
97 |     gcn = GCN(nfeat=sub_x.shape[1],
98 |               nhid=args.hidden,
99 |               nclass=sub_y.max().item() + 1,
100 |              dropout=args.dropout, device=device)
101 |     gcn = gcn.to(device)
102 |     gcn.fit(sub_x, adj, sub_y, sub_idx_train) # train on the adj passed in (clean or attacked), without model picking
103 |     # gcn.fit(features, adj, labels, idx_train, idx_val) # train with validation model picking
104 |     output = gcn.output.cpu()
105 |     loss_test = F.nll_loss(output[sub_idx_test], sub_y[sub_idx_test])
106 |     acc_test = accuracy(output[sub_idx_test], sub_y[sub_idx_test])
107 |     print("Test set results:",
108 |           "loss= {:.4f}".format(loss_test.item()),
109 |           "accuracy= {:.4f}".format(acc_test.item()))
110 | 
111 |     return acc_test.item()
112 | 
113 | 
114 | def main():
115 |     model.attack(sub_x, sub_adj, sub_y, sub_idx_train, sub_idx_unlabeled, perturbations, ll_constraint=False)
116 |     print('=== testing GCN on the original (clean) subgraph ===')
117 |     test(sub_adj) # evaluate on the clean subgraph; the full-graph adj would not match the sub_* indices
118 |     modified_adj = model.modified_adj
119 |     save_dict['modified'] = modified_adj
120 |     # modified_features = model.modified_features
121 |     test(modified_adj)
122 | 
123 |     # if you want to save the modified adj/features, uncomment the code below
124 |     # model.save_adj(root='./perturbed_graph', name='{}_meta_adj_{}_{}'.format(args.dataset, args.ptb_rate, args.seed))
125 |     # torch.save(nodes, f'./perturbed_graph/{args.dataset}_meta_{args.ptb_rate}_subnodes_{args.seed}.pt')
126 |     # model.save_features(root='./', name='mod_features')
127 |     torch.save(save_dict, f'./perturbed_graph/{args.dataset}_meta_seed{args.seed}.pt')
128 | if __name__ == '__main__':
129 |     main()
130 | 
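Each run of the script above saves one dict per seed, keyed `'nodes'`, `'clean'`, and `'modified'`. The next file, `generate_ptb.py`, merges the first `idx` of these per-seed files into a single poisoned graph for a requested `ptb_rate`. A small worked example of that file-count arithmetic, using the `dic` table and docstring figures from the next file:

```python
# each generate_metattack.py run saves {'nodes', 'clean', 'modified'} for one seed;
# generate_ptb.py merges the first `idx` such files for a requested ptb_rate
dic = {'photo': 30000 // 300, 'computers': 60000 // 300,
       'cs': 20000 // 200, 'physics': 60000 // 200}   # files needed at ptb_rate=0.25

for ptb_rate in [0.05, 0.1, 0.15, 0.2, 0.25]:
    idx = int(dic['photo'] / (0.25 / ptb_rate))  # number of seed files merged
    print(ptb_rate, idx, idx * 300)              # photo flips 300 edges per file
# -> 20, 40, 60, 80, 100 files, i.e. 6000 ... 30000 edges as in the docstring below
```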
-------------------------------------------------------------------------------- /attack_maker/generate_ptb.py: --------------------------------------------------------------------------------
1 | import torch
2 | import os
3 | import sys
4 | import numpy as np
5 | import torch.nn.functional as F
6 | import torch.optim as optim
7 | import scipy.sparse as sp
8 | from deeprobust.graph.defense import GCN
9 | from deeprobust.graph.global_attack import MetaApprox, Metattack
10 | from deeprobust.graph.utils import *
11 | from deeprobust.graph.data import Dataset
12 | import argparse
13 | from utils import get_data, set_everything, set_cuda_device
14 | from torch_geometric.utils import to_dense_adj, coalesce, to_undirected
15 | import pandas as pd
16 | 
17 | '''
18 | photo     6000, 12000, 18000, 24000, 30000
19 | computers 12000, 24000, 36000, 48000, 60000
20 | cs        4000, 8000, 12000, 16000, 20000
21 | physics   12000, 24000, 36000, 48000, 60000
22 | '''
23 | 
24 | dic = {'photo':30000//300, 'computers':60000//300, 'cs':20000//200, 'physics':60000//200}
25 | 
26 | parser = argparse.ArgumentParser()
27 | parser.add_argument('--dataset', default='computers', type=str)
28 | parser.add_argument('--ptb_rate', default=0.05, type=float)
29 | 
30 | args = parser.parse_args()
31 | 
32 | data_home = f'./dataset/'
33 | data = get_data(data_home, args.dataset, 'meta', 0.0)[0]
34 | adj = to_dense_adj(data.edge_index)[0]
35 | adj_ = adj.clone()
36 | folders = os.listdir('perturbed_graph')
37 | folders = sorted([f for f in folders if args.dataset in f])
38 | idx = int(dic[args.dataset] / (0.25/args.ptb_rate))
39 | adjs = []
40 | for f in folders[:idx]:
41 |     tmp = torch.load(f'perturbed_graph/{f}', map_location='cpu')
42 |     idx2nodes = {i:n for i, n in enumerate(tmp['nodes'])}
43 |     print(f, (tmp['modified'].cpu()>tmp['clean']).sum(), (tmp['modified'].cpu()<tmp['clean']).sum())
-------------------------------------------------------------------------------- /encoder/gnn.py: --------------------------------------------------------------------------------
48 |         assert len(layer_sizes) >= 2
49 |         self.gcn_module = gcn_module
50 |         self.input_size, self.representation_size = layer_sizes[0], layer_sizes[-1]
51 |         self.weight_standardization = weight_standardization
52 | 
53 |         total_layers = []
54 |         for in_dim, out_dim in zip(layer_sizes[:-1], layer_sizes[1:]):
55 |             layers = []
56 |             layers.append((self.gcn_module(in_dim, out_dim), 'x, edge_index -> x'),)
57 | 
58 |             if batchnorm:
59 |                 layers.append(BatchNorm(out_dim, momentum=batchnorm_mm))
60 |             # else:
61 |             #     layers.append(LayerNorm(out_dim))
62 | 
63 |             layers.append(nn.PReLU())
64 |             total_layers.append(Sequential('x, edge_index', layers))
65 | 
66 |         self.model = nn.ModuleList(total_layers)
67 |         # self.model = Sequential('x, edge_index, perturb', layers)
68 | 
69 |     def forward(self, x, adj, perturb=None):
70 |         if self.weight_standardization:
71 |             self.standardize_weights()
72 | 
73 |         for i, layer in enumerate(self.model):
74 |             x = layer(x, adj)
75 |             if perturb is not None and i==0:
76 |                 x += perturb
77 |         return x
78 | 
79 |     def reset_parameters(self):
80 |         for m in self.model:
81 |             m.reset_parameters()
82 | 
83 |     def standardize_weights(self):
84 |         skipped_first_conv = False
85 |         for m in self.model.modules():
86 |             if isinstance(m, self.gcn_module):
87 |                 if not skipped_first_conv:
88 |                     skipped_first_conv = True
89 |                     continue
90 |                 weight = m.lin.weight.data
91 |                 var, mean = torch.var_mean(weight, dim=1, keepdim=True)
92 |                 weight = (weight - mean) / (torch.sqrt(var + 1e-5))
93 |                 m.lin.weight.data = weight
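A minimal sketch of how the model files instantiate this encoder, mirroring the `GCN(GCNLayer, [in_dim] + layers, batchnorm=...)` call in `models/SPAGCL_node.py`. `GCNLayer` is defined in the part of `encoder/gnn.py` not shown above, so its exact behavior is assumed here; the feature dimension and the toy sparse adjacency are likewise illustrative.

```python
import torch
from encoder import GCN, GCNLayer  # GCNLayer comes from the earlier part of encoder/gnn.py

in_dim, layers = 1433, [512, 128]       # e.g. cora features and the --layers default
enc = GCN(GCNLayer, [in_dim] + layers, batchnorm=True)

x = torch.randn(100, in_dim)            # toy node features
adj = torch.eye(100).to_sparse()        # toy sparse adjacency, as the models pass it
z = enc(x, adj)                         # -> [100, 128] node representations
```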
-------------------------------------------------------------------------------- /figs/overall_architecure.jpg: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/yeonjun-in/torch-SP-AGCL/e829366bfaec5306ab7b436d5c2d6feba194e47b/figs/overall_architecure.jpg
-------------------------------------------------------------------------------- /figs/overall_architecure.pdf: --------------------------------------------------------------------------------
https://raw.githubusercontent.com/yeonjun-in/torch-SP-AGCL/e829366bfaec5306ab7b436d5c2d6feba194e47b/figs/overall_architecure.pdf
-------------------------------------------------------------------------------- /main.py: --------------------------------------------------------------------------------
1 | import torch
2 | import argparse
3 | from utils import set_everything
4 | import warnings
5 | warnings.filterwarnings("ignore")
6 | 
7 | def parse_args():
8 |     # input arguments
9 |     set_everything(1995)
10 |     parser = argparse.ArgumentParser()
11 | 
12 |     parser.add_argument('--embedder', default='SPAGCL_node')
13 |     parser.add_argument('--dataset', default='cora', choices=['cora', 'citeseer', 'pubmed', 'photo', 'computers', 'cs', 'physics', 'chameleon', 'squirrel', 'actor', 'texas', 'wisconsin', 'cornell'])
14 |     parser.add_argument('--task', default='node', choices=['clustering', 'node', 'link'])
15 |     parser.add_argument('--attack', type=str, default='meta', choices=['meta', 'nettack', 'random', 'feat_gau', 'feat_bern'])
16 |     parser.add_argument('--attack_type', type=str, default='poison', choices=['poison', 'evasive'])
17 |     if parser.parse_known_args()[0].attack_type in ['poison']:
18 |         parser.add_argument('--ptb_rate', type=float, default=0.0)
19 | 
20 |     parser.add_argument('--seed_n', default=3, type=int)
21 |     parser.add_argument('--epochs', type=int, default=1000)
22 |     parser.add_argument("--layers", nargs='*', type=int, default=[512, 128], help="The number of units of each layer of the GNN. Default is [512, 128]")
23 | 
24 |     parser.add_argument('--lr', type=float, default=0.001)
25 |     parser.add_argument('--wd', type=float, default=1e-5)
26 | 
27 |     parser.add_argument("--save_embed", action='store_true', default=False)
28 | 
29 |     parser.add_argument('--lambda_1', type=float, default=2.0)
30 |     parser.add_argument('--lambda_2', type=float, default=2.0)
31 | 
32 |     parser.add_argument('--d_1', type=float, default=0.3)
33 |     parser.add_argument('--d_2', type=float, default=0.2)
34 |     parser.add_argument('--d_3', type=float, default=0.0)
35 |     parser.add_argument("--bn", action='store_false', default=True)
36 |     parser.add_argument('--warmup', type=int, default=0)
37 |     parser.add_argument('--sub_size', type=int, default=5000)
38 |     parser.add_argument('--add_edge_rate', type=float, default=0.3)
39 |     parser.add_argument('--drop_feat_rate', type=float, default=0.3)
40 |     parser.add_argument('--knn', type=int, default=10)
41 | 
42 |     parser.add_argument('--tau', type=float, default=0.4)
43 |     parser.add_argument('--device', type=int, default=0)
44 |     parser.add_argument('--patience', type=int, default=400)
45 |     parser.add_argument('--verbose', type=int, default=10)
46 | 
47 |     parser.add_argument('--save_dir', type=str, default='./results')
48 |     parser.add_argument('--save_fig', action='store_true', default=True)
49 | 
50 |     return parser.parse_known_args()
51 | 
52 | 
53 | def main():
54 |     args, _ = parse_args()
55 |     args.drop_edge_rate = args.add_edge_rate
56 | 
57 |     assert not (args.attack_type == 'poison' and args.ptb_rate == 0.0) # poisoning runs must specify a nonzero --ptb_rate
58 |     if args.attack_type == 'evasive':
59 |         args.ptb_rate = 0.0
60 |     if '_link' in args.embedder:
61 |         args.task = 'link'
62 | 
63 |     torch.cuda.set_device(args.device)
64 | 
65 |     if args.embedder == 'SPAGCL_node':
66 |         from models import SPAGCL_node
67 |         embedder = SPAGCL_node(args)
68 |     if args.embedder == 'SPAGCL_link':
69 |         from models import SPAGCL_link
70 |         embedder = SPAGCL_link(args)
71 | 
72 |     embedder.training()
73 | 
74 | if __name__ == '__main__':
75 |     main()
76 | 
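One subtlety in `parse_args` above: `--ptb_rate` is only registered when `--attack_type` is `poison`; evasive runs get `args.ptb_rate` only when `main()` assigns it. A minimal standalone illustration of this two-pass conditional-argument pattern (names mirror main.py but the snippet itself is just a sketch):

```python
import argparse

# same two-pass pattern as parse_args in main.py: peek at --attack_type first,
# then register --ptb_rate only for the poison setting
parser = argparse.ArgumentParser()
parser.add_argument('--attack_type', default='poison', choices=['poison', 'evasive'])
if parser.parse_known_args()[0].attack_type == 'poison':
    parser.add_argument('--ptb_rate', type=float, default=0.0)

args, _ = parser.parse_known_args()
print(vars(args))  # {'attack_type': 'poison', 'ptb_rate': 0.0};
                   # under --attack_type evasive there is no ptb_rate key at all
```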
-------------------------------------------------------------------------------- /models/SPAGCL_link.py: --------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch
3 | from torch.optim import AdamW
4 | from embedder import embedder
5 | from encoder import GCN, GCNLayer
6 | from utils import get_graph_drop_transform, set_everything, dense_to_sparse_x, to_dense_subadj, get_data
7 | from copy import deepcopy
8 | from collections import defaultdict
9 | from torch_geometric.utils import to_undirected, to_dense_adj, dense_to_sparse, subgraph, add_self_loops, coalesce
10 | import torch.nn.functional as F
11 | from torch_geometric.data import Data
12 | import networkx as nx
13 | 
14 | from utils.utils import to_dense_subadj
15 | 
16 | class SPAGCL_link(embedder):
17 |     def __init__(self, args):
18 |         embedder.__init__(self, args)
19 |         self.args = args
20 | 
21 |     def attack_adj(self, x1, x2, n_edge):
22 |         n_nodes = len(x1.x)
23 |         add_edge_num = int(self.args.add_edge_rate * n_edge)
24 |         drop_edge_num = int(self.args.drop_edge_rate * n_edge)
25 |         grad_sum = x1.edge_adj.grad + x2.edge_adj.grad
26 |         grad_sum_1d = grad_sum.view(-1)
27 |         values, indices = grad_sum_1d.sort()
28 |         add_idx, drop_idx = indices[-add_edge_num:], indices[:drop_edge_num]
29 | 
30 |         add_idx_dense = torch.stack([add_idx // n_nodes, add_idx % n_nodes])
31 |         drop_idx_dense = torch.stack([drop_idx // n_nodes, drop_idx % n_nodes])
32 | 
33 |         add = to_dense_adj(add_idx_dense, max_num_nodes=n_nodes)[0]
34 |         drop = to_dense_adj(drop_idx_dense, max_num_nodes=n_nodes)[0]
35 |         return add, 1-drop
36 | 
37 |     def attack_feat(self, x1, x2):
38 |         n_nodes, n_dim = x1.x.size()
39 |         drop_feat_num = int((n_dim * self.args.drop_feat_rate) * n_nodes)
40 |         grad_sum = x1.x.grad + x2.x.grad
41 |         grad_sum_1d = grad_sum.view(-1)
42 |         values, indices = grad_sum_1d.sort()
43 | 
44 |         drop_idx = indices[:drop_feat_num]
45 | 
46 |         drop_idx_dense = torch.stack([drop_idx // n_dim, drop_idx % n_dim])
47 | 
48 |         drop_sparse = dense_to_sparse_x(drop_idx_dense, n_nodes, n_dim)
49 |         return 1-drop_sparse.to_dense()
50 | 
51 |     def training(self):
52 | 
53 |         self.train_result, self.val_result, self.test_result = defaultdict(list), defaultdict(list), defaultdict(list)
54 |         self.best_epochs = []
55 |         for seed in range(self.args.seed_n):
56 |             self.seed = seed
57 |             set_everything(seed)
58 | 
59 |             data = self.data.clone()
60 | 
61 |             link_data = torch.load(f'dataset/link/{self.args.dataset}_link.pt')
62 |             train_pos, train_neg, train_label = link_data['train_edges'].T, link_data['train_edges_neg'].T, link_data['train_label'].T
63 |             test_pos, test_neg, test_label = link_data['test_edges'].T, link_data['test_edges_neg'].T, link_data['test_label'].T
64 | 
65 |             edge_index = train_pos.clone()
66 |             train_edge = torch.cat((train_pos, train_neg), dim=1)
67 |             test_edge = torch.cat((test_pos, test_neg), dim=1)
68 | 
69 |             if self.args.ptb_rate > 0.0:
70 |                 clean_data = get_data(self.data_home, self.args.dataset, self.args.attack, 0.0)[0]
71 |                 clean_adj = to_dense_adj(clean_data.edge_index)[0]
72 |                 ptb_adj = to_dense_adj(self.data.edge_index)[0]
73 |                 ptb_edge = (ptb_adj > clean_adj).nonzero().T
74 | 
75 |                 edge_index = coalesce(torch.cat((edge_index, ptb_edge), dim=1))
76 |                 train_edge = torch.cat((train_edge, ptb_edge), dim=1)
77 |                 train_label = torch.cat((train_label, torch.ones(ptb_edge.size(1))))
78 | 
79 |             data.edge_index = edge_index
80 |             data.train_edge_index, data.val_edge_index, data.test_edge_index = train_edge, test_edge, test_edge
81 |             data.train_label, data.val_label, data.test_label = train_label, test_label, test_label
82 | 
83 |             knn_data = Data()
84 |             sim = F.normalize(data.x).mm(F.normalize(data.x).T).fill_diagonal_(0.0)
85 |             dst = sim.topk(self.args.knn, 1)[1]
86 |             src = torch.arange(data.x.size(0)).unsqueeze(1).expand_as(sim.topk(self.args.knn, 1)[1])
87 |             edge_index = torch.stack([src.reshape(-1), dst.reshape(-1)])
88 |             edge_index = to_undirected(edge_index)
89 |             knn_data.x = deepcopy(data.x)
90 |             knn_data.edge_index = edge_index
91 |             data = data.cuda()
92 |             knn_data = knn_data.cuda()
93 | 
94 |             data.edge_adj = 
to_dense_adj(data.edge_index, max_num_nodes=data.x.size(0))[0].to_sparse() 95 | 96 | transform_1 = get_graph_drop_transform(drop_edge_p=self.args.d_1, drop_feat_p=self.args.d_1) 97 | transform_2 = get_graph_drop_transform(drop_edge_p=self.args.d_2, drop_feat_p=self.args.d_2) 98 | transform_3 = get_graph_drop_transform(drop_edge_p=self.args.d_3, drop_feat_p=self.args.d_3) 99 | 100 | self.encoder = GCN(GCNLayer, [self.args.in_dim] + self.args.layers, batchnorm=self.args.bn) # 512, 256, 128 101 | self.model = modeler(self.encoder, self.args.layers[-1], self.args.layers[-1], self.args.tau).cuda() 102 | self.optimizer = AdamW(self.model.parameters(), lr=self.args.lr, weight_decay=self.args.wd) 103 | 104 | best, best_epochs, cnt_wait = 0, 0, 0 105 | for epoch in range(1, self.args.epochs+1): 106 | 107 | sub1, sub2 = self.subgraph_sampling(data, knn_data) 108 | self.model.train() 109 | self.optimizer.zero_grad() 110 | 111 | x1, x2, x_knn, x_adv = transform_1(sub1), transform_2(sub1), transform_3(sub2), deepcopy(sub1) 112 | x1.edge_adj, x2.edge_adj = to_dense_subadj(x1.edge_index, self.sample_size), to_dense_subadj(x2.edge_index, self.sample_size) 113 | x_knn.edge_adj, x_adv.edge_adj = to_dense_subadj(x_knn.edge_index, self.sample_size), to_dense_subadj(x_adv.edge_index, self.sample_size) 114 | 115 | if epoch > self.args.warmup: 116 | x1.edge_adj = x1.edge_adj.requires_grad_() 117 | x2.edge_adj = x2.edge_adj.requires_grad_() 118 | 119 | x1.x = x1.x.requires_grad_() 120 | x2.x = x2.x.requires_grad_() 121 | 122 | z1 = self.model(x1.x, x1.edge_adj) 123 | z2 = self.model(x2.x, x2.edge_adj) 124 | loss = self.model.loss(z1, z2, batch_size=0) 125 | 126 | loss.backward() 127 | 128 | if epoch > self.args.warmup: 129 | n_edge = int(x1.edge_adj.sum().item()) 130 | add_edge, masking_edge = self.attack_adj(x1, x2, n_edge=n_edge) 131 | masking_feat = self.attack_feat(x1, x2) 132 | 133 | x1.edge_adj, x2.edge_adj = x1.edge_adj.detach(), x2.edge_adj.detach() 134 | x_adv.edge_adj = ((x1.edge_adj*masking_edge) + add_edge*1.0).clamp(0, 1).detach() 135 | 136 | x1.x, x2.x = x1.x.detach(), x2.x.detach() 137 | x_adv.x = (x1.x*masking_feat).detach() 138 | 139 | x_knn.x = x_knn.x.detach() 140 | x_knn.edge_adj = x_knn.edge_adj.detach() 141 | 142 | z1 = self.model(x1.x, x1.edge_adj.to_sparse()) 143 | z2 = self.model(x2.x, x2.edge_adj.to_sparse()) 144 | z_adv = self.model(x_adv.x, x_adv.edge_adj.to_sparse()) 145 | z_knn = self.model(x_knn.x, x_knn.edge_adj.to_sparse()) 146 | 147 | loss = self.model.loss(z1, z2, batch_size=0)*0.5 148 | loss += self.model.loss(z1, z_knn, batch_size=0) 149 | loss += self.model.loss(z1, z_adv, batch_size=0) 150 | 151 | self.optimizer.zero_grad() 152 | loss.backward() 153 | 154 | self.optimizer.step() 155 | 156 | print(f'Epoch {epoch}: Loss {loss.item()}') 157 | 158 | if epoch % self.args.verbose == 0: 159 | _, val_acc, _ = self.verbose_link(data) 160 | if val_acc > best: 161 | best = val_acc 162 | cnt_wait = 0 163 | best_epochs = epoch 164 | torch.save(self.model.online_encoder.state_dict(), '{}/saved_model/best_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, self.args.attack_type, self.args.embedder, seed)) 165 | else: 166 | cnt_wait += self.args.verbose 167 | 168 | if cnt_wait == self.args.patience: 169 | print('Early stopping!') 170 | break 171 | 172 | self.best_epochs.append(best_epochs) 173 | self.model.online_encoder.load_state_dict(torch.load('{}/saved_model/best_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, 
self.args.attack_type, self.args.embedder, seed), map_location=f'cuda:{self.args.device}')) 174 | only_clean = True if self.args.dataset in ['photo', 'computers', 'cs', 'physics', 'amz', 'amz2', 'squirrel', 'chameleon',] else False 175 | self.eval_link(data) 176 | 177 | self.summary_result() 178 | 179 | def subgraph_sampling(self, data1, data2): 180 | self.sample_size = min(self.args.sub_size, self.args.n_node) 181 | nodes = torch.randperm(data1.x.size(0))[:self.sample_size].sort()[0] 182 | edge1, edge2 = add_self_loops(data1.edge_index, num_nodes=data1.x.size(0))[0], add_self_loops(data2.edge_index, num_nodes=data1.x.size(0))[0] 183 | edge1 = subgraph(subset=nodes, edge_index=edge1, relabel_nodes=True)[0] 184 | edge2 = subgraph(subset=nodes, edge_index=edge2, relabel_nodes=True)[0] 185 | 186 | tmp1, tmp2 = Data(), Data() 187 | tmp1.x, tmp2.x = data1.x[nodes], data2.x[nodes] 188 | tmp1.edge_index, tmp2.edge_index = edge1, edge2 189 | 190 | return tmp1, tmp2 191 | 192 | class modeler(torch.nn.Module): 193 | def __init__(self, encoder, num_hidden: int, num_proj_hidden: int, 194 | tau: float = 0.5): 195 | super(modeler, self).__init__() 196 | self.online_encoder = encoder 197 | self.tau: float = tau 198 | 199 | self.fc1 = torch.nn.Linear(num_hidden, num_proj_hidden) 200 | self.fc2 = torch.nn.Linear(num_proj_hidden, num_hidden) 201 | 202 | def forward(self, x: torch.Tensor, 203 | edge_index: torch.Tensor) -> torch.Tensor: 204 | return self.online_encoder(x, edge_index) 205 | 206 | def projection(self, z: torch.Tensor) -> torch.Tensor: 207 | z = F.elu(self.fc1(z)) 208 | return self.fc2(z) 209 | 210 | def sim(self, z1: torch.Tensor, z2: torch.Tensor): 211 | z1 = F.normalize(z1) 212 | z2 = F.normalize(z2) 213 | return torch.mm(z1, z2.t()) 214 | 215 | def semi_loss(self, z1: torch.Tensor, z2: torch.Tensor): 216 | f = lambda x: torch.exp(x / self.tau) 217 | refl_sim = f(self.sim(z1, z1)) 218 | between_sim = f(self.sim(z1, z2)) 219 | 220 | return -torch.log( 221 | between_sim.diag() 222 | / (refl_sim.sum(1) + between_sim.sum(1) - refl_sim.diag())) 223 | 224 | def batched_semi_loss(self, z1: torch.Tensor, z2: torch.Tensor, 225 | batch_size: int): 226 | # Space complexity: O(BN) (semi_loss: O(N^2)) 227 | device = z1.device 228 | num_nodes = z1.size(0) 229 | num_batches = (num_nodes - 1) // batch_size + 1 230 | f = lambda x: torch.exp(x / self.tau) 231 | indices = torch.arange(0, num_nodes).to(device) 232 | losses = [] 233 | 234 | for i in range(num_batches): 235 | mask = indices[i * batch_size:(i + 1) * batch_size] 236 | refl_sim = f(self.sim(z1[mask], z1)) # [B, N] 237 | between_sim = f(self.sim(z1[mask], z2)) # [B, N] 238 | 239 | losses.append(-torch.log( 240 | between_sim[:, i * batch_size:(i + 1) * batch_size].diag() 241 | / (refl_sim.sum(1) + between_sim.sum(1) 242 | - refl_sim[:, i * batch_size:(i + 1) * batch_size].diag()))) 243 | 244 | return torch.cat(losses) 245 | 246 | def loss(self, z1: torch.Tensor, z2: torch.Tensor, 247 | mean: bool = True, batch_size: int = 0): 248 | h1 = self.projection(z1) 249 | h2 = self.projection(z2) 250 | 251 | if batch_size == 0: 252 | l1 = self.semi_loss(h1, h2) 253 | l2 = self.semi_loss(h2, h1) 254 | else: 255 | l1 = self.batched_semi_loss(h1, h2, batch_size) 256 | l2 = self.batched_semi_loss(h2, h1, batch_size) 257 | 258 | ret = (l1 + l2) * 0.5 259 | ret = ret.mean() if mean else ret.sum() 260 | 261 | return ret 262 | -------------------------------------------------------------------------------- /models/SPAGCL_node.py: 
-------------------------------------------------------------------------------- 1 | import torch 2 | from torch.optim import AdamW 3 | from embedder import embedder 4 | from encoder import GCN, GCNLayer 5 | from utils import get_graph_drop_transform, set_everything, dense_to_sparse_x, to_dense_subadj 6 | from copy import deepcopy 7 | from collections import defaultdict 8 | from torch_geometric.utils import to_undirected, to_dense_adj, dense_to_sparse, subgraph, add_self_loops 9 | import torch.nn.functional as F 10 | from torch_geometric.data import Data 11 | from utils.utils import to_dense_subadj 12 | 13 | class SPAGCL_node(embedder): 14 | def __init__(self, args): 15 | embedder.__init__(self, args) 16 | self.args = args 17 | 18 | def attack_adj(self, x1, x2, n_edge): 19 | n_nodes = len(x1.x) 20 | add_edge_num = int(self.args.add_edge_rate * n_edge) 21 | drop_edge_num = int(self.args.drop_edge_rate * n_edge) 22 | grad_sum = x1.edge_adj.grad + x2.edge_adj.grad 23 | grad_sum_1d = grad_sum.view(-1) 24 | values, indices = grad_sum_1d.sort() 25 | add_idx, drop_idx = indices[-add_edge_num:], indices[:drop_edge_num] 26 | 27 | add_idx_dense = torch.stack([add_idx // n_nodes, add_idx % n_nodes]) 28 | drop_idx_dense = torch.stack([drop_idx // n_nodes, drop_idx % n_nodes]) 29 | 30 | add = to_dense_adj(add_idx_dense, max_num_nodes=n_nodes)[0] 31 | drop = to_dense_adj(drop_idx_dense, max_num_nodes=n_nodes)[0] 32 | return add, 1-drop 33 | 34 | def attack_feat(self, x1, x2): 35 | n_nodes, n_dim = x1.x.size() 36 | drop_feat_num = int((n_dim * self.args.drop_feat_rate) * n_nodes) 37 | grad_sum = x1.x.grad + x2.x.grad 38 | grad_sum_1d = grad_sum.view(-1) 39 | values, indices = grad_sum_1d.sort() 40 | 41 | drop_idx = indices[:drop_feat_num] 42 | 43 | drop_idx_dense = torch.stack([drop_idx // n_dim, drop_idx % n_dim]) 44 | 45 | drop_sparse = dense_to_sparse_x(drop_idx_dense, n_nodes, n_dim) 46 | return 1-drop_sparse.to_dense() 47 | 48 | def subgraph_sampling(self, data1, data2): 49 | self.sample_size = min(self.args.sub_size, self.args.n_node) 50 | nodes = torch.randperm(data1.x.size(0))[:self.sample_size].sort()[0] 51 | edge1, edge2 = add_self_loops(data1.edge_index, num_nodes=data1.x.size(0))[0], add_self_loops(data2.edge_index, num_nodes=data1.x.size(0))[0] 52 | edge1 = subgraph(subset=nodes, edge_index=edge1, relabel_nodes=True)[0] 53 | edge2 = subgraph(subset=nodes, edge_index=edge2, relabel_nodes=True)[0] 54 | 55 | tmp1, tmp2 = Data(), Data() 56 | tmp1.x, tmp2.x = data1.x[nodes], data2.x[nodes] 57 | tmp1.edge_index, tmp2.edge_index = edge1, edge2 58 | 59 | return tmp1, tmp2 60 | 61 | def training(self): 62 | 63 | self.train_result, self.val_result, self.test_result = defaultdict(list), defaultdict(list), defaultdict(list) 64 | for seed in range(self.args.seed_n): 65 | self.seed = seed 66 | set_everything(seed) 67 | 68 | data = self.data.clone() 69 | 70 | knn_data = Data() 71 | sim = F.normalize(data.x).mm(F.normalize(data.x).T).fill_diagonal_(0.0) 72 | dst = sim.topk(self.args.knn, 1)[1] 73 | src = torch.arange(data.x.size(0)).unsqueeze(1).expand_as(sim.topk(self.args.knn, 1)[1]) 74 | edge_index = torch.stack([src.reshape(-1), dst.reshape(-1)]) 75 | edge_index = to_undirected(edge_index) 76 | knn_data.x = deepcopy(data.x) 77 | knn_data.edge_index = edge_index 78 | data = data.cuda() 79 | knn_data = knn_data.cuda() 80 | 81 | data.edge_adj = to_dense_adj(data.edge_index, max_num_nodes=data.x.size(0))[0].to_sparse() 82 | 83 | transform_1 = get_graph_drop_transform(drop_edge_p=self.args.d_1, 
drop_feat_p=self.args.d_1) 84 | transform_2 = get_graph_drop_transform(drop_edge_p=self.args.d_2, drop_feat_p=self.args.d_2) 85 | transform_3 = get_graph_drop_transform(drop_edge_p=self.args.d_3, drop_feat_p=self.args.d_3) 86 | 87 | self.encoder = GCN(GCNLayer, [self.args.in_dim] + self.args.layers, batchnorm=self.args.bn) # 512, 256, 128 88 | self.model = modeler(self.encoder, self.args.layers[-1], self.args.layers[-1], self.args.tau).cuda() 89 | self.optimizer = AdamW(self.model.parameters(), lr=self.args.lr, weight_decay=self.args.wd) 90 | 91 | best, cnt_wait = 0, 0 92 | for epoch in range(1, self.args.epochs+1): 93 | 94 | sub1, sub2 = self.subgraph_sampling(data, knn_data) 95 | self.model.train() 96 | self.optimizer.zero_grad() 97 | 98 | x1, x2, x_knn, x_adv = transform_1(sub1), transform_2(sub1), transform_3(sub2), deepcopy(sub1) 99 | x1.edge_adj, x2.edge_adj = to_dense_subadj(x1.edge_index, self.sample_size), to_dense_subadj(x2.edge_index, self.sample_size) 100 | x_knn.edge_adj, x_adv.edge_adj = to_dense_subadj(x_knn.edge_index, self.sample_size), to_dense_subadj(x_adv.edge_index, self.sample_size) 101 | 102 | if epoch > self.args.warmup: 103 | x1.edge_adj = x1.edge_adj.requires_grad_() 104 | x2.edge_adj = x2.edge_adj.requires_grad_() 105 | 106 | x1.x = x1.x.requires_grad_() 107 | x2.x = x2.x.requires_grad_() 108 | 109 | z1 = self.model(x1.x, x1.edge_adj) 110 | z2 = self.model(x2.x, x2.edge_adj) 111 | loss = self.model.loss(z1, z2, batch_size=0) 112 | 113 | loss.backward() 114 | 115 | if epoch > self.args.warmup: 116 | n_edge = int(x1.edge_adj.sum().item()) 117 | add_edge, masking_edge = self.attack_adj(x1, x2, n_edge=n_edge) 118 | masking_feat = self.attack_feat(x1, x2) 119 | 120 | x1.edge_adj, x2.edge_adj = x1.edge_adj.detach(), x2.edge_adj.detach() 121 | x_adv.edge_adj = ((x1.edge_adj*masking_edge) + add_edge*1.0).clamp(0, 1).detach() 122 | 123 | x1.x, x2.x = x1.x.detach(), x2.x.detach() 124 | x_adv.x = (x1.x*masking_feat).detach() 125 | 126 | x_knn.x = x_knn.x.detach() 127 | x_knn.edge_adj = x_knn.edge_adj.detach() 128 | 129 | z1 = self.model(x1.x, x1.edge_adj.to_sparse()) 130 | z2 = self.model(x2.x, x2.edge_adj.to_sparse()) 131 | z_adv = self.model(x_adv.x, x_adv.edge_adj.to_sparse()) 132 | z_knn = self.model(x_knn.x, x_knn.edge_adj.to_sparse()) 133 | loss = self.model.loss(z1, z2, batch_size=0)*0.5 134 | loss += self.model.loss(z1, z_adv, batch_size=0)*self.args.lambda_1*0.5 135 | loss += self.model.loss(z1, z_knn, batch_size=0)*self.args.lambda_2*0.5 136 | # print(self.args.lambda_1*0.5, self.args.lambda_2*0.5) 137 | self.optimizer.zero_grad() 138 | loss.backward() 139 | 140 | self.optimizer.step() 141 | 142 | print(f'Epoch {epoch}: Loss {loss.item()}') 143 | 144 | if epoch % self.args.verbose == 0: 145 | val_acc = self.verbose(data) 146 | if val_acc > best: 147 | best = val_acc 148 | cnt_wait = 0 149 | torch.save(self.model.online_encoder.state_dict(), '{}/saved_model/best_{}_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, self.args.ptb_rate, self.args.attack_type, self.args.embedder, seed)) 150 | else: 151 | cnt_wait += self.args.verbose 152 | 153 | if cnt_wait == self.args.patience: 154 | print('Early stopping!') 155 | break 156 | 157 | self.model.online_encoder.load_state_dict(torch.load('{}/saved_model/best_{}_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, self.args.ptb_rate, self.args.attack_type, self.args.embedder, seed), map_location=f'cuda:{self.args.device}')) 158 | if 
self.args.save_embed: 159 | self.get_embeddings(data) 160 | only_clean = True if self.args.dataset in ['squirrel', 'chameleon', 'texas', 'wisconsin', 'cornell', 'actor'] else False 161 | if self.args.task == 'node': 162 | if self.args.attack_type == 'evasive': 163 | self.eval_clean_and_evasive(data, only_clean) 164 | elif self.args.attack_type == 'poison': 165 | self.eval_poisoning(data) 166 | 167 | self.summary_result() 168 | 169 | 170 | class modeler(torch.nn.Module): 171 | def __init__(self, encoder, num_hidden: int, num_proj_hidden: int, 172 | tau: float = 0.5): 173 | super(modeler, self).__init__() 174 | self.online_encoder = encoder 175 | self.tau: float = tau 176 | 177 | self.fc1 = torch.nn.Linear(num_hidden, num_proj_hidden) 178 | self.fc2 = torch.nn.Linear(num_proj_hidden, num_hidden) 179 | 180 | def forward(self, x: torch.Tensor, 181 | edge_index: torch.Tensor) -> torch.Tensor: 182 | return self.online_encoder(x, edge_index) 183 | 184 | def projection(self, z: torch.Tensor) -> torch.Tensor: 185 | z = F.elu(self.fc1(z)) 186 | return self.fc2(z) 187 | 188 | def sim(self, z1: torch.Tensor, z2: torch.Tensor): 189 | z1 = F.normalize(z1) 190 | z2 = F.normalize(z2) 191 | return torch.mm(z1, z2.t()) 192 | 193 | def semi_loss(self, z1: torch.Tensor, z2: torch.Tensor): 194 | f = lambda x: torch.exp(x / self.tau) 195 | refl_sim = f(self.sim(z1, z1)) 196 | between_sim = f(self.sim(z1, z2)) 197 | 198 | return -torch.log( 199 | between_sim.diag() 200 | / (refl_sim.sum(1) + between_sim.sum(1) - refl_sim.diag())) 201 | 202 | def batched_semi_loss(self, z1: torch.Tensor, z2: torch.Tensor, 203 | batch_size: int): 204 | # Space complexity: O(BN) (semi_loss: O(N^2)) 205 | device = z1.device 206 | num_nodes = z1.size(0) 207 | num_batches = (num_nodes - 1) // batch_size + 1 208 | f = lambda x: torch.exp(x / self.tau) 209 | indices = torch.arange(0, num_nodes).to(device) 210 | losses = [] 211 | 212 | for i in range(num_batches): 213 | mask = indices[i * batch_size:(i + 1) * batch_size] 214 | refl_sim = f(self.sim(z1[mask], z1)) # [B, N] 215 | between_sim = f(self.sim(z1[mask], z2)) # [B, N] 216 | 217 | losses.append(-torch.log( 218 | between_sim[:, i * batch_size:(i + 1) * batch_size].diag() 219 | / (refl_sim.sum(1) + between_sim.sum(1) 220 | - refl_sim[:, i * batch_size:(i + 1) * batch_size].diag()))) 221 | 222 | return torch.cat(losses) 223 | 224 | def loss(self, z1: torch.Tensor, z2: torch.Tensor, 225 | mean: bool = True, batch_size: int = 0): 226 | h1 = self.projection(z1) 227 | h2 = self.projection(z2) 228 | 229 | if batch_size == 0: 230 | l1 = self.semi_loss(h1, h2) 231 | l2 = self.semi_loss(h2, h1) 232 | else: 233 | l1 = self.batched_semi_loss(h1, h2, batch_size) 234 | l2 = self.batched_semi_loss(h2, h1, batch_size) 235 | 236 | ret = (l1 + l2) * 0.5 237 | ret = ret.mean() if mean else ret.sum() 238 | 239 | return ret 240 | -------------------------------------------------------------------------------- /models/__init__.py: -------------------------------------------------------------------------------- 1 | from .SPAGCL_node import SPAGCL_node 2 | from .SPAGCL_link import SPAGCL_link 3 | -------------------------------------------------------------------------------- /sh/clustering.sh: -------------------------------------------------------------------------------- 1 | for embedder in SPAGCL_node 2 | do 3 | for dataset in pubmed cs 4 | do 5 | for ptb_rate in 0.0 0.05 0.1 0.15 0.2 0.25 6 | do 7 | python clustering.py --embedder $embedder --dataset $dataset --task clustering --attack meta 
--ptb_rate $ptb_rate 8 | done 9 | done 10 | done 11 | -------------------------------------------------------------------------------- /sh/hetero_node.sh: -------------------------------------------------------------------------------- 1 | ########## general ############ 2 | device=0 3 | seed_n=10 4 | epochs=1000 5 | embedder=SPAGCL_node 6 | 7 | ## chameleon 8 | attack=random 9 | dataset=chameleon 10 | n_subgraph=3000 11 | lr=0.01 12 | wd=0.00001 13 | d_1=0.3 14 | d_2=0.2 15 | d_3=0.0 16 | add_edge_rate=0.3 17 | drop_feat_rate=0.1 18 | knn=10 19 | tau=0.4 20 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 21 | 22 | 23 | ## squirrel 24 | attack=random 25 | dataset=squirrel 26 | n_subgraph=3000 27 | lr=0.01 28 | wd=0.00001 29 | d_1=0.3 30 | d_2=0.2 31 | d_3=0.0 32 | add_edge_rate=0.1 33 | drop_feat_rate=0.3 34 | knn=10 35 | tau=0.4 36 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 37 | 38 | 39 | # Actor 40 | attack=random 41 | dataset=actor 42 | n_subgraph=3000 43 | lr=0.01 44 | wd=0.00001 45 | d_1=0.1 46 | d_2=0.1 47 | d_3=0.0 48 | add_edge_rate=0.3 49 | drop_feat_rate=0.3 50 | knn=10 51 | tau=0.4 52 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 53 | 54 | ## texas 55 | attack=random 56 | dataset=texas 57 | n_subgraph=3000 58 | lr=0.05 59 | wd=0.00001 60 | d_1=0.5 61 | d_2=0.5 62 | d_3=0.0 63 | add_edge_rate=0.7 64 | drop_feat_rate=0.9 65 | knn=10 66 | tau=0.4 67 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 68 | 69 | 70 | ## cornell 71 | attack=random 72 | dataset=cornell 73 | n_subgraph=3000 74 | lr=0.05 75 | wd=0.00001 76 | d_1=0.4 77 | d_2=0.3 78 | d_3=0.0 79 | add_edge_rate=0.7 80 | drop_feat_rate=0.5 81 | knn=10 82 | tau=0.4 83 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 84 | 85 | ## wisconsin 86 | attack=random 87 | dataset=wisconsin 88 | n_subgraph=3000 89 | lr=0.05 90 | wd=0.00001 91 | d_1=0.2 92 | d_2=0.4 93 | d_3=0.0 94 | add_edge_rate=0.7 95 | drop_feat_rate=0.0 96 | knn=10 97 | tau=0.2 98 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 
--d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 99 | -------------------------------------------------------------------------------- /sh/link.sh: -------------------------------------------------------------------------------- 1 | ########## general ############ 2 | device=1 3 | seed_n=3 4 | epochs=1000 5 | embedder=SPAGCL_link 6 | verbose=100 7 | 8 | ########## cora ############## 9 | dataset=cora 10 | attack=meta 11 | n_subgraph=3000 12 | lr=0.005 13 | wd=0.01 14 | d_1=0.2 15 | d_2=0.3 16 | d_3=0.0 17 | add_edge_rate=0.5 18 | drop_feat_rate=0.7 19 | knn=10 20 | tau=0.4 21 | python main.py --embedder $embedder --dataset $dataset --task link --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 22 | for p in 0.05 0.1 0.15 0.2 0.25 23 | do 24 | python main.py --embedder $embedder --dataset $dataset --task link --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 25 | done 26 | 27 | ########## citeseer ############## 28 | dataset=citeseer 29 | attack=meta 30 | n_subgraph=3000 31 | lr=0.01 32 | wd=0.00001 33 | d_1=0.2 34 | d_2=0.1 35 | d_3=0.0 36 | add_edge_rate=0.1 37 | drop_feat_rate=0.9 38 | knn=10 39 | tau=0.6 40 | python main.py --embedder $embedder --dataset $dataset --task link --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 41 | for p in 0.05 0.1 0.15 0.2 0.25 42 | do 43 | python main.py --embedder $embedder --dataset $dataset --task link --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 44 | done 45 | -------------------------------------------------------------------------------- /sh/metattack_maker.sh: -------------------------------------------------------------------------------- 1 | # photo 2 | dataset=photo 3 | sub_size=3000 4 | for ((i=1; i<=100; i++)) 5 | do 6 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 6 --sub_size 3000 --ptb_n 300 --seed $i 7 | done 8 | 9 | # computers 10 | dataset=computers 11 | sub_size=3000 12 | for ((i=91; i<=200; i++)) 13 | do 14 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 7 --sub_size 3000 --ptb_n 300 --seed $i 15 | done 16 | 17 | # cs 18 | dataset=cs 19 | sub_size=3000 20 | for ((i=1; i<=100; i++)) 21 | do 22 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 7 --sub_size 3000 --ptb_n 200 --seed $i 23 | done 24 | 25 | # physics 26 | dataset=physics 27 | sub_size=3000 28 | for ((i=81; i<=300; i++)) 29 | do 30 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 6 --sub_size 3000 --ptb_n 200 --seed $i 31 | done 32 | -------------------------------------------------------------------------------- /sh/node.sh: -------------------------------------------------------------------------------- 1 | ########## general ############ 2 | 
device=1 3 | seed_n=10 4 | epochs=1000 5 | embedder=SPAGCL_node 6 | 7 | ########## cora ############## 8 | dataset=cora 9 | attack=meta 10 | n_subgraph=3000 11 | lr=0.005 12 | wd=0.01 13 | d_1=0.2 14 | d_2=0.3 15 | d_3=0.0 16 | add_edge_rate=0.5 17 | drop_feat_rate=0.7 18 | knn=10 19 | tau=0.4 20 | l1=5.0 21 | l2=3.0 22 | python main.py --embedder $embedder --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2 23 | 24 | l1=5.0 25 | l2=3.0 26 | p=0.05 27 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2 28 | 29 | l1=4.0 30 | l2=4.0 31 | p=0.1 32 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2 33 | 34 | l1=4.0 35 | l2=2.0 36 | p=0.15 37 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2 38 | 39 | l1=0.1 40 | l2=5.0 41 | p=0.2 42 | add_edge_rate=0.5 43 | drop_feat_rate=0.7 44 | d_1=0.1 45 | d_2=0.1 46 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2 47 | 48 | 49 | l1=0.5 50 | l2=5.0 51 | p=0.25 52 | add_edge_rate=0.5 53 | drop_feat_rate=0.3 54 | d_1=0.1 55 | d_2=0.2 56 | lr=0.01 57 | tau=0.4 58 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2 59 | 60 | 61 | dataset=cora 62 | attack=nettack 63 | n_subgraph=3000 64 | lr=0.01 65 | wd=0.0001 66 | d_1=0.1 67 | d_2=0.3 68 | d_3=0.0 69 | add_edge_rate=0.7 70 | drop_feat_rate=0.7 71 | knn=10 72 | tau=0.4 73 | python main.py --embedder $embedder --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node 74 | for p in 1 2 3 4 5 75 | do 76 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack 
--device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node 77 | done 78 | 79 | attack=random 80 | n_subgraph=3000 81 | lr=0.01 82 | wd=0.01 83 | d_1=0.2 84 | d_2=0.1 85 | d_3=0.0 86 | add_edge_rate=0.1 87 | drop_feat_rate=0.5 88 | knn=10 89 | tau=0.4 90 | python main.py --embedder $embedder --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node 91 | for p in 0.2 0.4 0.6 0.8 1.0 92 | do 93 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node 94 | done 95 | 96 | 97 | ########## citeseer ############## 98 | dataset=citeseer 99 | attack=meta 100 | n_subgraph=3000 101 | lr=0.01 102 | wd=0.00001 103 | d_1=0.2 104 | d_2=0.1 105 | d_3=0.0 106 | add_edge_rate=0.1 107 | drop_feat_rate=0.9 108 | knn=10 109 | tau=0.6 110 | l1=4.0 111 | l2=3.0 112 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 113 | 114 | 115 | l1=2.0 116 | l2=5.0 117 | p=0.05 118 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 119 | 120 | l1=2.0 121 | l2=2.0 122 | p=0.1 123 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 124 | 125 | 126 | l1=2.0 127 | l2=2.0 128 | p=0.15 129 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 130 | 131 | 132 | l1=4.0 133 | l2=5.0 134 | p=0.2 135 | add_edge_rate=0.7 136 | drop_feat_rate=0.9 137 | d_1=0.2 138 | d_2=0.1 139 | lr=0.01 140 | tau=0.6 141 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 142 | 143 | 144 | l1=2.0 
145 | l2=5.0 146 | p=0.25 147 | add_edge_rate=0.9 148 | drop_feat_rate=0.7 149 | d_1=0.2 150 | d_2=0.1 151 | lr=0.01 152 | tau=0.8 153 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 154 | 155 | 156 | attack=nettack 157 | n_subgraph=3000 158 | lr=0.001 159 | wd=0.01 160 | d_1=0.1 161 | d_2=0.1 162 | d_3=0.0 163 | add_edge_rate=0.7 164 | drop_feat_rate=0.9 165 | knn=5 166 | tau=0.6 167 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 168 | for p in 1 2 3 4 5 169 | do 170 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 171 | done 172 | 173 | attack=random 174 | n_subgraph=3000 175 | lr=0.001 176 | wd=0.00001 177 | d_1=0.1 178 | d_2=0.4 179 | d_3=0.0 180 | add_edge_rate=0.5 181 | drop_feat_rate=0.7 182 | knn=5 183 | tau=0.8 184 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 185 | for p in 0.2 0.4 0.6 0.8 1.0 186 | do 187 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 188 | done 189 | 190 | 191 | ########## pubmed ############## 192 | dataset=pubmed 193 | attack=meta 194 | n_subgraph=1000 195 | lr=0.001 196 | wd=0.00001 197 | d_1=0.3 198 | d_2=0.2 199 | d_3=0.0 200 | add_edge_rate=0.5 201 | drop_feat_rate=0.7 202 | knn=10 203 | tau=0.4 204 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 205 | for p in 0.05 0.1 0.15 0.2 0.25 206 | do 207 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 208 | done 209 | 210 | attack=nettack 211 | n_subgraph=1000 212 | lr=0.005 213 | wd=0.01 214 | d_1=0.5 215 | d_2=0.5 216 | d_3=0.0 217 | add_edge_rate=0.5 218 | drop_feat_rate=0.0 219 | knn=10 220 | tau=0.8 221 | python main.py --embedder $embedder --dataset $dataset 
--task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 222 | for p in 1 2 3 4 5 223 | do 224 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 225 | done 226 | 227 | attack=random 228 | n_subgraph=1000 229 | lr=0.001 230 | wd=0.00001 231 | d_1=0.4 232 | d_2=0.1 233 | d_3=0.0 234 | add_edge_rate=0.3 235 | drop_feat_rate=0.0 236 | knn=10 237 | tau=0.4 238 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 239 | for p in 0.2 0.4 0.6 0.8 1.0 240 | do 241 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 242 | done 243 | 244 | 245 | ## photo 246 | attack=meta 247 | dataset=photo 248 | n_subgraph=5000 249 | lr=0.01 250 | wd=0.00001 251 | d_1=0.3 252 | d_2=0.2 253 | d_3=0.0 254 | add_edge_rate=0.1 255 | drop_feat_rate=0.0 256 | knn=10 257 | tau=0.4 258 | seed_n=10 259 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 260 | for p in 0.05 0.1 0.15 0.2 0.25 261 | do 262 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 263 | done 264 | 265 | 266 | ## computers 267 | attack=meta 268 | dataset=computers 269 | n_subgraph=5000 270 | lr=0.01 271 | wd=0.01 272 | d_1=0.3 273 | d_2=0.2 274 | d_3=0.0 275 | add_edge_rate=0.1 276 | drop_feat_rate=0.0 277 | knn=10 278 | tau=0.4 279 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 280 | for p in 0.05 0.1 0.15 0.2 0.25 281 | do 282 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 283 | done 284 | 285 | ## cs 286 | attack=meta 287 | dataset=cs 288 | 
n_subgraph=5000 289 | lr=0.01 290 | wd=0.001 291 | d_1=0.3 292 | d_2=0.2 293 | d_3=0.0 294 | add_edge_rate=0.7 295 | drop_feat_rate=0.0 296 | knn=10 297 | tau=0.4 298 | 299 | l1=0.5 300 | l2=5.0 301 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 302 | 303 | p=0.05 304 | l1=0.5 305 | l2=2.0 306 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 307 | 308 | p=0.1 309 | l1=0.1 310 | l2=5.0 311 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 312 | 313 | p=0.15 314 | l1=0.5 315 | l2=5.0 316 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 317 | 318 | p=0.2 319 | l1=0.5 320 | l2=5.0 321 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 322 | 323 | p=0.25 324 | l1=0.5 325 | l2=5.0 326 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2 327 | 328 | ## physics 329 | attack=meta 330 | dataset=physics 331 | n_subgraph=5000 332 | lr=0.01 333 | wd=0.01 334 | d_1=0.3 335 | d_2=0.2 336 | d_3=0.0 337 | add_edge_rate=0.1 338 | drop_feat_rate=0.0 339 | knn=10 340 | tau=0.4 341 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 342 | for p in 0.05 0.1 0.15 0.2 0.25 343 | do 344 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau 345 | done 346 | 
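347 |
348 | # Note on --ptb_rate: for the meta and random attacks above it is the fraction
349 | # of perturbed edges (0.05-0.25 and 0.2-1.0, respectively), while for nettack
350 | # it appears to be the number of structural perturbations per targeted node (1-5).
351 | # A single configuration can also be run on its own, e.g. (illustrative values
352 | # taken from the citeseer/meta block above; omitted flags take their main.py defaults):
353 | # python main.py --embedder SPAGCL_node --dataset citeseer --task node \
354 | #     --attack_type poison --attack meta --ptb_rate 0.05 --lambda_1 2.0 --lambda_2 5.0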
--------------------------------------------------------------------------------
/sh/save_emb.sh:
--------------------------------------------------------------------------------
1 | seed_n=10
2 | attack=meta
3 | device=0
4 | for dataset in cora citeseer pubmed photo computers cs physics
5 | do
6 |     for embedder in SPAGCL_node
7 |     do
8 |         python main.py --embedder $embedder --dataset $dataset --task node --save_embed --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs 0
9 |         for p in 0.05 0.1 0.15 0.2 0.25
10 |         do
11 |             python main.py --embedder $embedder --dataset $dataset --task node --save_embed --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs 0
12 |         done
13 |     done
14 | done
--------------------------------------------------------------------------------
/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .data import get_data
2 | from .utils import *
3 | from .transforms import get_graph_drop_transform
--------------------------------------------------------------------------------
/utils/create_lp_data.py:
--------------------------------------------------------------------------------
1 | # Modified from https://github.com/pcy1302/asp2vec/blob/master/src/create_dataset.py
2 | import pickle as pkl
3 | import copy
4 | import random
5 | import networkx as nx
6 | import numpy
7 | import os
8 | import sys
9 | import argparse
10 | import pandas as pd
11 |
12 |
13 | # sys.path.append('../')
14 | from data import get_data
15 | from torch_geometric.utils import to_networkx, from_networkx
16 | import torch
17 |
18 |
19 | random.seed(1995)
20 |
21 | def parse_args():
22 |     # input arguments
23 |     parser = argparse.ArgumentParser(description='preprocess')
24 |
25 |     parser.add_argument('--dataset', type=str, default='cora')
26 |     parser.add_argument('--remove_percent', type=float, default=0.5)
27 |
28 |     return parser.parse_known_args()
29 |
30 | def LargestSubgraph(graph):
31 |     """Returns the largest connected component of `graph`."""
32 |     if graph.__class__ == nx.Graph:
33 |         return LargestUndirectedSubgraph(graph)
34 |     elif graph.__class__ == nx.DiGraph:
35 |         largest_undirected_cc = LargestUndirectedSubgraph(nx.Graph(graph))
36 |         directed_subgraph = nx.DiGraph()
37 |         for (n1, n2) in graph.edges():
38 |             if n2 in largest_undirected_cc and n1 in largest_undirected_cc[n2]:
39 |                 directed_subgraph.add_edge(n1, n2)
40 |
41 |         return directed_subgraph
42 |
43 | def connected_component_subgraphs(G):
44 |     for c in nx.connected_components(G):
45 |         yield G.subgraph(c)
46 |
47 | def LargestUndirectedSubgraph(graph):
48 |     """Returns the largest connected component of undirected `graph`."""
49 |     if nx.is_connected(graph):
50 |         return graph
51 |
52 |     # nx.connected_component_subgraphs was removed in networkx 2.4; use the local helper instead
53 |     cc = list(connected_component_subgraphs(graph))
54 |     sizes = [len(c) for c in cc]
55 |     max_idx = sizes.index(max(sizes))
56 |     return cc[max_idx]
57 |
58 |
59 |
60 |
61 | def SampleTestEdgesAndPruneGraph(graph, remove_percent, check_every=5):
62 |     """Removes and returns `remove_percent` of edges from graph.
63 |     Removal is random but makes sure the graph stays connected."""
64 |     graph = copy.deepcopy(graph)
65 |     undirected_graph = graph.to_undirected()
66 |
67 |     edges = list(copy.deepcopy(graph.edges()))
68 |     random.shuffle(edges)
69 |     remove_edges = int(len(edges) * remove_percent)
70 |     num_edges_removed = 0
71 |     currently_removing_edges = []
72 |     removed_edges = []
73 |     last_printed_prune_percentage = -1
74 |     for j in range(len(edges)):
75 |         n1, n2 = edges[j]
76 |         graph.remove_edge(n1, n2)
77 |         if n1 not in graph[n2]:
78 |             undirected_graph.remove_edge(*(edges[j]))
79 |             currently_removing_edges.append(edges[j])
80 |         if j % check_every == 0:
81 |             if nx.is_connected(undirected_graph):
82 |                 num_edges_removed += check_every
83 |                 removed_edges += currently_removing_edges
84 |                 currently_removing_edges = []
85 |             else:  # roll back the last batch of removals to restore connectivity
86 |                 for i in range(check_every):
87 |                     graph.add_edge(*(edges[j - i]))
88 |                     undirected_graph.add_edge(*(edges[j - i]))
89 |                 currently_removing_edges = []
90 |                 if not nx.is_connected(undirected_graph):
91 |                     print(' DID NOT RECOVER :(')
92 |                     return None
93 |             pruned_percentage = int(100 * len(removed_edges) / remove_edges)
94 |             rounded = (pruned_percentage // 10) * 10  # integer division: report progress in 10% steps
95 |             if rounded != last_printed_prune_percentage:
96 |                 last_printed_prune_percentage = rounded
97 |                 # print('Partitioning into train/test. Progress=%i%%' % rounded)
98 |
99 |         if len(removed_edges) >= remove_edges:
100 |             break
101 |
102 |     return graph, removed_edges
103 |
104 | def SampleNegativeEdges(graph, num_edges):
105 |     """Samples `num_edges` edges from the complement of `graph`."""
106 |     random_negatives = set()
107 |     nodes = list(graph.nodes())
108 |     while len(random_negatives) < num_edges:
109 |         i1 = random.randint(0, len(nodes) - 1)
110 |         i2 = random.randint(0, len(nodes) - 1)
111 |         if i1 == i2:
112 |             continue
113 |         if i1 > i2:
114 |             i1, i2 = i2, i1
115 |         n1 = nodes[i1]
116 |         n2 = nodes[i2]
117 |         if graph.has_edge(n1, n2):
118 |             continue
119 |         random_negatives.add((n1, n2))
120 |
121 |     return random_negatives
122 |
123 |
124 | def RandomNegativesPerNode(graph, test_nodes_PerNode, negatives_per_node=499):
125 |     """For every node u in `test_nodes_PerNode`, samples `negatives_per_node` nodes v with no edge (u, v)."""
126 |     node_list = list(graph.nodes())
127 |     num_nodes = len(node_list)
128 |     for n in test_nodes_PerNode:
129 |         found_negatives = 0
130 |         while found_negatives < negatives_per_node:
131 |             n2 = node_list[random.randint(0, num_nodes - 1)]
132 |             if n == n2 or n2 in graph[n]:
133 |                 continue
134 |             test_nodes_PerNode[n].append(n2)
135 |             found_negatives += 1
136 |
137 |     return test_nodes_PerNode
138 |
139 |
140 | def NumberNodes(graph):
141 |     """Returns a copy of `graph` where nodes are replaced by incremental ints."""
142 |     node_list = sorted(graph.nodes())
143 |     index = {n: i for (i, n) in enumerate(node_list)}
144 |
145 |     newgraph = graph.__class__()
146 |     for (n1, n2) in graph.edges():
147 |         newgraph.add_edge(index[n1], index[n2])
148 |
149 |     return newgraph, index
150 |
151 |
152 |
153 | def MakeDirectedNegatives(positive_edges):
154 |     positive_set = set([(u, v) for (u, v) in list(positive_edges)])
155 |     directed_negatives = []
156 |     for (u, v) in positive_set:
157 |         if (v, u) not in positive_set:
158 |             directed_negatives.append((v, u))
159 |     return numpy.array(directed_negatives, dtype='int32')
160 |
161 | def CreateDatasetFiles(graph, output_dir, remove_percent, partition=True):
162 |     """Builds link-prediction train/test splits from `graph`.
163 |     Args:
164 |       graph: nx.Graph or nx.DiGraph to split and to sample negatives from.
165 |       output_dir: this directory is created if it does not exist. Note that
166 |         this function itself writes no files; `main()` saves the returned
167 |         dictionary via `torch.save`.
168 |       partition: If set, the largest connected component of the graph is
169 |         used and its edges are separated into train/test splits; otherwise
170 |         all edges are kept for training.
171 |     Returns:
172 |       A dict with keys 'train_edges', 'train_edges_neg', 'train_label',
173 |       'test_edges', 'test_edges_neg', and 'test_label' holding edge
174 |       tensors, evaluation-only negatives, and binary labels. Nodes are
175 |       renumbered to consecutive integers via `NumberNodes`.
176 |     """
177 |
178 |
179 |     if not os.path.exists(output_dir):
180 |         os.makedirs(output_dir)
181 |
182 |     original_size = len(graph)
183 |     if partition:
184 |         graph = LargestSubgraph(graph)
185 |         size_largest_cc = len(graph)
186 |     else:
187 |         size_largest_cc = -1
188 |     graph, index = NumberNodes(graph)
189 |
190 |     if partition:
191 |         print("Generate dataset for link prediction")
192 |         # For link prediction (50%:50%)
193 |         train_graph, test_edges = SampleTestEdgesAndPruneGraph(graph, remove_percent)
194 |
195 |     else:
196 |         train_graph, test_edges = graph, []
197 |
198 |     assert len(graph) == len(train_graph)
199 |
200 |     # Sample one negative per test edge plus one per training edge.
201 |     random_negatives = list(SampleNegativeEdges(graph, len(test_edges) + len(train_graph.edges())))
202 |     random.shuffle(random_negatives)
203 |     test_negatives = random_negatives[:len(test_edges)]
204 |     # These are only used for evaluation, never training.
205 |     train_eval_negatives = random_negatives[len(test_edges):]
206 |
207 |     test_negatives = torch.from_numpy(numpy.array(test_negatives, dtype='int32')).long()
208 |     test_edges = torch.from_numpy(numpy.array(test_edges, dtype='int32')).long()
209 |     train_edges = torch.from_numpy(numpy.array(train_graph.edges(), dtype='int32')).long()
210 |     train_eval_negatives = torch.from_numpy(numpy.array(train_eval_negatives, dtype='int32')).long()
211 |
212 |     train_label = torch.cat((torch.ones(train_edges.size(0)), torch.zeros(train_eval_negatives.size(0))))
213 |     test_label = torch.cat((torch.ones(test_edges.size(0)), torch.zeros(test_negatives.size(0))))
214 |
215 |     data = {'train_edges': train_edges, 'train_edges_neg': train_eval_negatives, 'train_label': train_label,
216 |             'test_edges': test_edges, 'test_edges_neg': test_negatives, 'test_label': test_label}
217 |
218 |     print("Size of train_edges: {}".format(len(train_edges)))
219 |     print("Size of train_eval_negatives: {}".format(len(train_eval_negatives)))
220 |     print("Size of test_edges: {}".format(len(test_edges)))
221 |     print("Size of test_edges_neg: {}".format(len(test_negatives)))
222 |
223 |     return data
224 |
225 | def main():
226 |     args, unknown = parse_args()
227 |     print(args)
228 |
229 |     folder_dataset = f'./dataset/link'
230 |
231 |     data = get_data('./dataset/', args.dataset, attack='meta', ptb_rate=0.0)[0]
232 |     graph = to_networkx(data)
233 |
234 |     # Create dataset files.
235 | print("Create {} dataset (Remove percent: {})".format(args.dataset, args.remove_percent)) 236 | data_dict = CreateDatasetFiles(graph, folder_dataset, args.remove_percent) 237 | 238 | torch.save(data_dict, f'{folder_dataset}/{args.dataset}_link.pt') 239 | 240 | 241 | if __name__ == '__main__': 242 | main() 243 | -------------------------------------------------------------------------------- /utils/data.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import torch 3 | from torch_geometric.data import Data 4 | from torch_geometric.utils import to_undirected, dense_to_sparse 5 | from deeprobust.graph.data import Dataset, PrePtbDataset 6 | import json 7 | import scipy.sparse as sp 8 | from deeprobust.graph.global_attack import Random 9 | 10 | def get_data(root, name, attack, ptb_rate): 11 | if name in ['cora', 'citeseer', 'pubmed']: 12 | data = Dataset(root=root, name=name, setting='prognn') 13 | adj, features, labels = data.adj, data.features, data.labels 14 | idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test 15 | 16 | dataset = Data() 17 | 18 | dataset.x = torch.from_numpy(features.toarray()).float() 19 | dataset.y = torch.from_numpy(labels).long() 20 | dataset.edge_index = dense_to_sparse(torch.from_numpy(adj.toarray()))[0].long() 21 | 22 | dataset.train_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_train)).bool() 23 | dataset.val_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_val)).bool() 24 | dataset.test_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_test)).bool() 25 | 26 | if attack == 'meta': 27 | if ptb_rate == 0.0: 28 | return [dataset] 29 | else: 30 | perturbed_data = PrePtbDataset(root=root, 31 | name=name, 32 | attack_method=attack, 33 | ptb_rate=ptb_rate) 34 | perturbed_adj = perturbed_data.adj 35 | dataset.edge_index = dense_to_sparse(torch.from_numpy(perturbed_adj.toarray()))[0].long() 36 | return [dataset] 37 | elif attack == 'nettack': 38 | if ptb_rate == 0.0: 39 | with open(f'{root}{name}_nettacked_nodes.json') as json_file: 40 | ptb_idx = json.load(json_file) 41 | idx_test_att = ptb_idx['attacked_test_nodes'] 42 | dataset.test_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_test_att)).bool() 43 | else: 44 | perturbed_adj = sp.load_npz(f'{root}{name}_nettack_adj_{int(ptb_rate)}.0.npz') 45 | with open(f'{root}{name}_nettacked_nodes.json') as json_file: 46 | ptb_idx = json.load(json_file) 47 | 48 | dataset.edge_index = dense_to_sparse(torch.from_numpy(perturbed_adj.toarray()))[0].long() 49 | idx_test_att = ptb_idx['attacked_test_nodes'] 50 | dataset.test_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_test_att)).bool() 51 | 52 | return [dataset] 53 | 54 | elif attack == 'random': 55 | if ptb_rate == 0.0: 56 | return [dataset] 57 | attacker = Random() 58 | n_perturbations = int(ptb_rate * (dataset.edge_index.shape[1]//2)) 59 | attacker.attack(adj, n_perturbations, type='add') 60 | perturbed_adj = attacker.modified_adj 61 | dataset.edge_index = dense_to_sparse(torch.from_numpy(perturbed_adj.toarray()))[0].long() 62 | return [dataset] 63 | 64 | else: 65 | if name in ['photo', 'computers', 'cs', 'physics']: 66 | from torch_geometric.datasets import Planetoid, Amazon, Coauthor 67 | # if name in ['cora_pyg', 'citeseer_pyg']: 68 | # data = Planetoid(root=root, name=name.split('_')[0])[0] 69 | if name in ['photo', 'computers']: 70 | data = Amazon(root=root, name=name)[0] 71 | if ptb_rate > 0: 72 | edge = 
torch.load(f'{root}{name}_{attack}_adj_{ptb_rate}.pt').cpu()
73 |                 data.edge_index = edge
74 |             else:
75 |                 data.edge_index = to_undirected(data.edge_index)
76 |             data = create_masks(data)
77 |         elif name in ['cs', 'physics']:
78 |             data = Coauthor(root=root, name=name)[0]
79 |             if ptb_rate > 0:
80 |                 edge = torch.load(f'{root}{name}_{attack}_adj_{ptb_rate}.pt').cpu()
81 |                 data.edge_index = edge
82 |             else:
83 |                 data.edge_index = to_undirected(data.edge_index)
84 |             data = create_masks(data)
85 |
86 |     elif name in ['squirrel', 'chameleon', 'actor', 'cornell', 'wisconsin', 'texas']:
87 |         from torch_geometric.datasets import WikipediaNetwork, Actor, WebKB
88 |         if name in ['squirrel', 'chameleon']:
89 |             data = WikipediaNetwork(root=root, name=name)[0]
90 |             data.edge_index = to_undirected(data.edge_index)
91 |             data = create_masks(data)
92 |         if name in ['actor']:
93 |             data = Actor(root=root)[0]
94 |             data.edge_index = to_undirected(data.edge_index)
95 |             data = create_masks(data)
96 |         if name in ['cornell', 'wisconsin', 'texas']:
97 |             data = WebKB(root=root, name=name)[0]
98 |             data.edge_index = to_undirected(data.edge_index)
99 |             data = create_masks(data)
100 |
101 |     return [data]
102 |
103 |
104 | def create_masks(data):
105 |     """
106 |     Randomly splits data into training, validation, and test sets if it is
107 |     not already split. Each split is associated with a mask vector, which
108 |     specifies the indices for that split. The data will be modified in-place.
109 |     :param data: Data object
110 |     :return: The modified data
111 |     """
112 |     tr = 0.1   # train fraction (the remainder after val/test)
113 |     vl = 0.1
114 |     tst = 0.8
115 |     if not hasattr(data, "val_mask"):
116 |         _train_mask = _val_mask = _test_mask = None
117 |
118 |         for i in range(20):  # 20 independent random splits
119 |             labels = data.y.numpy()
120 |             dev_size = int(labels.shape[0] * vl)
121 |             test_size = int(labels.shape[0] * tst)
122 |
123 |             perm = np.random.permutation(labels.shape[0])
124 |             test_index = perm[:test_size]
125 |             dev_index = perm[test_size:test_size + dev_size]
126 |
127 |             data_index = np.arange(labels.shape[0])
128 |             test_mask = torch.tensor(np.in1d(data_index, test_index), dtype=torch.bool)
129 |             dev_mask = torch.tensor(np.in1d(data_index, dev_index), dtype=torch.bool)
130 |             train_mask = ~(dev_mask + test_mask)
131 |             test_mask = test_mask.reshape(1, -1)
132 |             dev_mask = dev_mask.reshape(1, -1)
133 |             train_mask = train_mask.reshape(1, -1)
134 |
135 |             if _train_mask is None:
136 |                 _train_mask = train_mask
137 |                 _val_mask = dev_mask
138 |                 _test_mask = test_mask
139 |
140 |             else:
141 |                 _train_mask = torch.cat((_train_mask, train_mask), dim=0)
142 |                 _val_mask = torch.cat((_val_mask, dev_mask), dim=0)
143 |                 _test_mask = torch.cat((_test_mask, test_mask), dim=0)
144 |
145 |         data.train_mask = _train_mask.squeeze()
146 |         data.val_mask = _val_mask.squeeze()
147 |         data.test_mask = _test_mask.squeeze()
148 |
149 |     elif hasattr(data, "val_mask") and len(data.val_mask.shape) == 1:
150 |         data.train_mask = data.train_mask.T
151 |         data.val_mask = data.val_mask.T
152 |         data.test_mask = data.test_mask.T
153 |
154 |     else:
155 |         num_folds = torch.min(torch.tensor(data.train_mask.size())).item()
156 |         data.train_mask = data.train_mask.T
157 |         data.val_mask = data.val_mask.T
158 |         if len(data.test_mask.size()) == 1:
159 |             data.test_mask = data.test_mask.unsqueeze(0).expand(num_folds, -1)
160 |         else:
161 |             data.test_mask = data.test_mask.T
162 |
163 |     return data
164 |
165 |
--------------------------------------------------------------------------------
/utils/transforms.py:
--------------------------------------------------------------------------------
1 | import copy
2 |
3 | import torch
4 | from torch_geometric.utils.dropout import dropout_adj
5 | from torch_geometric.utils import add_self_loops
6 | from torch_geometric.transforms import Compose
7 |
8 |
9 | class DropFeatures:
10 |     r"""Drops node features with probability p."""
11 |     def __init__(self, p):
12 |         assert 0. < p < 1., 'Dropout probability has to be between 0 and 1, but got %.2f' % p
13 |         self.p = p
14 |
15 |     def __call__(self, data):
16 |         drop_mask = torch.empty((data.x.size(1),), dtype=torch.float32, device=data.x.device).uniform_(0, 1) < self.p
17 |         data.x[:, drop_mask] = 0
18 |         return data
19 |
20 |     def __repr__(self):
21 |         return '{}(p={})'.format(self.__class__.__name__, self.p)
22 |
23 |
24 | class DropEdges:
25 |     r"""Drops edges with probability p."""
26 |     def __init__(self, p, force_undirected=False):
27 |         assert 0. < p < 1., 'Dropout probability has to be between 0 and 1, but got %.2f' % p
28 |
29 |         self.p = p
30 |         self.force_undirected = force_undirected
31 |
32 |     def __call__(self, data):
33 |         edge_index = data.edge_index
34 |         edge_attr = data.edge_attr if 'edge_attr' in data else None
35 |
36 |         edge_index, edge_attr = dropout_adj(edge_index, edge_attr, p=self.p, force_undirected=self.force_undirected)
37 |         # edge_index = add_self_loops(edge_index)
38 |
39 |         data.edge_index = edge_index
40 |         if edge_attr is not None:
41 |             data.edge_attr = edge_attr
42 |         return data
43 |
44 |     def __repr__(self):
45 |         return '{}(p={}, force_undirected={})'.format(self.__class__.__name__, self.p, self.force_undirected)
46 |
47 |
48 | def get_graph_drop_transform(drop_edge_p, drop_feat_p):
49 |     transforms = list()
50 |
51 |     # make copy of graph
52 |     transforms.append(copy.deepcopy)
53 |
54 |     # drop edges
55 |     if drop_edge_p > 0.:
56 |         transforms.append(DropEdges(drop_edge_p))
57 |
58 |     # drop features
59 |     if drop_feat_p > 0.:
60 |         transforms.append(DropFeatures(drop_feat_p))
61 |     return Compose(transforms)
62 |
--------------------------------------------------------------------------------
/utils/utils.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import random, os
3 | import numpy as np
4 | import torch.nn.functional as F
5 | from torch_geometric.utils.sparse import dense_to_sparse
6 | from torch_geometric.utils import to_dense_adj, add_self_loops
7 |
8 | def to_numpy(tensor):
9 |     return tensor.detach().cpu().numpy()
10 |
11 | def dense_to_sparse_adj(edge_index, n_node):
12 |     return torch.sparse.FloatTensor(edge_index,
13 |                                     torch.ones(edge_index.shape[1]).to(edge_index.device),
14 |                                     [n_node, n_node])
15 |
16 | def dense_to_sparse_x(feat_index, n_node, n_dim):
17 |     return torch.sparse.FloatTensor(feat_index,
18 |                                     torch.ones(feat_index.shape[1]).to(feat_index.device),
19 |                                     [n_node, n_dim])
20 |
21 | def to_dense_subadj(edge_index, subsize):
22 |     edge = add_self_loops(edge_index, num_nodes=subsize)[0]
23 |     return to_dense_adj(edge)[0].fill_diagonal_(0.0)
24 |
25 | def set_cuda_device(device_num):
26 |     if torch.cuda.is_available():
27 |         device = torch.device(f'cuda:{device_num}')  # availability already checked above
28 |         torch.cuda.set_device(device)
29 |
30 | def enumerateConfig(args):
31 |     args_names = []
32 |     args_vals = []
33 |     for arg in vars(args):
34 |         args_names.append(arg)
35 |         args_vals.append(getattr(args, arg))
36 |
37 |     return args_names, args_vals
38 |
39 | def config2string(args):
40 |     args_names, args_vals = enumerateConfig(args)
41 |     st = ''
42 |     for name, val in zip(args_names, args_vals):
43 |         if val is False:  # skip disabled boolean flags (numeric zeros are kept)
44 |             continue
45 |
46 |         if name not in ['device', 'patience', 'epochs', 'save_dir', 'in_dim', 'n_class', 'best_epoch', 'save_fig', 'n_node', 'n_degree', 'attack', 'attack_type', 'ptb_rate', 'verbose', 'mm']:
47 |             st_ = "{}:{} / ".format(name, val)
48 |             st += st_
49 |
50 |
51 |     return st[:-3]  # drop the trailing ' / ' separator
52 |
53 | def set_everything(seed=42):
54 |     random.seed(seed)
55 |     os.environ['PYTHONHASHSEED'] = str(seed)
56 |     np.random.seed(seed)
57 |     torch.manual_seed(seed)
58 |     torch.cuda.manual_seed_all(seed)
59 |     torch.backends.cudnn.deterministic = True
60 |     # torch.autograd.set_detect_anomaly(True)
61 |     torch.backends.cudnn.benchmark = False
62 |     os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # make CUDA errors surface synchronously
63 |
64 | def ensure_dir(file_path):
65 |     directory = os.path.dirname(file_path)
66 |     if not os.path.exists(directory):
67 |         os.makedirs(directory)
68 |
--------------------------------------------------------------------------------
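For reference, the link-prediction splits are produced by `utils/create_lp_data.py` above. A minimal invocation sketch (the flag names and defaults are taken from its `parse_args`; because the script does `from data import get_data`, it assumes `utils/` is on the import path, so you may need to run it from inside `utils/` or adjust `sys.path` first):

```
python utils/create_lp_data.py --dataset cora --remove_percent 0.5
# saves ./dataset/link/cora_link.pt with a 50%:50% train/test edge split
```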