├── README.md
├── attack_maker
│   ├── generate_metattack.py
│   └── generate_ptb.py
├── clustering.py
├── create_env.sh
├── embedder.py
├── encoder
│   ├── __init__.py
│   └── gnn.py
├── figs
│   ├── overall_architecure.jpg
│   └── overall_architecure.pdf
├── main.py
├── models
│   ├── SPAGCL_link.py
│   ├── SPAGCL_node.py
│   └── __init__.py
├── sh
│   ├── clustering.sh
│   ├── hetero_node.sh
│   ├── link.sh
│   ├── metattack_maker.sh
│   ├── node.sh
│   └── save_emb.sh
└── utils
    ├── __init__.py
    ├── create_lp_data.py
    ├── data.py
    ├── transforms.py
    └── utils.py
/README.md:
--------------------------------------------------------------------------------
1 | # Similarity-Preserving Adversarial Graph Contrastive Learning (SP-AGCL)
2 |
11 | The official source code for [**Similarity-Preserving Adversarial Graph Contrastive Learning**](https://arxiv.org/abs/2306.13854) at KDD 2023.
12 |
13 | Yeonjun In*, [Kanghoon Yoon*](https://kanghoonyoon.github.io/), and [Chanyoung Park](http://dsail.kaist.ac.kr/professor/)
14 |
15 | ## Abstract
16 | Adversarial attacks on a graph refer to imperceptible perturbations of the graph structure and node features, and it is well known that GNN models are vulnerable to such attacks. Among various GNN models, graph contrastive learning (GCL) based methods suffer particularly from adversarial attacks due to their inherent design, which depends heavily on self-supervision signals derived from the original graph; these signals, however, already contain noise once the graph is attacked. Existing adversarial GCL methods apply adversarial training (AT) to the GCL framework to address adversarial attacks on graphs. By treating the attacked graph as an augmentation under the GCL framework, they achieve adversarial robustness against graph structural attacks. However, we find that existing adversarially trained GCL methods achieve robustness at the expense of being unable to preserve node similarity in terms of node features, an unexpected consequence of applying AT to GCL models. In this paper, we propose a similarity-preserving adversarial graph contrastive learning (SP-AGCL) framework that contrasts the clean graph with two auxiliary views of different properties (i.e., the node similarity-preserving view and the adversarial view). Extensive experiments demonstrate that SP-AGCL achieves competitive performance on several downstream tasks and is effective in various scenarios, e.g., networks with adversarial attacks, noisy labels, and heterophilous neighbors.
17 |
18 | ## Overall Architecture
19 |
20 | ![Overall architecture of SP-AGCL](figs/overall_architecure.jpg)
21 |
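In short, each training step contrasts two stochastic augmentations of a sampled subgraph with each other, with a feature-similarity kNN view (similarity-preserving), and with a gradient-guided adversarial view. A minimal sketch of how the three contrastive terms are combined after the warmup phase, mirroring the loss in `models/SPAGCL_node.py` (the function name and argument names here are illustrative):

```
# Sketch only: mirrors the post-warmup loss combination in models/SPAGCL_node.py.
# contrast(a, b) stands for the InfoNCE-style loss implemented by modeler.loss.
def spagcl_loss(contrast, z1, z2, z_adv, z_knn, lambda_1=2.0, lambda_2=2.0):
    # z1, z2: embeddings of two stochastic augmentations of the sampled subgraph
    # z_adv:  embedding of the adversarial view (gradient-guided edge/feature flips)
    # z_knn:  embedding of the feature-similarity kNN view
    return (contrast(z1, z2) * 0.5
            + contrast(z1, z_adv) * lambda_1 * 0.5
            + contrast(z1, z_knn) * lambda_2 * 0.5)
```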
24 | ### Requirements
25 | * Python version: 3.7.11
26 | * PyTorch version: 1.10.2
27 | * torch-geometric version: 2.0.3
28 | * deeprobust
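* To sanity-check an installed environment against these versions, a short snippet (illustrative, not part of this repo) is:

```
# Quick version check (illustrative, not part of the repo).
import sys
import torch
import torch_geometric
import deeprobust  # should import without error

print('python', sys.version.split()[0])       # expect 3.7.x
print('torch ', torch.__version__)            # expect 1.10.2
print('pyg   ', torch_geometric.__version__)  # expect 2.0.3
```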
29 |
30 | ### How to Run
31 | * To run node classification (reproduces Table 1 in the paper, and Tables 2 and 3 in the appendix)
32 |
33 | ```
34 | sh sh/node.sh
35 | ```
36 | * To run link prediction (reproduces Figure 3(b) in the paper)
37 |
38 | ```
39 | sh sh/link.sh
40 | ```
41 | * To run node clustering (reproduces Figure 3(c) in the paper)
42 | * Run node classification before node clustering, since clustering reuses the embeddings learned during node classification.
43 |
44 | ```
45 | sh sh/save_emb.sh # save node embedding from the best model of node classification
46 | sh sh/clustering.sh
47 | ```
48 | * To run node classification on heterophilous networks (reproduces Table 2 in the paper)
49 |
50 | ```
51 | sh sh/hetero_node.sh
52 | ```
53 |
54 | ### Cite (BibTeX)
55 | - If you find ``SP-AGCL`` useful in your research, please cite the following paper:
56 | - Yeonjun In, Kanghoon Yoon, and Chanyoung Park. "Similarity Preserving Adversarial Graph Contrastive Learning." KDD 2023.
57 | - BibTeX
58 | ```
59 | @article{in2023similarity,
60 | title={Similarity Preserving Adversarial Graph Contrastive Learning},
61 | author={In, Yeonjun and Yoon, Kanghoon and Park, Chanyoung},
62 | journal={arXiv preprint arXiv:2306.13854},
63 | year={2023}
64 | }
65 | ```
66 |
67 |
--------------------------------------------------------------------------------
/attack_maker/generate_metattack.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import os
3 | import sys
4 | import numpy as np
5 | import torch.nn.functional as F
6 | import torch.optim as optim
7 | import scipy.sparse as sp
8 | from deeprobust.graph.defense import GCN
9 | from deeprobust.graph.global_attack import MetaApprox, Metattack
10 | from deeprobust.graph.utils import *
11 | from deeprobust.graph.data import Dataset
12 | import argparse
13 | from utils import get_data, set_everything, set_cuda_device
14 | from torch_geometric.utils import to_dense_adj
15 |
16 | parser = argparse.ArgumentParser()
17 | parser.add_argument('--device', default=6, type=int)
18 | parser.add_argument('--sub_size', default=3000, type=int)
19 | parser.add_argument('--seed', type=int, default=0, help='Random seed.')
20 | parser.add_argument('--lr', type=float, default=0.01,
21 | help='Initial learning rate.')
22 | parser.add_argument('--weight_decay', type=float, default=5e-4,
23 | help='Weight decay (L2 loss on parameters).')
24 | parser.add_argument('--hidden', type=int, default=16,
25 | help='Number of hidden units.')
26 | parser.add_argument('--dropout', type=float, default=0.5,
27 | help='Dropout rate (1 - keep probability).')
28 | parser.add_argument('--dataset', type=str, default='cs', choices=['cora', 'citeseer', 'photo', 'computers', 'cs', 'physics'], help='dataset')
29 | parser.add_argument('--ptb_rate', type=float, default=0.05, help='perturbation rate', choices=[0.05, 0.1, 0.15, 0.2, 0.25])
30 | parser.add_argument('--ptb_n', type=int, default=200)
31 | parser.add_argument('--model', type=str, default='Meta-Self', choices=['A-Meta-Self', 'Meta-Self', 'Meta-Train','A-Meta-Train'], help='model variant')
32 |
33 | args = parser.parse_args()
34 | set_everything(args.seed)
35 |
36 | save_dict = {}
37 |
38 | set_cuda_device(args.device)
39 | device = f'cuda:{args.device}'
40 | data_home = f'./dataset/'
41 | data = get_data(data_home, args.dataset, 'meta', 0.0)[0]
42 | adj_np = to_dense_adj(data.edge_index)[0].numpy().astype(np.float32)
43 | adj = sp.csr_matrix(adj_np)
44 |
45 | features = data.x.numpy().astype(np.float32)
46 | features = sp.csr_matrix(features)
47 | labels = data.y.numpy()
48 | idx_train = data.train_mask[0, :].nonzero().flatten().numpy()
49 | idx_val = data.val_mask[0, :].nonzero().flatten().numpy()
50 | idx_test = data.test_mask[0, :].nonzero().flatten().numpy()
51 | idx_unlabeled = np.union1d(idx_val, idx_test)
52 |
53 | # for seed in range(int(args.ptb_rate*100)):
54 |
55 | nodes = torch.randperm(data.x.size(0))[:args.sub_size].sort()[0].numpy()
56 | save_dict['nodes'] = nodes
57 | sub_adj = adj[nodes, :][:, nodes]
58 | sub_x = features[nodes, :]
59 | sub_y = labels[nodes]
60 |
61 | sub_idx_train = np.sort(np.in1d(nodes, idx_train).nonzero()[0])
62 | sub_idx_val = np.sort(np.in1d(nodes, idx_val).nonzero()[0])
63 | sub_idx_test = np.sort(np.in1d(nodes, idx_test).nonzero()[0])
64 | sub_idx_unlabeled = np.sort(np.in1d(nodes, idx_unlabeled).nonzero()[0])
65 |
66 | perturbations = args.ptb_n ##int(0.01 * (sub_adj.sum()//2))
67 | sub_adj, sub_x, sub_y = preprocess(sub_adj, sub_x, sub_y, preprocess_adj=False)
68 |
69 | save_dict['clean'] = sub_adj
70 | # Setup Surrogate Model
71 | surrogate = GCN(nfeat=sub_x.shape[1], nclass=sub_y.max().item()+1, nhid=16,
72 | dropout=0.5, with_relu=False, with_bias=True, weight_decay=5e-4, device=device)
73 |
74 | surrogate = surrogate.to(device)
75 | surrogate.fit(sub_x, sub_adj, sub_y, sub_idx_train)
76 |
77 | # Setup Attack Model
78 | if 'Self' in args.model:
79 | lambda_ = 0
80 | if 'Train' in args.model:
81 | lambda_ = 1
82 | if 'Both' in args.model:
83 | lambda_ = 0.5
84 |
85 | if 'A' in args.model:
86 | model = MetaApprox(model=surrogate, nnodes=sub_adj.shape[0], feature_shape=sub_x.shape, attack_structure=True, attack_features=False, device=device, lambda_=lambda_)
87 |
88 | else:
89 | model = Metattack(model=surrogate, nnodes=sub_adj.shape[0], feature_shape=sub_x.shape, attack_structure=True, attack_features=False, device=device, lambda_=lambda_)
90 |
91 | model = model.to(device)
92 |
93 | def test(adj):
94 | ''' test on GCN '''
95 |
96 | # adj = normalize_adj_tensor(adj)
97 | gcn = GCN(nfeat=sub_x.shape[1],
98 | nhid=args.hidden,
99 | nclass=sub_y.max().item() + 1,
100 | dropout=args.dropout, device=device)
101 | gcn = gcn.to(device)
102 | gcn.fit(sub_x, adj, sub_y, sub_idx_train) # train without model picking, on the adjacency passed in
103 | # gcn.fit(features, adj, labels, idx_train, idx_val) # train with validation model picking
104 | output = gcn.output.cpu()
105 | loss_test = F.nll_loss(output[sub_idx_test], sub_y[sub_idx_test])
106 | acc_test = accuracy(output[sub_idx_test], sub_y[sub_idx_test])
107 | print("Test set results:",
108 | "loss= {:.4f}".format(loss_test.item()),
109 | "accuracy= {:.4f}".format(acc_test.item()))
110 |
111 | return acc_test.item()
112 |
113 |
114 | def main():
115 | model.attack(sub_x, sub_adj, sub_y, sub_idx_train, sub_idx_unlabeled, perturbations, ll_constraint=False)
116 | print('=== testing GCN on original (clean) graph ===')
117 | test(sub_adj)
118 | modified_adj = model.modified_adj
119 | save_dict['modified'] = modified_adj
120 | # modified_features = model.modified_features
121 | test(modified_adj)
122 |
123 | # if you want to save the modified adj/features, uncomment the code below
124 | # model.save_adj(root='./perturbed_graph', name='{}_meta_adj_{}_{}'.format(args.dataset, args.ptb_rate, args.seed))
125 | # torch.save(nodes, f'./perturbed_graph/{args.dataset}_meta_{args.ptb_rate}_subnodes_{args.seed}.pt')
126 | # model.save_features(root='./', name='mod_features')
127 | torch.save(save_dict, f'./perturbed_graph/{args.dataset}_meta_seed{args.seed}.pt')
128 | if __name__ == '__main__':
129 | main()
130 |
--------------------------------------------------------------------------------
/attack_maker/generate_ptb.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import os
3 | import sys
4 | import numpy as np
5 | import torch.nn.functional as F
6 | import torch.optim as optim
7 | import scipy.sparse as sp
8 | from deeprobust.graph.defense import GCN
9 | from deeprobust.graph.global_attack import MetaApprox, Metattack
10 | from deeprobust.graph.utils import *
11 | from deeprobust.graph.data import Dataset
12 | import argparse
13 | from utils import get_data, set_everything, set_cuda_device
14 | from torch_geometric.utils import to_dense_adj, coalesce, to_undirected
15 | import pandas as pd
16 |
17 | '''
18 | photo 6000, 12000, 18000, 24000, 30000
19 | computers 12000, 24000, 36000, 48000, 60000
20 | cs 4000, 8000, 12000, 16000, 20000
21 | physics 12000, 24000, 36000, 48000, 60000
22 | '''
23 |
24 | dic = {'photo':30000//300, 'computers':60000//300, 'cs':20000//200, 'physics':60000//200} # saved perturbation files per dataset: budget at ptb_rate 0.25 divided by per-file ptb_n
25 |
26 | parser = argparse.ArgumentParser()
27 | parser.add_argument('--dataset', default='computers', type=str)
28 | parser.add_argument('--ptb_rate', default=0.05, type=float)
29 |
30 | args = parser.parse_args()
31 |
32 | data_home = f'./dataset/'
33 | data = get_data(data_home, args.dataset, 'meta', 0.0)[0]
34 | adj = to_dense_adj(data.edge_index)[0]
35 | adj_ = adj.clone()
36 | folders = os.listdir('perturbed_graph')
37 | folders = sorted([f for f in folders if args.dataset in f])
38 | idx = int(dic[args.dataset] / (0.25/args.ptb_rate)) # number of perturbation files to merge for the requested ptb_rate
39 | adjs = []
40 | for f in folders[:idx]:
41 | tmp = torch.load(f'perturbed_graph/{f}', map_location='cpu')
42 | idx2nodes = {i:n for i, n in enumerate(tmp['nodes'])}
43 | print(f, (tmp['modified'].cpu()>tmp['clean']).sum(), (tmp['modified'].cpu()<tmp['clean']).sum())
--------------------------------------------------------------------------------
/encoder/gnn.py:
--------------------------------------------------------------------------------
48 | assert len(layer_sizes) >= 2
49 | self.gcn_module = gcn_module
50 | self.input_size, self.representation_size = layer_sizes[0], layer_sizes[-1]
51 | self.weight_standardization = weight_standardization
52 |
53 | total_layers = []
54 | for in_dim, out_dim in zip(layer_sizes[:-1], layer_sizes[1:]):
55 | layers = []
56 | layers.append((self.gcn_module(in_dim, out_dim), 'x, edge_index -> x'),)
57 |
58 | if batchnorm:
59 | layers.append(BatchNorm(out_dim, momentum=batchnorm_mm))
60 | # else:
61 | # layers.append(LayerNorm(out_dim))
62 |
63 | layers.append(nn.PReLU())
64 | total_layers.append(Sequential('x, edge_index', layers))
65 |
66 | self.model = nn.ModuleList(total_layers)
67 | # self.model = Sequential('x, edge_index, perturb', layers)
68 |
69 | def forward(self, x, adj, perturb=None):
70 | if self.weight_standardization:
71 | self.standardize_weights()
72 |
73 | for i, layer in enumerate(self.model):
74 | x = layer(x, adj)
75 | if perturb is not None and i==0:
76 | x += perturb
77 | return x
78 |
79 | def reset_parameters(self):
80 | for m in self.model:
81 | m.reset_parameters()
82 |
83 | def standardize_weights(self):
84 | skipped_first_conv = False
85 | for m in self.model.modules():
86 | if isinstance(m, self.gcn_module):
87 | if not skipped_first_conv:
88 | skipped_first_conv = True
89 | continue
90 | weight = m.lin.weight.data
91 | var, mean = torch.var_mean(weight, dim=1, keepdim=True)
92 | weight = (weight - mean) / (torch.sqrt(var + 1e-5))
93 | m.lin.weight.data = weight
--------------------------------------------------------------------------------
/figs/overall_architecure.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yeonjun-in/torch-SP-AGCL/e829366bfaec5306ab7b436d5c2d6feba194e47b/figs/overall_architecure.jpg
--------------------------------------------------------------------------------
/figs/overall_architecure.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/yeonjun-in/torch-SP-AGCL/e829366bfaec5306ab7b436d5c2d6feba194e47b/figs/overall_architecure.pdf
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import argparse
3 | from utils import set_everything
4 | import warnings
5 | warnings.filterwarnings("ignore")
6 |
7 | def parse_args():
8 | # input arguments
9 | set_everything(1995)
10 | parser = argparse.ArgumentParser()
11 |
12 | parser.add_argument('--embedder', default='SPAGCL_node')
13 | parser.add_argument('--dataset', default='cora', choices=['cora', 'citeseer', 'pubmed', 'photo', 'computers', 'cs', 'physics', 'chameleon', 'squirrel', 'actor', 'texas', 'wisconsin', 'cornell'])
14 | parser.add_argument('--task', default='node', choices=['clustering', 'node', 'link'])
15 | parser.add_argument('--attack', type=str, default='meta', choices=['meta', 'nettack', 'random', 'feat_gau', 'feat_bern'])
16 | parser.add_argument('--attack_type', type=str, default='poison', choices=['poison', 'evasive'])
17 | if parser.parse_known_args()[0].attack_type in ['poison']:
18 | parser.add_argument('--ptb_rate', type=float, default=0.0)
19 |
20 | parser.add_argument('--seed_n', default=3, type=int)
21 | parser.add_argument('--epochs', type=int, default=1000)
22 | parser.add_argument("--layers", nargs='*', type=int, default=[512, 128], help="The number of units of each layer of the GNN. Default is [256]")
23 |
24 | parser.add_argument('--lr', type = float, default = 0.001)
25 | parser.add_argument('--wd', type=float, default=1e-5)
26 |
27 | parser.add_argument("--save_embed", action='store_true', default=False)
28 |
29 | parser.add_argument('--lambda_1', type=float, default=2.0)
30 | parser.add_argument('--lambda_2', type=float, default=2.0)
31 |
32 | parser.add_argument('--d_1', type=float, default=0.3)
33 | parser.add_argument('--d_2', type=float, default=0.2)
34 | parser.add_argument('--d_3', type=float, default=0.0)
35 | parser.add_argument("--bn", action='store_false', default=True)
36 | parser.add_argument('--warmup', type=int, default=0)
37 | parser.add_argument('--sub_size', type=int, default=5000)
38 | parser.add_argument('--add_edge_rate', type=float, default=0.3)
39 | parser.add_argument('--drop_feat_rate', type=float, default=0.3)
40 | parser.add_argument('--knn', type=int, default=10)
41 |
42 | parser.add_argument('--tau', type=float, default=0.4)
43 | parser.add_argument('--device', type=int, default=0)
44 | parser.add_argument('--patience', type=int, default=400)
45 | parser.add_argument('--verbose', type=int, default=10)
46 |
47 | parser.add_argument('--save_dir', type=str, default='./results')
48 | parser.add_argument('--save_fig', action='store_true', default=True)
49 |
50 | return parser.parse_known_args()
51 |
52 |
53 | def main():
54 | args, _ = parse_args()
55 | args.drop_edge_rate = args.add_edge_rate
56 |
57 | if args.attack_type == 'evasive':
58 | args.ptb_rate = 0.0
59 | assert not (args.attack_type == 'poison' and args.ptb_rate == 0.0)
60 | if '_link' in args.embedder:
61 | args.task = 'link'
62 |
63 | torch.cuda.set_device(args.device)
64 |
65 | if args.embedder == 'SPAGCL_node':
66 | from models import SPAGCL_node
67 | embedder = SPAGCL_node(args)
68 | if args.embedder == 'SPAGCL_link':
69 | from models import SPAGCL_link
70 | embedder = SPAGCL_link(args)
71 |
72 | embedder.training()
73 |
74 | if __name__ == '__main__':
75 | main()
76 |
--------------------------------------------------------------------------------
/models/SPAGCL_link.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch
3 | from torch.optim import AdamW
4 | from embedder import embedder
5 | from encoder import GCN, GCNLayer
6 | from utils import get_graph_drop_transform, set_everything, dense_to_sparse_x, to_dense_subadj, get_data
7 | from copy import deepcopy
8 | from collections import defaultdict
9 | from torch_geometric.utils import to_undirected, to_dense_adj, dense_to_sparse, subgraph, add_self_loops, coalesce
10 | import torch.nn.functional as F
11 | from torch_geometric.data import Data
12 | import networkx as nx
13 |
14 | from utils.utils import to_dense_subadj
15 |
16 | class SPAGCL_link(embedder):
17 | def __init__(self, args):
18 | embedder.__init__(self, args)
19 | self.args = args
20 |
21 | def attack_adj(self, x1, x2, n_edge):
22 | n_nodes = len(x1.x)
23 | add_edge_num = int(self.args.add_edge_rate * n_edge)
24 | drop_edge_num = int(self.args.drop_edge_rate * n_edge)
25 | grad_sum = x1.edge_adj.grad + x2.edge_adj.grad
26 | grad_sum_1d = grad_sum.view(-1)
27 | values, indices = grad_sum_1d.sort()
28 | add_idx, drop_idx = indices[-add_edge_num:], indices[:drop_edge_num]
29 |
30 | add_idx_dense = torch.stack([add_idx // n_nodes, add_idx % n_nodes])
31 | drop_idx_dense = torch.stack([drop_idx // n_nodes, drop_idx % n_nodes])
32 |
33 | add = to_dense_adj(add_idx_dense, max_num_nodes=n_nodes)[0]
34 | drop = to_dense_adj(drop_idx_dense, max_num_nodes=n_nodes)[0]
35 | return add, 1-drop
36 |
37 | def attack_feat(self, x1, x2):
38 | n_nodes, n_dim = x1.x.size()
39 | drop_feat_num = int((n_dim * self.args.drop_feat_rate) * n_nodes)
40 | grad_sum = x1.x.grad + x2.x.grad
41 | grad_sum_1d = grad_sum.view(-1)
42 | values, indices = grad_sum_1d.sort()
43 |
44 | drop_idx = indices[:drop_feat_num]
45 |
46 | drop_idx_dense = torch.stack([drop_idx // n_dim, drop_idx % n_dim])
47 |
48 | drop_sparse = dense_to_sparse_x(drop_idx_dense, n_nodes, n_dim)
49 | return 1-drop_sparse.to_dense()
50 |
51 | def training(self):
52 |
53 | self.train_result, self.val_result, self.test_result = defaultdict(list), defaultdict(list), defaultdict(list)
54 | self.best_epochs = []
55 | for seed in range(self.args.seed_n):
56 | self.seed = seed
57 | set_everything(seed)
58 |
59 | data = self.data.clone()
60 |
61 | link_data = torch.load(f'dataset/link/{self.args.dataset}_link.pt')
62 | train_pos, train_neg, train_label = link_data['train_edges'].T, link_data['train_edges_neg'].T, link_data['train_label'].T
63 | test_pos, test_neg, test_label = link_data['test_edges'].T, link_data['test_edges_neg'].T, link_data['test_label'].T
64 |
65 | edge_index = train_pos.clone()
66 | train_edge = torch.cat((train_pos, train_neg), dim=1)
67 | test_edge = torch.cat((test_pos, test_neg), dim=1)
68 |
69 | if self.args.ptb_rate > 0.0:
70 | clean_data = get_data(self.data_home, self.args.dataset, self.args.attack, 0.0)[0]
71 | clean_adj = to_dense_adj(clean_data.edge_index)[0]
72 | ptb_adj = to_dense_adj(self.data.edge_index)[0]
73 | ptb_edge = (ptb_adj > clean_adj).nonzero().T
74 |
75 | edge_index = coalesce(torch.cat((edge_index, ptb_edge), dim=1))
76 | train_edge = torch.cat((train_edge, ptb_edge), dim=1)
77 | train_label = torch.cat((train_label, torch.ones(ptb_edge.size(1))))
78 |
79 | data.edge_index = edge_index
80 | data.train_edge_index, data.val_edge_index, data.test_edge_index = train_edge, test_edge, test_edge
81 | data.train_label, data.val_label, data.test_label = train_label, test_label, test_label
82 |
83 | knn_data = Data()
84 | sim = F.normalize(data.x).mm(F.normalize(data.x).T).fill_diagonal_(0.0)
85 | dst = sim.topk(self.args.knn, 1)[1]
86 | src = torch.arange(data.x.size(0)).unsqueeze(1).expand_as(dst)
87 | edge_index = torch.stack([src.reshape(-1), dst.reshape(-1)])
88 | edge_index = to_undirected(edge_index)
89 | knn_data.x = deepcopy(data.x)
90 | knn_data.edge_index = edge_index
91 | data = data.cuda()
92 | knn_data = knn_data.cuda()
93 |
94 | data.edge_adj = to_dense_adj(data.edge_index, max_num_nodes=data.x.size(0))[0].to_sparse()
95 |
96 | transform_1 = get_graph_drop_transform(drop_edge_p=self.args.d_1, drop_feat_p=self.args.d_1)
97 | transform_2 = get_graph_drop_transform(drop_edge_p=self.args.d_2, drop_feat_p=self.args.d_2)
98 | transform_3 = get_graph_drop_transform(drop_edge_p=self.args.d_3, drop_feat_p=self.args.d_3)
99 |
100 | self.encoder = GCN(GCNLayer, [self.args.in_dim] + self.args.layers, batchnorm=self.args.bn) # 512, 256, 128
101 | self.model = modeler(self.encoder, self.args.layers[-1], self.args.layers[-1], self.args.tau).cuda()
102 | self.optimizer = AdamW(self.model.parameters(), lr=self.args.lr, weight_decay=self.args.wd)
103 |
104 | best, best_epochs, cnt_wait = 0, 0, 0
105 | for epoch in range(1, self.args.epochs+1):
106 |
107 | sub1, sub2 = self.subgraph_sampling(data, knn_data)
108 | self.model.train()
109 | self.optimizer.zero_grad()
110 |
111 | x1, x2, x_knn, x_adv = transform_1(sub1), transform_2(sub1), transform_3(sub2), deepcopy(sub1)
112 | x1.edge_adj, x2.edge_adj = to_dense_subadj(x1.edge_index, self.sample_size), to_dense_subadj(x2.edge_index, self.sample_size)
113 | x_knn.edge_adj, x_adv.edge_adj = to_dense_subadj(x_knn.edge_index, self.sample_size), to_dense_subadj(x_adv.edge_index, self.sample_size)
114 |
115 | if epoch > self.args.warmup:
116 | x1.edge_adj = x1.edge_adj.requires_grad_()
117 | x2.edge_adj = x2.edge_adj.requires_grad_()
118 |
119 | x1.x = x1.x.requires_grad_()
120 | x2.x = x2.x.requires_grad_()
121 |
122 | z1 = self.model(x1.x, x1.edge_adj)
123 | z2 = self.model(x2.x, x2.edge_adj)
124 | loss = self.model.loss(z1, z2, batch_size=0)
125 |
126 | loss.backward()
127 |
128 | if epoch > self.args.warmup:
129 | n_edge = int(x1.edge_adj.sum().item())
130 | add_edge, masking_edge = self.attack_adj(x1, x2, n_edge=n_edge)
131 | masking_feat = self.attack_feat(x1, x2)
132 |
133 | x1.edge_adj, x2.edge_adj = x1.edge_adj.detach(), x2.edge_adj.detach()
134 | x_adv.edge_adj = ((x1.edge_adj*masking_edge) + add_edge*1.0).clamp(0, 1).detach()
135 |
136 | x1.x, x2.x = x1.x.detach(), x2.x.detach()
137 | x_adv.x = (x1.x*masking_feat).detach()
138 |
139 | x_knn.x = x_knn.x.detach()
140 | x_knn.edge_adj = x_knn.edge_adj.detach()
141 |
142 | z1 = self.model(x1.x, x1.edge_adj.to_sparse())
143 | z2 = self.model(x2.x, x2.edge_adj.to_sparse())
144 | z_adv = self.model(x_adv.x, x_adv.edge_adj.to_sparse())
145 | z_knn = self.model(x_knn.x, x_knn.edge_adj.to_sparse())
146 |
147 | loss = self.model.loss(z1, z2, batch_size=0)*0.5
148 | loss += self.model.loss(z1, z_knn, batch_size=0)
149 | loss += self.model.loss(z1, z_adv, batch_size=0)
150 |
151 | self.optimizer.zero_grad()
152 | loss.backward()
153 |
154 | self.optimizer.step()
155 |
156 | print(f'Epoch {epoch}: Loss {loss.item()}')
157 |
158 | if epoch % self.args.verbose == 0:
159 | _, val_acc, _ = self.verbose_link(data)
160 | if val_acc > best:
161 | best = val_acc
162 | cnt_wait = 0
163 | best_epochs = epoch
164 | torch.save(self.model.online_encoder.state_dict(), '{}/saved_model/best_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, self.args.attack_type, self.args.embedder, seed))
165 | else:
166 | cnt_wait += self.args.verbose
167 |
168 | if cnt_wait == self.args.patience:
169 | print('Early stopping!')
170 | break
171 |
172 | self.best_epochs.append(best_epochs)
173 | self.model.online_encoder.load_state_dict(torch.load('{}/saved_model/best_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, self.args.attack_type, self.args.embedder, seed), map_location=f'cuda:{self.args.device}'))
174 | only_clean = self.args.dataset in ['photo', 'computers', 'cs', 'physics', 'amz', 'amz2', 'squirrel', 'chameleon']
175 | self.eval_link(data)
176 |
177 | self.summary_result()
178 |
179 | def subgraph_sampling(self, data1, data2):
180 | self.sample_size = min(self.args.sub_size, self.args.n_node)
181 | nodes = torch.randperm(data1.x.size(0))[:self.sample_size].sort()[0]
182 | edge1, edge2 = add_self_loops(data1.edge_index, num_nodes=data1.x.size(0))[0], add_self_loops(data2.edge_index, num_nodes=data1.x.size(0))[0]
183 | edge1 = subgraph(subset=nodes, edge_index=edge1, relabel_nodes=True)[0]
184 | edge2 = subgraph(subset=nodes, edge_index=edge2, relabel_nodes=True)[0]
185 |
186 | tmp1, tmp2 = Data(), Data()
187 | tmp1.x, tmp2.x = data1.x[nodes], data2.x[nodes]
188 | tmp1.edge_index, tmp2.edge_index = edge1, edge2
189 |
190 | return tmp1, tmp2
191 |
192 | class modeler(torch.nn.Module):
193 | def __init__(self, encoder, num_hidden: int, num_proj_hidden: int,
194 | tau: float = 0.5):
195 | super(modeler, self).__init__()
196 | self.online_encoder = encoder
197 | self.tau: float = tau
198 |
199 | self.fc1 = torch.nn.Linear(num_hidden, num_proj_hidden)
200 | self.fc2 = torch.nn.Linear(num_proj_hidden, num_hidden)
201 |
202 | def forward(self, x: torch.Tensor,
203 | edge_index: torch.Tensor) -> torch.Tensor:
204 | return self.online_encoder(x, edge_index)
205 |
206 | def projection(self, z: torch.Tensor) -> torch.Tensor:
207 | z = F.elu(self.fc1(z))
208 | return self.fc2(z)
209 |
210 | def sim(self, z1: torch.Tensor, z2: torch.Tensor):
211 | z1 = F.normalize(z1)
212 | z2 = F.normalize(z2)
213 | return torch.mm(z1, z2.t())
214 |
215 | def semi_loss(self, z1: torch.Tensor, z2: torch.Tensor):
216 | f = lambda x: torch.exp(x / self.tau)
217 | refl_sim = f(self.sim(z1, z1))
218 | between_sim = f(self.sim(z1, z2))
219 |
220 | return -torch.log(
221 | between_sim.diag()
222 | / (refl_sim.sum(1) + between_sim.sum(1) - refl_sim.diag()))
223 |
224 | def batched_semi_loss(self, z1: torch.Tensor, z2: torch.Tensor,
225 | batch_size: int):
226 | # Space complexity: O(BN) (semi_loss: O(N^2))
227 | device = z1.device
228 | num_nodes = z1.size(0)
229 | num_batches = (num_nodes - 1) // batch_size + 1
230 | f = lambda x: torch.exp(x / self.tau)
231 | indices = torch.arange(0, num_nodes).to(device)
232 | losses = []
233 |
234 | for i in range(num_batches):
235 | mask = indices[i * batch_size:(i + 1) * batch_size]
236 | refl_sim = f(self.sim(z1[mask], z1)) # [B, N]
237 | between_sim = f(self.sim(z1[mask], z2)) # [B, N]
238 |
239 | losses.append(-torch.log(
240 | between_sim[:, i * batch_size:(i + 1) * batch_size].diag()
241 | / (refl_sim.sum(1) + between_sim.sum(1)
242 | - refl_sim[:, i * batch_size:(i + 1) * batch_size].diag())))
243 |
244 | return torch.cat(losses)
245 |
246 | def loss(self, z1: torch.Tensor, z2: torch.Tensor,
247 | mean: bool = True, batch_size: int = 0):
248 | h1 = self.projection(z1)
249 | h2 = self.projection(z2)
250 |
251 | if batch_size == 0:
252 | l1 = self.semi_loss(h1, h2)
253 | l2 = self.semi_loss(h2, h1)
254 | else:
255 | l1 = self.batched_semi_loss(h1, h2, batch_size)
256 | l2 = self.batched_semi_loss(h2, h1, batch_size)
257 |
258 | ret = (l1 + l2) * 0.5
259 | ret = ret.mean() if mean else ret.sum()
260 |
261 | return ret
262 |
--------------------------------------------------------------------------------
/models/SPAGCL_node.py:
--------------------------------------------------------------------------------
1 | import torch
2 | from torch.optim import AdamW
3 | from embedder import embedder
4 | from encoder import GCN, GCNLayer
5 | from utils import get_graph_drop_transform, set_everything, dense_to_sparse_x, to_dense_subadj
6 | from copy import deepcopy
7 | from collections import defaultdict
8 | from torch_geometric.utils import to_undirected, to_dense_adj, dense_to_sparse, subgraph, add_self_loops
9 | import torch.nn.functional as F
10 | from torch_geometric.data import Data
11 | from utils.utils import to_dense_subadj
12 |
13 | class SPAGCL_node(embedder):
14 | def __init__(self, args):
15 | embedder.__init__(self, args)
16 | self.args = args
17 |
18 | def attack_adj(self, x1, x2, n_edge):
19 | n_nodes = len(x1.x)
20 | add_edge_num = int(self.args.add_edge_rate * n_edge)
21 | drop_edge_num = int(self.args.drop_edge_rate * n_edge)
22 | grad_sum = x1.edge_adj.grad + x2.edge_adj.grad
23 | grad_sum_1d = grad_sum.view(-1)
24 | values, indices = grad_sum_1d.sort()
25 | add_idx, drop_idx = indices[-add_edge_num:], indices[:drop_edge_num]
26 |
27 | add_idx_dense = torch.stack([add_idx // n_nodes, add_idx % n_nodes])
28 | drop_idx_dense = torch.stack([drop_idx // n_nodes, drop_idx % n_nodes])
29 |
30 | add = to_dense_adj(add_idx_dense, max_num_nodes=n_nodes)[0]
31 | drop = to_dense_adj(drop_idx_dense, max_num_nodes=n_nodes)[0]
32 | return add, 1-drop
33 |
34 | def attack_feat(self, x1, x2):
35 | n_nodes, n_dim = x1.x.size()
36 | drop_feat_num = int((n_dim * self.args.drop_feat_rate) * n_nodes)
37 | grad_sum = x1.x.grad + x2.x.grad
38 | grad_sum_1d = grad_sum.view(-1)
39 | values, indices = grad_sum_1d.sort()
40 |
41 | drop_idx = indices[:drop_feat_num]
42 |
43 | drop_idx_dense = torch.stack([drop_idx // n_dim, drop_idx % n_dim])
44 |
45 | drop_sparse = dense_to_sparse_x(drop_idx_dense, n_nodes, n_dim)
46 | return 1-drop_sparse.to_dense()
47 |
48 | def subgraph_sampling(self, data1, data2):
49 | self.sample_size = min(self.args.sub_size, self.args.n_node)
50 | nodes = torch.randperm(data1.x.size(0))[:self.sample_size].sort()[0]
51 | edge1, edge2 = add_self_loops(data1.edge_index, num_nodes=data1.x.size(0))[0], add_self_loops(data2.edge_index, num_nodes=data1.x.size(0))[0]
52 | edge1 = subgraph(subset=nodes, edge_index=edge1, relabel_nodes=True)[0]
53 | edge2 = subgraph(subset=nodes, edge_index=edge2, relabel_nodes=True)[0]
54 |
55 | tmp1, tmp2 = Data(), Data()
56 | tmp1.x, tmp2.x = data1.x[nodes], data2.x[nodes]
57 | tmp1.edge_index, tmp2.edge_index = edge1, edge2
58 |
59 | return tmp1, tmp2
60 |
61 | def training(self):
62 |
63 | self.train_result, self.val_result, self.test_result = defaultdict(list), defaultdict(list), defaultdict(list)
64 | for seed in range(self.args.seed_n):
65 | self.seed = seed
66 | set_everything(seed)
67 |
68 | data = self.data.clone()
69 |
70 | knn_data = Data()
71 | sim = F.normalize(data.x).mm(F.normalize(data.x).T).fill_diagonal_(0.0)
72 | dst = sim.topk(self.args.knn, 1)[1]
73 | src = torch.arange(data.x.size(0)).unsqueeze(1).expand_as(dst)
74 | edge_index = torch.stack([src.reshape(-1), dst.reshape(-1)])
75 | edge_index = to_undirected(edge_index)
76 | knn_data.x = deepcopy(data.x)
77 | knn_data.edge_index = edge_index
78 | data = data.cuda()
79 | knn_data = knn_data.cuda()
80 |
81 | data.edge_adj = to_dense_adj(data.edge_index, max_num_nodes=data.x.size(0))[0].to_sparse()
82 |
83 | transform_1 = get_graph_drop_transform(drop_edge_p=self.args.d_1, drop_feat_p=self.args.d_1)
84 | transform_2 = get_graph_drop_transform(drop_edge_p=self.args.d_2, drop_feat_p=self.args.d_2)
85 | transform_3 = get_graph_drop_transform(drop_edge_p=self.args.d_3, drop_feat_p=self.args.d_3)
86 |
87 | self.encoder = GCN(GCNLayer, [self.args.in_dim] + self.args.layers, batchnorm=self.args.bn) # 512, 256, 128
88 | self.model = modeler(self.encoder, self.args.layers[-1], self.args.layers[-1], self.args.tau).cuda()
89 | self.optimizer = AdamW(self.model.parameters(), lr=self.args.lr, weight_decay=self.args.wd)
90 |
91 | best, cnt_wait = 0, 0
92 | for epoch in range(1, self.args.epochs+1):
93 |
94 | sub1, sub2 = self.subgraph_sampling(data, knn_data)
95 | self.model.train()
96 | self.optimizer.zero_grad()
97 |
98 | x1, x2, x_knn, x_adv = transform_1(sub1), transform_2(sub1), transform_3(sub2), deepcopy(sub1)
99 | x1.edge_adj, x2.edge_adj = to_dense_subadj(x1.edge_index, self.sample_size), to_dense_subadj(x2.edge_index, self.sample_size)
100 | x_knn.edge_adj, x_adv.edge_adj = to_dense_subadj(x_knn.edge_index, self.sample_size), to_dense_subadj(x_adv.edge_index, self.sample_size)
101 |
102 | if epoch > self.args.warmup:
103 | x1.edge_adj = x1.edge_adj.requires_grad_()
104 | x2.edge_adj = x2.edge_adj.requires_grad_()
105 |
106 | x1.x = x1.x.requires_grad_()
107 | x2.x = x2.x.requires_grad_()
108 |
109 | z1 = self.model(x1.x, x1.edge_adj)
110 | z2 = self.model(x2.x, x2.edge_adj)
111 | loss = self.model.loss(z1, z2, batch_size=0)
112 |
113 | loss.backward()
114 |
115 | if epoch > self.args.warmup:
116 | n_edge = int(x1.edge_adj.sum().item())
117 | add_edge, masking_edge = self.attack_adj(x1, x2, n_edge=n_edge)
118 | masking_feat = self.attack_feat(x1, x2)
119 |
120 | x1.edge_adj, x2.edge_adj = x1.edge_adj.detach(), x2.edge_adj.detach()
121 | x_adv.edge_adj = ((x1.edge_adj*masking_edge) + add_edge*1.0).clamp(0, 1).detach()
122 |
123 | x1.x, x2.x = x1.x.detach(), x2.x.detach()
124 | x_adv.x = (x1.x*masking_feat).detach()
125 |
126 | x_knn.x = x_knn.x.detach()
127 | x_knn.edge_adj = x_knn.edge_adj.detach()
128 |
129 | z1 = self.model(x1.x, x1.edge_adj.to_sparse())
130 | z2 = self.model(x2.x, x2.edge_adj.to_sparse())
131 | z_adv = self.model(x_adv.x, x_adv.edge_adj.to_sparse())
132 | z_knn = self.model(x_knn.x, x_knn.edge_adj.to_sparse())
133 | loss = self.model.loss(z1, z2, batch_size=0)*0.5
134 | loss += self.model.loss(z1, z_adv, batch_size=0)*self.args.lambda_1*0.5
135 | loss += self.model.loss(z1, z_knn, batch_size=0)*self.args.lambda_2*0.5
136 | # print(self.args.lambda_1*0.5, self.args.lambda_2*0.5)
137 | self.optimizer.zero_grad()
138 | loss.backward()
139 |
140 | self.optimizer.step()
141 |
142 | print(f'Epoch {epoch}: Loss {loss.item()}')
143 |
144 | if epoch % self.args.verbose == 0:
145 | val_acc = self.verbose(data)
146 | if val_acc > best:
147 | best = val_acc
148 | cnt_wait = 0
149 | torch.save(self.model.online_encoder.state_dict(), '{}/saved_model/best_{}_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, self.args.ptb_rate, self.args.attack_type, self.args.embedder, seed))
150 | else:
151 | cnt_wait += self.args.verbose
152 |
153 | if cnt_wait == self.args.patience:
154 | print('Early stopping!')
155 | break
156 |
157 | self.model.online_encoder.load_state_dict(torch.load('{}/saved_model/best_{}_{}_{}_{}_{}_seed{}.pkl'.format(self.args.save_dir, self.args.dataset, self.args.attack, self.args.ptb_rate, self.args.attack_type, self.args.embedder, seed), map_location=f'cuda:{self.args.device}'))
158 | if self.args.save_embed:
159 | self.get_embeddings(data)
160 | only_clean = self.args.dataset in ['squirrel', 'chameleon', 'texas', 'wisconsin', 'cornell', 'actor']
161 | if self.args.task == 'node':
162 | if self.args.attack_type == 'evasive':
163 | self.eval_clean_and_evasive(data, only_clean)
164 | elif self.args.attack_type == 'poison':
165 | self.eval_poisoning(data)
166 |
167 | self.summary_result()
168 |
169 |
170 | class modeler(torch.nn.Module):
171 | def __init__(self, encoder, num_hidden: int, num_proj_hidden: int,
172 | tau: float = 0.5):
173 | super(modeler, self).__init__()
174 | self.online_encoder = encoder
175 | self.tau: float = tau
176 |
177 | self.fc1 = torch.nn.Linear(num_hidden, num_proj_hidden)
178 | self.fc2 = torch.nn.Linear(num_proj_hidden, num_hidden)
179 |
180 | def forward(self, x: torch.Tensor,
181 | edge_index: torch.Tensor) -> torch.Tensor:
182 | return self.online_encoder(x, edge_index)
183 |
184 | def projection(self, z: torch.Tensor) -> torch.Tensor:
185 | z = F.elu(self.fc1(z))
186 | return self.fc2(z)
187 |
188 | def sim(self, z1: torch.Tensor, z2: torch.Tensor):
189 | z1 = F.normalize(z1)
190 | z2 = F.normalize(z2)
191 | return torch.mm(z1, z2.t())
192 |
193 | def semi_loss(self, z1: torch.Tensor, z2: torch.Tensor):
194 | f = lambda x: torch.exp(x / self.tau)
195 | refl_sim = f(self.sim(z1, z1))
196 | between_sim = f(self.sim(z1, z2))
197 |
198 | return -torch.log(
199 | between_sim.diag()
200 | / (refl_sim.sum(1) + between_sim.sum(1) - refl_sim.diag()))
201 |
202 | def batched_semi_loss(self, z1: torch.Tensor, z2: torch.Tensor,
203 | batch_size: int):
204 | # Space complexity: O(BN) (semi_loss: O(N^2))
205 | device = z1.device
206 | num_nodes = z1.size(0)
207 | num_batches = (num_nodes - 1) // batch_size + 1
208 | f = lambda x: torch.exp(x / self.tau)
209 | indices = torch.arange(0, num_nodes).to(device)
210 | losses = []
211 |
212 | for i in range(num_batches):
213 | mask = indices[i * batch_size:(i + 1) * batch_size]
214 | refl_sim = f(self.sim(z1[mask], z1)) # [B, N]
215 | between_sim = f(self.sim(z1[mask], z2)) # [B, N]
216 |
217 | losses.append(-torch.log(
218 | between_sim[:, i * batch_size:(i + 1) * batch_size].diag()
219 | / (refl_sim.sum(1) + between_sim.sum(1)
220 | - refl_sim[:, i * batch_size:(i + 1) * batch_size].diag())))
221 |
222 | return torch.cat(losses)
223 |
224 | def loss(self, z1: torch.Tensor, z2: torch.Tensor,
225 | mean: bool = True, batch_size: int = 0):
226 | h1 = self.projection(z1)
227 | h2 = self.projection(z2)
228 |
229 | if batch_size == 0:
230 | l1 = self.semi_loss(h1, h2)
231 | l2 = self.semi_loss(h2, h1)
232 | else:
233 | l1 = self.batched_semi_loss(h1, h2, batch_size)
234 | l2 = self.batched_semi_loss(h2, h1, batch_size)
235 |
236 | ret = (l1 + l2) * 0.5
237 | ret = ret.mean() if mean else ret.sum()
238 |
239 | return ret
240 |
--------------------------------------------------------------------------------
/models/__init__.py:
--------------------------------------------------------------------------------
1 | from .SPAGCL_node import SPAGCL_node
2 | from .SPAGCL_link import SPAGCL_link
3 |
--------------------------------------------------------------------------------
/sh/clustering.sh:
--------------------------------------------------------------------------------
1 | for embedder in SPAGCL_node
2 | do
3 | for dataset in pubmed cs
4 | do
5 | for ptb_rate in 0.0 0.05 0.1 0.15 0.2 0.25
6 | do
7 | python clustering.py --embedder $embedder --dataset $dataset --task clustering --attack meta --ptb_rate $ptb_rate
8 | done
9 | done
10 | done
11 |
--------------------------------------------------------------------------------
/sh/hetero_node.sh:
--------------------------------------------------------------------------------
1 | ########## general ############
2 | device=0
3 | seed_n=10
4 | epochs=1000
5 | embedder=SPAGCL_node
6 |
7 | ## chameleon
8 | attack=random
9 | dataset=chameleon
10 | n_subgraph=3000
11 | lr=0.01
12 | wd=0.00001
13 | d_1=0.3
14 | d_2=0.2
15 | d_3=0.0
16 | add_edge_rate=0.3
17 | drop_feat_rate=0.1
18 | knn=10
19 | tau=0.4
20 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
21 |
22 |
23 | ## squirrel
24 | attack=random
25 | dataset=squirrel
26 | n_subgraph=3000
27 | lr=0.01
28 | wd=0.00001
29 | d_1=0.3
30 | d_2=0.2
31 | d_3=0.0
32 | add_edge_rate=0.1
33 | drop_feat_rate=0.3
34 | knn=10
35 | tau=0.4
36 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
37 |
38 |
39 | ## actor
40 | attack=random
41 | dataset=actor
42 | n_subgraph=3000
43 | lr=0.01
44 | wd=0.00001
45 | d_1=0.1
46 | d_2=0.1
47 | d_3=0.0
48 | add_edge_rate=0.3
49 | drop_feat_rate=0.3
50 | knn=10
51 | tau=0.4
52 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
53 |
54 | ## texas
55 | attack=random
56 | dataset=texas
57 | n_subgraph=3000
58 | lr=0.05
59 | wd=0.00001
60 | d_1=0.5
61 | d_2=0.5
62 | d_3=0.0
63 | add_edge_rate=0.7
64 | drop_feat_rate=0.9
65 | knn=10
66 | tau=0.4
67 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
68 |
69 |
70 | ## cornell
71 | attack=random
72 | dataset=cornell
73 | n_subgraph=3000
74 | lr=0.05
75 | wd=0.00001
76 | d_1=0.4
77 | d_2=0.3
78 | d_3=0.0
79 | add_edge_rate=0.7
80 | drop_feat_rate=0.5
81 | knn=10
82 | tau=0.4
83 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
84 |
85 | ## wisconsin
86 | attack=random
87 | dataset=wisconsin
88 | n_subgraph=3000
89 | lr=0.05
90 | wd=0.00001
91 | d_1=0.2
92 | d_2=0.4
93 | d_3=0.0
94 | add_edge_rate=0.7
95 | drop_feat_rate=0.0
96 | knn=10
97 | tau=0.2
98 | python main.py --embedder $embedder --task node --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
99 |
--------------------------------------------------------------------------------
/sh/link.sh:
--------------------------------------------------------------------------------
1 | ########## general ############
2 | device=1
3 | seed_n=3
4 | epochs=1000
5 | embedder=SPAGCL_link
6 | verbose=100
7 |
8 | ########## cora ##############
9 | dataset=cora
10 | attack=meta
11 | n_subgraph=3000
12 | lr=0.005
13 | wd=0.01
14 | d_1=0.2
15 | d_2=0.3
16 | d_3=0.0
17 | add_edge_rate=0.5
18 | drop_feat_rate=0.7
19 | knn=10
20 | tau=0.4
21 | python main.py --embedder $embedder --dataset $dataset --task link --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
22 | for p in 0.05 0.1 0.15 0.2 0.25
23 | do
24 | python main.py --embedder $embedder --dataset $dataset --task link --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
25 | done
26 |
27 | ########## citeseer ##############
28 | dataset=citeseer
29 | attack=meta
30 | n_subgraph=3000
31 | lr=0.01
32 | wd=0.00001
33 | d_1=0.2
34 | d_2=0.1
35 | d_3=0.0
36 | add_edge_rate=0.1
37 | drop_feat_rate=0.9
38 | knn=10
39 | tau=0.6
40 | python main.py --embedder $embedder --dataset $dataset --task link --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
41 | for p in 0.05 0.1 0.15 0.2 0.25
42 | do
43 | python main.py --embedder $embedder --dataset $dataset --task link --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
44 | done
45 |
--------------------------------------------------------------------------------
/sh/metattack_maker.sh:
--------------------------------------------------------------------------------
1 | # photo
2 | dataset=photo
3 | sub_size=3000
4 | for ((i=1; i<=100; i++))
5 | do
6 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 6 --sub_size $sub_size --ptb_n 300 --seed $i
7 | done
8 |
9 | # computers
10 | dataset=computers
11 | sub_size=3000
12 | for ((i=91; i<=200; i++))
13 | do
14 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 7 --sub_size $sub_size --ptb_n 300 --seed $i
15 | done
16 |
17 | # cs
18 | dataset=cs
19 | sub_size=3000
20 | for ((i=1; i<=100; i++))
21 | do
22 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 7 --sub_size $sub_size --ptb_n 200 --seed $i
23 | done
24 |
25 | # physics
26 | dataset=physics
27 | sub_size=3000
28 | for ((i=81; i<=300; i++))
29 | do
30 | python ./attack_maker/generate_metattack.py --dataset $dataset --device 6 --sub_size $sub_size --ptb_n 200 --seed $i
31 | done
32 |
--------------------------------------------------------------------------------
/sh/node.sh:
--------------------------------------------------------------------------------
1 | ########## general ############
2 | device=1
3 | seed_n=10
4 | epochs=1000
5 | embedder=SPAGCL_node
6 |
7 | ########## cora ##############
8 | dataset=cora
9 | attack=meta
10 | n_subgraph=3000
11 | lr=0.005
12 | wd=0.01
13 | d_1=0.2
14 | d_2=0.3
15 | d_3=0.0
16 | add_edge_rate=0.5
17 | drop_feat_rate=0.7
18 | knn=10
19 | tau=0.4
20 | l1=5.0
21 | l2=3.0
22 | python main.py --embedder $embedder --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2
23 |
24 | l1=5.0
25 | l2=3.0
26 | p=0.05
27 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2
28 |
29 | l1=4.0
30 | l2=4.0
31 | p=0.1
32 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2
33 |
34 | l1=4.0
35 | l2=2.0
36 | p=0.15
37 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2
38 |
39 | l1=0.1
40 | l2=5.0
41 | p=0.2
42 | add_edge_rate=0.5
43 | drop_feat_rate=0.7
44 | d_1=0.1
45 | d_2=0.1
46 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2
47 |
48 |
49 | l1=0.5
50 | l2=5.0
51 | p=0.25
52 | add_edge_rate=0.5
53 | drop_feat_rate=0.3
54 | d_1=0.1
55 | d_2=0.2
56 | lr=0.01
57 | tau=0.4
58 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node --lambda_1 $l1 --lambda_2 $l2
59 |
60 |
61 | dataset=cora
62 | attack=nettack
63 | n_subgraph=3000
64 | lr=0.01
65 | wd=0.0001
66 | d_1=0.1
67 | d_2=0.3
68 | d_3=0.0
69 | add_edge_rate=0.7
70 | drop_feat_rate=0.7
71 | knn=10
72 | tau=0.4
73 | python main.py --embedder $embedder --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node
74 | for p in 1 2 3 4 5
75 | do
76 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node
77 | done
78 |
79 | attack=random
80 | n_subgraph=3000
81 | lr=0.01
82 | wd=0.01
83 | d_1=0.2
84 | d_2=0.1
85 | d_3=0.0
86 | add_edge_rate=0.1
87 | drop_feat_rate=0.5
88 | knn=10
89 | tau=0.4
90 | python main.py --embedder $embedder --dataset $dataset --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node
91 | for p in 0.2 0.4 0.6 0.8 1.0
92 | do
93 | python main.py --embedder $embedder --dataset $dataset --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --task node
94 | done
95 |
96 |
97 | ########## citeseer ##############
98 | dataset=citeseer
99 | attack=meta
100 | n_subgraph=3000
101 | lr=0.01
102 | wd=0.00001
103 | d_1=0.2
104 | d_2=0.1
105 | d_3=0.0
106 | add_edge_rate=0.1
107 | drop_feat_rate=0.9
108 | knn=10
109 | tau=0.6
110 | l1=4.0
111 | l2=3.0
112 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
113 |
114 |
115 | l1=2.0
116 | l2=5.0
117 | p=0.05
118 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
119 |
120 | l1=2.0
121 | l2=2.0
122 | p=0.1
123 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
124 |
125 |
126 | l1=2.0
127 | l2=2.0
128 | p=0.15
129 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
130 |
131 |
132 | l1=4.0
133 | l2=5.0
134 | p=0.2
135 | add_edge_rate=0.7
136 | drop_feat_rate=0.9
137 | d_1=0.2
138 | d_2=0.1
139 | lr=0.01
140 | tau=0.6
141 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
142 |
143 |
144 | l1=2.0
145 | l2=5.0
146 | p=0.25
147 | add_edge_rate=0.9
148 | drop_feat_rate=0.7
149 | d_1=0.2
150 | d_2=0.1
151 | lr=0.01
152 | tau=0.8
153 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
154 |
155 |
156 | attack=nettack
157 | n_subgraph=3000
158 | lr=0.001
159 | wd=0.01
160 | d_1=0.1
161 | d_2=0.1
162 | d_3=0.0
163 | add_edge_rate=0.7
164 | drop_feat_rate=0.9
165 | knn=5
166 | tau=0.6
167 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
168 | for p in 1 2 3 4 5
169 | do
170 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
171 | done
172 |
173 | attack=random
174 | n_subgraph=3000
175 | lr=0.001
176 | wd=0.00001
177 | d_1=0.1
178 | d_2=0.4
179 | d_3=0.0
180 | add_edge_rate=0.5
181 | drop_feat_rate=0.7
182 | knn=5
183 | tau=0.8
184 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
185 | for p in 0.2 0.4 0.6 0.8 1.0
186 | do
187 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
188 | done
189 |
190 |
191 | ########## pubmed ##############
192 | dataset=pubmed
193 | attack=meta
194 | n_subgraph=1000
195 | lr=0.001
196 | wd=0.00001
197 | d_1=0.3
198 | d_2=0.2
199 | d_3=0.0
200 | add_edge_rate=0.5
201 | drop_feat_rate=0.7
202 | knn=10
203 | tau=0.4
204 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
205 | for p in 0.05 0.1 0.15 0.2 0.25
206 | do
207 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
208 | done
209 |
210 | attack=nettack
211 | n_subgraph=1000
212 | lr=0.005
213 | wd=0.01
214 | d_1=0.5
215 | d_2=0.5
216 | d_3=0.0
217 | add_edge_rate=0.5
218 | drop_feat_rate=0.0
219 | knn=10
220 | tau=0.8
221 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
222 | for p in 1 2 3 4 5
223 | do
224 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
225 | done
226 |
227 | attack=random
228 | n_subgraph=1000
229 | lr=0.001
230 | wd=0.00001
231 | d_1=0.4
232 | d_2=0.1
233 | d_3=0.0
234 | add_edge_rate=0.3
235 | drop_feat_rate=0.0
236 | knn=10
237 | tau=0.4
238 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
239 | for p in 0.2 0.4 0.6 0.8 1.0
240 | do
241 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
242 | done
243 |
244 |
245 | ########## photo ##############
246 | attack=meta
247 | dataset=photo
248 | n_subgraph=5000
249 | lr=0.01
250 | wd=0.00001
251 | d_1=0.3
252 | d_2=0.2
253 | d_3=0.0
254 | add_edge_rate=0.1
255 | drop_feat_rate=0.0
256 | knn=10
257 | tau=0.4
258 | seed_n=10
259 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
260 | for p in 0.05 0.1 0.15 0.2 0.25
261 | do
262 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
263 | done
264 |
265 |
266 | ########## computers ##############
267 | attack=meta
268 | dataset=computers
269 | n_subgraph=5000
270 | lr=0.01
271 | wd=0.01
272 | d_1=0.3
273 | d_2=0.2
274 | d_3=0.0
275 | add_edge_rate=0.1
276 | drop_feat_rate=0.0
277 | knn=10
278 | tau=0.4
279 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
280 | for p in 0.05 0.1 0.15 0.2 0.25
281 | do
282 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
283 | done
284 |
285 | ########## cs ##############
286 | attack=meta
287 | dataset=cs
288 | n_subgraph=5000
289 | lr=0.01
290 | wd=0.001
291 | d_1=0.3
292 | d_2=0.2
293 | d_3=0.0
294 | add_edge_rate=0.7
295 | drop_feat_rate=0.0
296 | knn=10
297 | tau=0.4
298 |
299 | l1=0.5
300 | l2=5.0
301 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
302 |
303 | p=0.05
304 | l1=0.5
305 | l2=2.0
306 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
307 |
308 | p=0.1
309 | l1=0.1
310 | l2=5.0
311 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
312 |
313 | p=0.15
314 | l1=0.5
315 | l2=5.0
316 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
317 |
318 | p=0.2
319 | l1=0.5
320 | l2=5.0
321 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
322 |
323 | p=0.25
324 | l1=0.5
325 | l2=5.0
326 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau --lambda_1 $l1 --lambda_2 $l2
327 |
328 | ########## physics ##############
329 | attack=meta
330 | dataset=physics
331 | n_subgraph=5000
332 | lr=0.01
333 | wd=0.01
334 | d_1=0.3
335 | d_2=0.2
336 | d_3=0.0
337 | add_edge_rate=0.1
338 | drop_feat_rate=0.0
339 | knn=10
340 | tau=0.4
341 | python main.py --embedder $embedder --dataset $dataset --task node --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
342 | for p in 0.05 0.1 0.15 0.2 0.25
343 | do
344 | python main.py --embedder $embedder --dataset $dataset --task node --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs $epochs --add_edge_rate $add_edge_rate --drop_feat_rate $drop_feat_rate --sub_size $n_subgraph --d_1 $d_1 --d_2 $d_2 --d_3 $d_3 --lr $lr --wd $wd --knn $knn --tau $tau
345 | done
346 |
--------------------------------------------------------------------------------
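The per-dataset/per-attack blocks above differ only in their hyperparameter values, and the cs block additionally overrides `--lambda_1`/`--lambda_2` per perturbation rate. As a minimal sketch (not part of the repository), the same sweep could be driven from Python: the flag names and perturbation-rate grids are copied verbatim from the script, while `CONFIGS`, `run()`, and the `SPAGCL_node` embedder default are illustrative assumptions, and `--device`/`--seed_n`/`--epochs` are omitted for brevity.

```python
# Hypothetical sweep driver mirroring the script above; CONFIGS and run()
# are illustrative and not part of the repository.
import subprocess

CONFIGS = {
    # (dataset, attack): values copied from the pubmed/meta block above
    ('pubmed', 'meta'): dict(sub_size=1000, lr=0.001, wd=0.00001,
                             d_1=0.3, d_2=0.2, d_3=0.0, add_edge_rate=0.5,
                             drop_feat_rate=0.7, knn=10, tau=0.4),
}
# Poisoning rates per attack, as used in the script's for-loops.
PTB_RATES = {'meta': [0.05, 0.1, 0.15, 0.2, 0.25],
             'nettack': [1, 2, 3, 4, 5],
             'random': [0.2, 0.4, 0.6, 0.8, 1.0]}

def run(dataset, attack, attack_type, cfg, ptb_rate=None):
    cmd = ['python', 'main.py', '--embedder', 'SPAGCL_node', '--task', 'node',
           '--dataset', dataset, '--attack', attack, '--attack_type', attack_type]
    if ptb_rate is not None:
        cmd += ['--ptb_rate', str(ptb_rate)]
    for k, v in cfg.items():
        cmd += [f'--{k}', str(v)]
    subprocess.run(cmd, check=True)

for (dataset, attack), cfg in CONFIGS.items():
    run(dataset, attack, 'evasive', cfg)            # one evasive run
    for p in PTB_RATES[attack]:                     # then the poison sweep
        run(dataset, attack, 'poison', cfg, ptb_rate=p)
```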
/sh/save_emb.sh:
--------------------------------------------------------------------------------
1 | seed_n=10
2 | attack=meta
3 | device=0
4 | for dataset in cora citeseer pubmed photo computers cs physics
5 | do
6 | for embedder in SPAGCL_node
7 | do
8 | python main.py --embedder $embedder --dataset $dataset --task node --save_embed --attack_type evasive --attack $attack --device $device --seed_n $seed_n --epochs 0
9 | for p in 0.05 0.1 0.15 0.2 0.25
10 | do
11 | python main.py --embedder $embedder --dataset $dataset --task node --save_embed --ptb_rate $p --attack_type poison --attack $attack --device $device --seed_n $seed_n --epochs 0
12 | done
13 | done
14 | done
--------------------------------------------------------------------------------
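save_emb.sh only dumps embeddings; the clustering itself is done by clustering.py (see sh/clustering.sh). As a rough sketch of that downstream step, assuming the embeddings and labels have been exported to .npy files, note that the paths below are placeholders, not the repository's actual save locations, which are determined by main.py's `--save_embed` handling.

```python
# Minimal k-means clustering sketch over saved node embeddings
# (paths are hypothetical stand-ins for the files written via --save_embed).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

emb = np.load('embeddings/cora_emb.npy')      # [n_nodes, hidden_dim]
y = np.load('embeddings/cora_labels.npy')     # ground-truth node classes
pred = KMeans(n_clusters=len(np.unique(y)), n_init=10).fit_predict(emb)
print('NMI:', normalized_mutual_info_score(y, pred))
```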
/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .data import get_data
2 | from .utils import *
3 | from .transforms import get_graph_drop_transform
--------------------------------------------------------------------------------
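These re-exports let downstream modules import everything from the package root, e.g.:

```python
# All three names resolve through utils/__init__.py.
from utils import get_data, get_graph_drop_transform, set_everything
```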
/utils/create_lp_data.py:
--------------------------------------------------------------------------------
1 | # Modified from https://github.com/pcy1302/asp2vec/blob/master/src/create_dataset.py
2 | import pickle as pkl
3 | import copy
4 | import random
5 | import networkx as nx
6 | import numpy
7 | import os
8 | import sys
9 | import argparse
10 | import pandas as pd
11 | import pdb
12 |
13 | # sys.path.append('../')
14 | from data import get_data
15 | from torch_geometric.utils import to_networkx, from_networkx
16 | import torch
17 |
18 |
19 | random.seed(1995)
20 |
21 | def parse_args():
22 | # input arguments
23 | parser = argparse.ArgumentParser(description='preprocess')
24 |
25 | parser.add_argument('--dataset', type=str, default='cora')
26 | parser.add_argument('--remove_percent', type=float, default=0.5)
27 |
28 | return parser.parse_known_args()
29 |
30 | def LargestSubgraph(graph):
31 | """Returns the Largest connected-component of `graph`."""
32 | if graph.__class__ == nx.Graph:
33 | return LargestUndirectedSubgraph(graph)
34 | elif graph.__class__ == nx.DiGraph:
35 | largest_undirected_cc = LargestUndirectedSubgraph(nx.Graph(graph))
36 | directed_subgraph = nx.DiGraph()
37 | for (n1, n2) in graph.edges():
38 | if n2 in largest_undirected_cc and n1 in largest_undirected_cc[n2]:
39 | directed_subgraph.add_edge(n1, n2)
40 |
41 | return directed_subgraph
42 |
43 | def connected_component_subgraphs(G):
44 | for c in nx.connected_components(G):
45 | yield G.subgraph(c)
46 |
47 | def LargestUndirectedSubgraph(graph):
48 | """Returns the largest connected-component of undirected `graph`."""
49 | if nx.is_connected(graph):
50 | return graph
51 |
52 | # cc = list(nx.connected_component_subgraphs(graph))
53 | cc = list(connected_component_subgraphs(graph))
54 | sizes = [len(c) for c in cc]
55 | max_idx = sizes.index(max(sizes))
56 | return cc[max_idx]
57 |
58 | # return sizes_and_cc[-1][1]
59 |
60 |
61 | def SampleTestEdgesAndPruneGraph(graph, remove_percent, check_every=5):
62 | """Removes and returns `remove_percent` of edges from graph.
63 | Removal is random but makes sure graph stays connected."""
64 | graph = copy.deepcopy(graph)
65 | undirected_graph = graph.to_undirected()
66 |
67 | edges = list(copy.deepcopy(graph.edges()))
68 | random.shuffle(edges)
69 | remove_edges = int(len(edges) * remove_percent)
70 | num_edges_removed = 0
71 | currently_removing_edges = []
72 | removed_edges = []
73 | last_printed_prune_percentage = -1
74 | for j in range(len(edges)):
75 | n1, n2 = edges[j]
76 | graph.remove_edge(n1, n2)
77 | if n1 not in graph[n2]:
78 | undirected_graph.remove_edge(*(edges[j]))
79 | currently_removing_edges.append(edges[j])
80 | if j % check_every == 0:
81 | if nx.is_connected(undirected_graph):
82 | num_edges_removed += check_every
83 | removed_edges += currently_removing_edges
84 | currently_removing_edges = []
85 | else:
86 | for i in range(check_every):
87 | graph.add_edge(*(edges[j - i]))
88 | undirected_graph.add_edge(*(edges[j - i]))
89 | currently_removing_edges = []
90 | if not nx.is_connected(undirected_graph):
91 |                     print(' DID NOT RECOVER :(')
92 | return None
93 |             pruned_percentage = int(100 * len(removed_edges) / remove_edges)
94 |             rounded = (pruned_percentage // 10) * 10  # round down to nearest 10%
95 | if rounded != last_printed_prune_percentage:
96 | last_printed_prune_percentage = rounded
97 | # print ('Partitioning into train/test. Progress=%i%%' % rounded)
98 |
99 | if len(removed_edges) >= remove_edges:
100 | break
101 |
102 | return graph, removed_edges
103 |
104 | def SampleNegativeEdges(graph, num_edges):
105 |     """Samples `num_edges` edges from the complement of `graph`."""
106 | random_negatives = set()
107 | nodes = list(graph.nodes())
108 | while len(random_negatives) < num_edges:
109 | i1 = random.randint(0, len(nodes) - 1)
110 | i2 = random.randint(0, len(nodes) - 1)
111 | if i1 == i2:
112 | continue
113 | if i1 > i2:
114 | i1, i2 = i2, i1
115 | n1 = nodes[i1]
116 | n2 = nodes[i2]
117 | if graph.has_edge(n1, n2):
118 | continue
119 | random_negatives.add((n1, n2))
120 |
121 | return random_negatives
122 |
123 |
124 | def RandomNegativesPerNode(graph, test_nodes_PerNode, negatives_per_node=499):
125 |     """For each node u in test_nodes_PerNode, samples `negatives_per_node` pairs (u, v) with v not in graph[u]."""
126 | node_list = list(graph.nodes())
127 | num_nodes = len(node_list)
128 | for n in test_nodes_PerNode:
129 | found_negatives = 0
130 | while found_negatives < negatives_per_node:
131 | n2 = node_list[random.randint(0, num_nodes - 1)]
132 | if n == n2 or n2 in graph[n]:
133 | continue
134 | test_nodes_PerNode[n].append(n2)
135 | found_negatives += 1
136 |
137 | return test_nodes_PerNode
138 |
139 |
140 | def NumberNodes(graph):
141 | """Returns a copy of `graph` where nodes are replaced by incremental ints."""
142 | node_list = sorted(graph.nodes())
143 | index = {n: i for (i, n) in enumerate(node_list)}
144 |
145 | newgraph = graph.__class__()
146 | for (n1, n2) in graph.edges():
147 | newgraph.add_edge(index[n1], index[n2])
148 |
149 | return newgraph, index
150 |
151 |
152 |
153 | def MakeDirectedNegatives(positive_edges):
154 | positive_set = set([(u, v) for (u, v) in list(positive_edges)])
155 | directed_negatives = []
156 | for (u, v) in positive_set:
157 | if (v, u) not in positive_set:
158 | directed_negatives.append((v, u))
159 | return numpy.array(directed_negatives, dtype='int32')
160 |
161 | def CreateDatasetFiles(graph, output_dir, remove_percent, partition=True):
162 |     """Builds and returns train/test link-prediction splits from `graph`.
163 |     Args:
164 |       graph: nx.Graph or nx.DiGraph to split into positive train/test edge
165 |         sets and to sample negative edges from.
166 |       output_dir: directory that is created if it does not already exist;
167 |         the caller saves the returned dictionary there.
168 |       remove_percent: fraction of edges to remove from the graph and use
169 |         as positive test edges.
170 |       partition: if set, only the largest connected component is kept and
171 |         the data is separated into train/test splits; otherwise all edges
172 |         are used for training and no test edges are sampled.
173 |     Returns:
174 |       A dict with the tensors `train_edges`, `train_edges_neg`,
175 |       `train_label`, `test_edges`, `test_edges_neg`, and `test_label`,
176 |       where labels are 1 for positive and 0 for negative edges.
177 |     """
178 |
179 | if not os.path.exists(output_dir):
180 | os.makedirs(output_dir)
181 |
182 | original_size = len(graph)
183 | if partition:
184 | graph = LargestSubgraph(graph)
185 | size_largest_cc = len(graph)
186 | else:
187 | size_largest_cc = -1
188 | graph, index = NumberNodes(graph)
189 |
190 | if partition:
191 | print("Generate dataset for link prediction")
192 | # For link prediction (50%:50%)
193 | train_graph, test_edges = SampleTestEdgesAndPruneGraph(graph, remove_percent)
194 |
195 | else:
196 | train_graph, test_edges = graph, []
197 |
198 | assert len(graph) == len(train_graph)
199 |
200 |     # Sample one negative per test edge plus one per training edge.
201 | random_negatives = list(SampleNegativeEdges(graph, len(test_edges) + len(train_graph.edges())))
202 | random.shuffle(random_negatives)
203 | test_negatives = random_negatives[:len(test_edges)]
204 | # These are only used for evaluation, never training.
205 | train_eval_negatives = random_negatives[len(test_edges):]
206 |
207 | test_negatives = torch.from_numpy(numpy.array(test_negatives, dtype='int32')).long()
208 | test_edges = torch.from_numpy(numpy.array(test_edges, dtype='int32')).long()
209 | train_edges = torch.from_numpy(numpy.array(train_graph.edges(), dtype='int32')).long()
210 | train_eval_negatives = torch.from_numpy(numpy.array(train_eval_negatives, dtype='int32')).long()
211 |
212 | train_label = torch.cat((torch.ones(train_edges.size(0)), torch.zeros(train_eval_negatives.size(0))))
213 | test_label = torch.cat((torch.ones(test_edges.size(0)), torch.zeros(test_negatives.size(0))))
214 |
215 | data = {'train_edges': train_edges, 'train_edges_neg': train_eval_negatives, 'train_label': train_label,
216 | 'test_edges': test_edges, 'test_edges_neg': test_negatives, 'test_label': test_label}
217 |
218 | print("Size of train_edges: {}".format(len(train_edges)))
219 | print("Size of train_eval_negatives: {}".format(len(train_eval_negatives)))
220 | print("Size of test_edges: {}".format(len(test_edges)))
221 | print("Size of test_edges_neg: {}".format(len(test_negatives)))
222 |
223 | return data
224 |
225 | def main():
226 | args, unknown = parse_args()
227 | print(args)
228 |
229 |     folder_dataset = './dataset/link'
230 |
231 | data = get_data('./dataset/', args.dataset, attack='meta', ptb_rate=0.0)[0]
232 | graph = to_networkx(data)
233 |
234 | # Create dataset files.
235 | print("Create {} dataset (Remove percent: {})".format(args.dataset, args.remove_percent))
236 | data_dict = CreateDatasetFiles(graph, folder_dataset, args.remove_percent)
237 |
238 | torch.save(data_dict, f'{folder_dataset}/{args.dataset}_link.pt')
239 |
240 |
241 | if __name__ == '__main__':
242 | main()
243 |
--------------------------------------------------------------------------------
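For reference, a short sketch of producing and then consuming a split: the CLI flags follow `parse_args()`, and the output path and dictionary keys follow `main()` and `CreateDatasetFiles()` above (paths are resolved relative to the working directory).

```python
# First build the split, e.g.:
#   python utils/create_lp_data.py --dataset cora --remove_percent 0.5
import torch

data = torch.load('./dataset/link/cora_link.pt')
print(data['train_edges'].shape)   # [n_train_pos, 2] positive training edges
print(data['test_label'])          # 1 for test positives, 0 for sampled negatives
```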
/utils/data.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch
3 | from torch_geometric.data import Data
4 | from torch_geometric.utils import to_undirected, dense_to_sparse
5 | from deeprobust.graph.data import Dataset, PrePtbDataset
6 | import json
7 | import scipy.sparse as sp
8 | from deeprobust.graph.global_attack import Random
9 |
10 | def get_data(root, name, attack, ptb_rate):
11 | if name in ['cora', 'citeseer', 'pubmed']:
12 | data = Dataset(root=root, name=name, setting='prognn')
13 | adj, features, labels = data.adj, data.features, data.labels
14 | idx_train, idx_val, idx_test = data.idx_train, data.idx_val, data.idx_test
15 |
16 | dataset = Data()
17 |
18 | dataset.x = torch.from_numpy(features.toarray()).float()
19 | dataset.y = torch.from_numpy(labels).long()
20 | dataset.edge_index = dense_to_sparse(torch.from_numpy(adj.toarray()))[0].long()
21 |
22 | dataset.train_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_train)).bool()
23 | dataset.val_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_val)).bool()
24 | dataset.test_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_test)).bool()
25 |
26 | if attack == 'meta':
27 | if ptb_rate == 0.0:
28 | return [dataset]
29 | else:
30 | perturbed_data = PrePtbDataset(root=root,
31 | name=name,
32 | attack_method=attack,
33 | ptb_rate=ptb_rate)
34 | perturbed_adj = perturbed_data.adj
35 | dataset.edge_index = dense_to_sparse(torch.from_numpy(perturbed_adj.toarray()))[0].long()
36 | return [dataset]
37 | elif attack == 'nettack':
38 | if ptb_rate == 0.0:
39 | with open(f'{root}{name}_nettacked_nodes.json') as json_file:
40 | ptb_idx = json.load(json_file)
41 | idx_test_att = ptb_idx['attacked_test_nodes']
42 | dataset.test_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_test_att)).bool()
43 | else:
44 | perturbed_adj = sp.load_npz(f'{root}{name}_nettack_adj_{int(ptb_rate)}.0.npz')
45 | with open(f'{root}{name}_nettacked_nodes.json') as json_file:
46 | ptb_idx = json.load(json_file)
47 |
48 | dataset.edge_index = dense_to_sparse(torch.from_numpy(perturbed_adj.toarray()))[0].long()
49 | idx_test_att = ptb_idx['attacked_test_nodes']
50 | dataset.test_mask = torch.from_numpy(np.in1d(np.arange(len(labels)), idx_test_att)).bool()
51 |
52 | return [dataset]
53 |
54 | elif attack == 'random':
55 | if ptb_rate == 0.0:
56 | return [dataset]
57 | attacker = Random()
58 | n_perturbations = int(ptb_rate * (dataset.edge_index.shape[1]//2))
59 | attacker.attack(adj, n_perturbations, type='add')
60 | perturbed_adj = attacker.modified_adj
61 | dataset.edge_index = dense_to_sparse(torch.from_numpy(perturbed_adj.toarray()))[0].long()
62 | return [dataset]
63 |
64 | else:
65 | if name in ['photo', 'computers', 'cs', 'physics']:
66 | from torch_geometric.datasets import Planetoid, Amazon, Coauthor
67 | # if name in ['cora_pyg', 'citeseer_pyg']:
68 | # data = Planetoid(root=root, name=name.split('_')[0])[0]
69 | if name in ['photo', 'computers']:
70 | data = Amazon(root=root, name=name)[0]
71 | if ptb_rate > 0:
72 | edge = torch.load(f'{root}{name}_{attack}_adj_{ptb_rate}.pt').cpu()
73 | data.edge_index = edge
74 | else:
75 | data.edge_index = to_undirected(data.edge_index)
76 | data = create_masks(data)
77 | elif name in ['cs', 'physics']:
78 | data = Coauthor(root=root, name=name)[0]
79 | if ptb_rate > 0:
80 | edge = torch.load(f'{root}{name}_{attack}_adj_{ptb_rate}.pt').cpu()
81 | data.edge_index = edge
82 | else:
83 | data.edge_index = to_undirected(data.edge_index)
84 | data = create_masks(data)
85 |
86 | elif name in ['squirrel', 'chameleon', 'actor', 'cornell', 'wisconsin', 'texas']:
87 | from torch_geometric.datasets import WikipediaNetwork, Actor, WebKB
88 | if name in ['squirrel', 'chameleon']:
89 | data = WikipediaNetwork(root=root, name=name)[0]
90 | data.edge_index = to_undirected(data.edge_index)
91 | data = create_masks(data)
92 | if name in ['actor']:
93 | data = Actor(root=root)[0]
94 | data.edge_index = to_undirected(data.edge_index)
95 | data = create_masks(data)
96 | if name in ['cornell', 'wisconsin', 'texas']:
97 | data = WebKB(root=root, name=name)[0]
98 | data.edge_index = to_undirected(data.edge_index)
99 | data = create_masks(data)
100 |
101 | return [data]
102 |
103 |
104 | def create_masks(data):
105 | """
106 |     Splits data into training, validation, and test sets (10%/10%/80%) at random
107 |     if it is not already split. Each split is associated with a mask vector that
108 |     specifies its indices; 20 random splits are stacked. The data is modified in-place.
109 | :param data: Data object
110 | :return: The modified data
111 | """
112 | tr = 0.1
113 | vl = 0.1
114 | tst = 0.8
115 | if not hasattr(data, "val_mask"):
116 | _train_mask = _val_mask = _test_mask = None
117 |
118 | for i in range(20):
119 | labels = data.y.numpy()
120 | dev_size = int(labels.shape[0] * vl)
121 | test_size = int(labels.shape[0] * tst)
122 |
123 | perm = np.random.permutation(labels.shape[0])
124 | test_index = perm[:test_size]
125 | dev_index = perm[test_size:test_size + dev_size]
126 |
127 | data_index = np.arange(labels.shape[0])
128 | test_mask = torch.tensor(np.in1d(data_index, test_index), dtype=torch.bool)
129 | dev_mask = torch.tensor(np.in1d(data_index, dev_index), dtype=torch.bool)
130 |             train_mask = ~(dev_mask | test_mask)
131 | test_mask = test_mask.reshape(1, -1)
132 | dev_mask = dev_mask.reshape(1, -1)
133 | train_mask = train_mask.reshape(1, -1)
134 |
135 | if _train_mask is None:
136 | _train_mask = train_mask
137 | _val_mask = dev_mask
138 | _test_mask = test_mask
139 |
140 | else:
141 | _train_mask = torch.cat((_train_mask, train_mask), dim=0)
142 | _val_mask = torch.cat((_val_mask, dev_mask), dim=0)
143 | _test_mask = torch.cat((_test_mask, test_mask), dim=0)
144 |
145 | data.train_mask = _train_mask.squeeze()
146 | data.val_mask = _val_mask.squeeze()
147 | data.test_mask = _test_mask.squeeze()
148 |
149 | elif hasattr(data, "val_mask") and len(data.val_mask.shape) == 1:
150 | data.train_mask = data.train_mask.T
151 | data.val_mask = data.val_mask.T
152 | data.test_mask = data.test_mask.T
153 |
154 | else:
155 | num_folds = torch.min(torch.tensor(data.train_mask.size())).item()
156 | data.train_mask = data.train_mask.T
157 | data.val_mask = data.val_mask.T
158 | if len(data.test_mask.size()) == 1:
159 | data.test_mask = data.test_mask.unsqueeze(0).expand(num_folds, -1)
160 | else:
161 | data.test_mask = data.test_mask.T
162 |
163 | return data
164 |
165 |
--------------------------------------------------------------------------------
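A minimal usage sketch of `get_data` for an attacked Planetoid graph; deeprobust's `Dataset` and `PrePtbDataset` fetch the ProGNN splits and the pre-perturbed adjacency into `root` on first use.

```python
# Load Cora with a 25% metattack-perturbed structure.
from utils.data import get_data

dataset = get_data('./dataset/', 'cora', attack='meta', ptb_rate=0.25)[0]
print(dataset.x.shape, dataset.edge_index.shape)     # features, perturbed edges
print(int(dataset.train_mask.sum()), int(dataset.test_mask.sum()))
```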
/utils/transforms.py:
--------------------------------------------------------------------------------
1 | import copy
2 |
3 | import torch
4 | from torch_geometric.utils.dropout import dropout_adj
5 | from torch_geometric.utils import add_self_loops
6 | from torch_geometric.transforms import Compose
7 |
8 |
9 | class DropFeatures:
10 | r"""Drops node features with probability p."""
11 |     def __init__(self, p):
12 | assert 0. < p < 1., 'Dropout probability has to be between 0 and 1, but got %.2f' % p
13 | self.p = p
14 |
15 | def __call__(self, data):
16 | drop_mask = torch.empty((data.x.size(1),), dtype=torch.float32, device=data.x.device).uniform_(0, 1) < self.p
17 | data.x[:, drop_mask] = 0
18 | return data
19 |
20 | def __repr__(self):
21 | return '{}(p={})'.format(self.__class__.__name__, self.p)
22 |
23 |
24 | class DropEdges:
25 | r"""Drops edges with probability p."""
26 | def __init__(self, p, force_undirected=False):
27 | assert 0. < p < 1., 'Dropout probability has to be between 0 and 1, but got %.2f' % p
28 |
29 | self.p = p
30 | self.force_undirected = force_undirected
31 |
32 | def __call__(self, data):
33 | edge_index = data.edge_index
34 | edge_attr = data.edge_attr if 'edge_attr' in data else None
35 |
36 | edge_index, edge_attr = dropout_adj(edge_index, edge_attr, p=self.p, force_undirected=self.force_undirected)
37 | # edge_index = add_self_loops(edge_index)
38 |
39 | data.edge_index = edge_index
40 | if edge_attr is not None:
41 | data.edge_attr = edge_attr
42 | return data
43 |
44 | def __repr__(self):
45 | return '{}(p={}, force_undirected={})'.format(self.__class__.__name__, self.p, self.force_undirected)
46 |
47 |
48 | def get_graph_drop_transform(drop_edge_p, drop_feat_p):
49 | transforms = list()
50 |
51 | # make copy of graph
52 | transforms.append(copy.deepcopy)
53 |
54 | # drop edges
55 | if drop_edge_p > 0.:
56 | transforms.append(DropEdges(drop_edge_p))
57 |
58 | # drop features
59 | if drop_feat_p > 0.:
60 | transforms.append(DropFeatures(drop_feat_p))
61 | return Compose(transforms)
62 |
--------------------------------------------------------------------------------
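A small sketch of how this transform produces two stochastic views for contrastive training; the toy `Data` object is illustrative. Because `copy.deepcopy` is the first transform in the `Compose`, the original graph is left untouched.

```python
import torch
from torch_geometric.data import Data
from utils.transforms import get_graph_drop_transform

data = Data(x=torch.randn(4, 8),
            edge_index=torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]]))
transform = get_graph_drop_transform(drop_edge_p=0.5, drop_feat_p=0.5)
view1, view2 = transform(data), transform(data)  # two independent random views
assert data.edge_index.size(1) == 4              # original graph unchanged
```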
/utils/utils.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import random, os
3 | import numpy as np
4 | import torch.nn.functional as F
5 | from torch_geometric.utils.sparse import dense_to_sparse
6 | from torch_geometric.utils import to_dense_adj, add_self_loops
7 |
8 | def to_numpy(tensor):
9 | return tensor.detach().cpu().numpy()
10 |
11 | def dense_to_sparse_adj(edge_index, n_node):
12 | return torch.sparse.FloatTensor(edge_index,
13 | torch.ones(edge_index.shape[1]).to(edge_index.device),
14 | [n_node, n_node])
15 |
16 | def dense_to_sparse_x(feat_index, n_node, n_dim):
17 | return torch.sparse.FloatTensor(feat_index,
18 | torch.ones(feat_index.shape[1]).to(feat_index.device),
19 | [n_node, n_dim])
20 |
21 | def to_dense_subadj(edge_index, subsize):
22 | edge = add_self_loops(edge_index, num_nodes=subsize)[0]
23 | return to_dense_adj(edge)[0].fill_diagonal_(0.0)
24 |
25 | def set_cuda_device(device_num):
26 | if torch.cuda.is_available():
27 |         device = torch.device(f'cuda:{device_num}')  # availability already checked above
28 | torch.cuda.set_device(device)
29 |
30 | def enumerateConfig(args):
31 | args_names = []
32 | args_vals = []
33 | for arg in vars(args):
34 | args_names.append(arg)
35 | args_vals.append(getattr(args, arg))
36 |
37 | return args_names, args_vals
38 |
39 | def config2string(args):
40 | args_names, args_vals = enumerateConfig(args)
41 | st = ''
42 | for name, val in zip(args_names, args_vals):
43 | if val == False:
44 | continue
45 |
46 | if name not in ['device', 'patience', 'epochs', 'save_dir', 'in_dim', 'n_class', 'best_epoch', 'save_fig', 'n_node', 'n_degree', 'attack', 'attack_type', 'ptb_rate', 'verbose', 'mm', '']:
47 | st_ = "{}:{} / ".format(name, val)
48 | st += st_
49 |
50 |
51 | return st[:-1]
52 |
53 | def set_everything(seed=42):
54 | random.seed(seed)
55 | os.environ['PYTHONHASHSEED'] = str(seed)
56 | np.random.seed(seed)
57 | torch.manual_seed(seed)
58 | torch.cuda.manual_seed_all(seed)
59 | torch.backends.cudnn.deterministic = True
60 | # torch.autograd.set_detect_anomaly(True)
61 | torch.backends.cudnn.benchmark = False
62 |     os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # make CUDA kernel launches synchronous (debugging aid)
63 |
64 | def ensure_dir(file_path):
65 | directory = os.path.dirname(file_path)
66 | if not os.path.exists(directory):
67 | os.makedirs(directory)
68 |
--------------------------------------------------------------------------------
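A brief usage sketch of the seeding and sparse-adjacency helpers above (the toy edge index is illustrative):

```python
import torch
from utils.utils import set_everything, dense_to_sparse_adj

set_everything(42)                                # seed python/numpy/torch, set cudnn flags
edge_index = torch.tensor([[0, 1, 2], [1, 2, 0]])
adj = dense_to_sparse_adj(edge_index, n_node=3)   # sparse [3, 3] adjacency of ones
print(adj.to_dense())
```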