├── .gitignore
├── README.md
├── cifar-10
│   ├── README.md
│   ├── img
│   │   ├── cifar_grad_default.jpg
│   │   ├── cifar_large_l2_default.jpg
│   │   ├── cifar_large_linf_default.jpg
│   │   ├── cifar_learning_curve_l2.jpg
│   │   ├── cifar_learning_curve_linf.jpg
│   │   └── cifar_learning_curve_std.jpg
│   ├── main.py
│   ├── src
│   │   ├── argument.py
│   │   ├── attack
│   │   │   ├── __init__.py
│   │   │   └── fast_gradient_sign_untargeted.py
│   │   ├── model
│   │   │   ├── __init__.py
│   │   │   ├── madry_model.py
│   │   │   └── model.py
│   │   ├── utils
│   │   │   ├── __init__.py
│   │   │   └── utils.py
│   │   └── visualization
│   │       ├── __init__.py
│   │       └── vanilla_backprop.py
│   ├── train.sh
│   ├── visualize.py
│   └── visualize_attack.py
└── mnist
    ├── README.md
    ├── img
    │   ├── mnist_grad_default.jpg
    │   ├── mnist_large_l2_.jpg
    │   ├── mnist_learning_curve_l2.jpg
    │   ├── mnist_learning_curve_linf.jpg
    │   └── mnist_learning_curve_std.jpg
    ├── main.py
    ├── src
    │   ├── argument.py
    │   ├── attack
    │   │   ├── __init__.py
    │   │   └── fast_gradient_sign_untargeted.py
    │   ├── model
    │   │   ├── __init__.py
    │   │   └── model.py
    │   ├── read_log.py
    │   ├── utils
    │   │   ├── __init__.py
    │   │   └── utils.py
    │   └── visualization
    │       ├── __init__.py
    │       └── vanilla_backprop.py
    ├── visualize.py
    └── visualize_attack.py
/.gitignore:
--------------------------------------------------------------------------------
1 | __pycache__/
2 | log/
3 |
4 | # Compiled source #
5 | ###################
6 | *.com
7 | *.class
8 | *.dll
9 | *.exe
10 | *.o
11 | *.so
12 | sftp-config.json
13 | read_log.py
14 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Adversarial Training and Visualization
2 |
3 | This repo is a PyTorch 1.0 implementation of adversarial training on MNIST and CIFAR-10. It also reproduces part of the visualization results in [1].
4 |
5 | **Note**: Not an official implementation.
6 |
7 | ## Adversarial Training
8 |
9 | | Objective Function | Standard Training | Adversarial Training |
10 | | :---: | :---: | :---: |
11 | | | (equation image) | (equation image) |
12 |
25 | where p in the table is usually 2 or inf.
26 |
27 | The objectives of standard and adversarial training are fundamentally different. In standard training, the classifier minimizes the loss computed on the original training data, while in adversarial training it minimizes the loss on the worst-case perturbation within a small neighborhood of each training example.
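
For reference, the two objectives are commonly written as below. This is the usual formulation from the adversarial training literature, not something recovered from the table in this README; the symbols (model parameters $\theta$, data distribution $\mathcal{D}$, loss $\mathcal{L}$, perturbation $\delta$, budget $\epsilon$) are assumed notation.

```latex
% Standard training: minimize the expected loss on clean data
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
    \big[ \mathcal{L}(\theta, x, y) \big]

% Adversarial training: minimize the expected worst-case loss
% over perturbations with l_p norm at most epsilon
\min_{\theta} \; \mathbb{E}_{(x, y) \sim \mathcal{D}}
    \Big[ \max_{\|\delta\|_{p} \le \epsilon} \mathcal{L}(\theta, x + \delta, y) \Big]
```

Setting $\epsilon = 0$ in the second expression recovers the first, which is one way to see standard training as a special case.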
28 |
29 | ## Visualization
30 |
31 | In [1], the authors find that the features learned by a robust classifier are more human-perceivable. Related results are shown in the `mnist` and `cifar-10` folders.
32 |
33 | ## Implementation
34 |
35 | Parts of the code in this repo are borrowed or modified from [2], [3], [4] and [5].
36 |
37 | ## References:
38 |
39 | [1] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry. *Robustness May Be at Odds with Accuracy*, https://arxiv.org/abs/1805.12152
40 |
41 | [2] https://github.com/MadryLab/mnist_challenge
42 |
43 | [3] https://github.com/MadryLab/cifar10_challenge
44 |
45 | [4] https://github.com/xternalz/WideResNet-pytorch
46 |
47 | [5] https://github.com/utkuozbulak/pytorch-cnn-visualizations
48 |
49 |
50 | ## Contact
51 | Yi-Lin Sung, corumlouis123@gmail.com
52 |
--------------------------------------------------------------------------------
/cifar-10/README.md:
--------------------------------------------------------------------------------
1 | # Adversarial Training and Visualization on CIFAR-10
2 |
3 |
4 | ## Update
5 | * (2020/8/27)
6 |     1. To match the implementation of [madry_cifar10](https://github.com/MadryLab/cifar10_challenge), the default learning rate is now `0.1`, the model's activation function is `LeakyReLU(0.1)`, and the optimizer is `torch.optim.SGD`.
7 |     2. Add a new experiment in *Quantitative Results*, which matches the results in [madry_cifar10](https://github.com/MadryLab/cifar10_challenge).
8 |     3. Add checkpoints for the updated model and delete the old ones.
9 |     4. Update the code structure: move `main.py`, `visualize.py` and `visualize_attack.py` out of the `src` folder.
10 | * (2019/12/14)
11 |     1. Add `madry_model.py`, which contains the same model used in [madry_cifar10](https://github.com/MadryLab/cifar10_challenge), to `src/model`.
12 |     2. Add `count_parameters(model)` in `model.py` and `madry_model.py`; it computes the number of trainable parameters.
13 |     3. Add the flag `use_pseudo_label` to `trainer.test()`; it determines whether the model's prediction is used as the attack target, and defaults to `False`.
14 |     4. Update the "Experiments" and "Execution" parts of this README.
15 |     5. Provide checkpoints of adversarial training on CIFAR-10.
16 | * (2019/4/18) Change the default alpha from 2 to 2/255, and update the results.
17 |
18 | ## Results
19 |
20 | Note that each experiment was run only once.
21 |
22 | ### Learning Curves
23 |
24 | Epsilon in linf (l2) training is 0.0157 (0.314). [0.0157=4/255, 0.314=80/255]
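
The decimal values are rounded versions of fractions on the 0-255 pixel scale. A quick sanity check of that conversion (not part of the repo; the variable names are illustrative):

```python
# Convert perturbation budgets from the 0-255 pixel scale to the [0, 1] scale.
eps_linf = 4 / 255   # linf budget used for the learning curves
eps_l2 = 80 / 255    # l2 budget used for the learning curves
alpha = 2 / 255      # default step size mentioned in the update notes

print(round(eps_linf, 4))  # 0.0157
print(round(eps_l2, 3))    # 0.314
print(round(alpha, 5))     # 0.00784
```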
25 |
26 | | | Standard Training | l_inf Training | l_2 Training |
27 | | :---: | :---: | :---: | :---: |
28 | | Learning curve | `img/cifar_learning_curve_std.jpg` | `img/cifar_learning_curve_linf.jpg` | `img/cifar_learning_curve_l2.jpg` |
29 | | Standard Accuracy (train/test) | 92.19/87.14 | 79.69/78.09 | 89.84/85.39 |
30 | | Robustness Accuracy (train/test) | 0.00/7.85 | 61.72/63.8 | 76.56/77.76 |
31 | | Madry's Model Standard Accuracy (train/test) | - | -/79.22 | -/85.81 |
32 | | Madry's Model Robustness Accuracy (train/test) | - | -/55.97 | -/71.87 |
33 |
73 | The following note applies only to the results that are not from Madry's model. In testing mode, the target label used to create the adversarial example is the model's most confident prediction, not the ground truth. As a result, the test robustness is sometimes higher than the training robustness, because the attack moves away from a prediction that was already wrong.
74 |
75 | Learning rate is manually changed during training:
76 |
77 | * `0.1` in iteration `[0, 40000]`
78 | * `0.01` in iteration `[40000, 60000]`
79 | * `0.001` in iteration `[60000, 76000]`
80 |
81 | This schedule follows https://github.com/MadryLab/cifar10_challenge.
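
The schedule above can be sketched as a plain function. This is a minimal illustration that mirrors the `MultiStepLR(milestones=[40000, 60000], gamma=0.1)` call in `main.py`; the name `learning_rate` is hypothetical, and the sketch assumes the lower rate applies from the milestone iteration onward.

```python
from bisect import bisect_right


def learning_rate(iteration, base_lr=0.1, milestones=(40000, 60000), gamma=0.1):
    """Piecewise-constant schedule: multiply by gamma once per milestone reached."""
    # bisect_right counts how many milestones are <= iteration.
    return base_lr * gamma ** bisect_right(milestones, iteration)


# Print the rate on both sides of each boundary.
for it in (0, 39999, 40000, 59999, 60000, 75999):
    print(it, learning_rate(it))
```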
82 |
83 |
84 | ### Quantitative Results
85 |
86 | * Defense model, standard accuracy = 86.66% (linf, epsilon=8/255), trained against a PGD attack with 7 steps of size 2
87 |
88 | | Attack | Robust Test Accuracy |
89 | | :---: | :---: |
90 | | PGD with 10 steps of size 2 (cross-entropy) | 48.37% |
91 | | PGD with 20 steps of size 1 (cross-entropy) | 48.04% |
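
To make the attack settings concrete, here is a minimal pure-Python sketch of an linf PGD loop on a toy linear loss. It is an illustration only, not the repo's `FastGradientSignUntargeted` class: the names `pgd_linf` and `grad_fn` are hypothetical, and it assumes the step sizes in the table are given in units of 1/255.

```python
def sign(v):
    # Returns -1, 0 or 1.
    return (v > 0) - (v < 0)


def pgd_linf(x0, grad_fn, epsilon, alpha, steps, min_val=0.0, max_val=1.0):
    """Iterated gradient-sign ascent with projection onto the linf ball around x0."""
    x = list(x0)
    for _ in range(steps):
        g = grad_fn(x)
        # Ascent step of size alpha along the sign of the gradient.
        x = [xi + alpha * sign(gi) for xi, gi in zip(x, g)]
        # Project back into [x0 - epsilon, x0 + epsilon].
        x = [min(max(xi, oi - epsilon), oi + epsilon) for xi, oi in zip(x, x0)]
        # Keep valid pixel values.
        x = [min(max(xi, min_val), max_val) for xi in x]
    return x


# Toy loss L(x) = sum(w_i * x_i), whose gradient with respect to x is w.
w = [1.0, -2.0, 0.5]
x0 = [0.5, 0.5, 1.0]
adv = pgd_linf(x0, lambda x: w, epsilon=8 / 255, alpha=2 / 255, steps=10)
print(adv)
```

With 10 steps of size 2/255 and a budget of 8/255, the first two coordinates reach the edge of the linf ball, and the third stays at the pixel maximum.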
92 |
93 | ### Visualization of Gradient with Respect to Input
94 |
95 | 
96 |
97 | ### The Adversarial Example with large epsilon
98 |
99 | The maximum epsilon is set to 4.7 (l2 norm) in this part.
100 |
101 | 
102 |
103 |
104 | ## Requirements:
105 | ```
106 | python >= 3.5
107 | torch >= 1.0
108 | torchvision >= 0.2.1
109 | numpy >= 1.16.1
110 | matplotlib >= 3.0.2
111 | ```
112 |
113 | ## Execution
114 |
115 | ### Training
116 |
117 | Standard training:
118 |
119 | ```
120 | python main.py --data_root [data directory]
121 | ```
122 |
123 | linf training:
124 |
125 | ```
126 | python main.py --data_root [data directory] -e 0.0157 -p 'linf' --adv_train --affix 'linf'
127 | ```
128 |
129 | l2 training:
130 |
131 | ```
132 | python main.py --data_root [data directory] -e 0.314 -p 'l2' --adv_train --affix 'l2'
133 | ```
134 |
135 | ### Testing
136 |
137 | Change the settings if you want to do linf testing.
138 | ```
139 | python main.py --todo test --data_root [data directory] -e 0.314 -p 'l2' --load_checkpoint [your_model.pth]
140 | ```
141 |
142 | ### Visualization
143 |
144 | Change the settings in `visualize.py` and `visualize_attack.py` if you want to do linf visualization.
145 |
146 | Visualize the gradient with respect to the input:
147 |
148 | ```
149 | python visualize.py --load_checkpoint [your_model.pth]
150 | ```
151 |
152 | Visualize adversarial examples with larger epsilon:
153 |
154 | ```
155 | python visualize_attack.py --load_checkpoint [your_model.pth]
156 | ```
157 |
158 | ## Checkpoints
159 | ### linf
160 | * epsilon=8/255, trained against a PGD attack with 7 steps of size 2: [checkpoint](https://drive.google.com/file/d/1-3AfpkLvPje5poY9ZettY05N8kZgFRAV/view?usp=sharing)
161 |
162 | ## Training Time
163 |
164 | * Standard training: 78 s / 100 iterations
165 | * Adversarial training: 784 s / 100 iterations
166 |
167 | Both were measured with batch size 128 on an NVIDIA GeForce GTX 1080.
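
As a rough back-of-the-envelope estimate, the timings can be scaled to a full run. The total of 76,000 iterations is an assumption taken from the end of the learning-rate schedule in the Results section; actual run length depends on `--max_epoch`.

```python
std_s_per_100 = 78    # seconds per 100 iterations, standard training
adv_s_per_100 = 784   # seconds per 100 iterations, adversarial training
total_iters = 76000   # assumption: last iteration of the learning-rate schedule

# Relative cost of adversarial training and projected wall-clock time.
slowdown = adv_s_per_100 / std_s_per_100
std_hours = std_s_per_100 / 100 * total_iters / 3600
adv_hours = adv_s_per_100 / 100 * total_iters / 3600

print(f"slowdown: {slowdown:.1f}x")
print(f"standard: {std_hours:.1f} h, adversarial: {adv_hours:.1f} h")
```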
168 |
--------------------------------------------------------------------------------
/cifar-10/img/cifar_grad_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_grad_default.jpg
--------------------------------------------------------------------------------
/cifar-10/img/cifar_large_l2_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_large_l2_default.jpg
--------------------------------------------------------------------------------
/cifar-10/img/cifar_large_linf_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_large_linf_default.jpg
--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_l2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_l2.jpg
--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_linf.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_linf.jpg
--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_std.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_std.jpg
--------------------------------------------------------------------------------
/cifar-10/main.py:
--------------------------------------------------------------------------------
1 | import os
2 | import torch
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 | from torch.utils.data import DataLoader
6 |
7 | import torchvision as tv
8 |
9 | from time import time
10 | from src.model.madry_model import WideResNet
11 | from src.attack import FastGradientSignUntargeted
12 | from src.utils import makedirs, create_logger, tensor2cuda, numpy2cuda, evaluate, save_model
13 |
14 | from src.argument import parser, print_args
15 |
16 | class Trainer():
17 |     def __init__(self, args, logger, attack):
18 |         self.args = args
19 |         self.logger = logger
20 |         self.attack = attack
21 |
22 |     def standard_train(self, model, tr_loader, va_loader=None):
23 |         self.train(model, tr_loader, va_loader, False)
24 |
25 |     def adversarial_train(self, model, tr_loader, va_loader=None):
26 |         self.train(model, tr_loader, va_loader, True)
27 |
28 |     def train(self, model, tr_loader, va_loader=None, adv_train=False):
29 |         args = self.args
30 |         logger = self.logger
31 |
32 |         opt = torch.optim.SGD(model.parameters(), args.learning_rate,
33 |                               weight_decay=args.weight_decay,
34 |                               momentum=args.momentum)
35 |         scheduler = torch.optim.lr_scheduler.MultiStepLR(opt,
36 |                                                          milestones=[40000, 60000],
37 |                                                          gamma=0.1)
38 |         _iter = 0
39 |
40 |         begin_time = time()
41 |
42 |         for epoch in range(1, args.max_epoch+1):
43 |             for data, label in tr_loader:
44 |                 data, label = tensor2cuda(data), tensor2cuda(label)
45 |
46 |                 if adv_train:
47 |                     # When training, the adversarial example is created from a random
48 |                     # point close to the original data point. In evaluation mode,
49 |                     # the attack starts from the original data point itself.
50 |                     adv_data = self.attack.perturb(data, label, 'mean', True)
51 |                     output = model(adv_data, _eval=False)
52 |                 else:
53 |                     output = model(data, _eval=False)
54 |
55 |                 loss = F.cross_entropy(output, label)
56 |
57 |                 opt.zero_grad()
58 |                 loss.backward()
59 |                 opt.step()
60 |
61 |                 if _iter % args.n_eval_step == 0:
62 |                     t1 = time()
63 |
64 |                     if adv_train:
65 |                         with torch.no_grad():
66 |                             stand_output = model(data, _eval=True)
67 |                         pred = torch.max(stand_output, dim=1)[1]
68 |
69 |                         # print(pred)
70 |                         std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
71 |
72 |                         pred = torch.max(output, dim=1)[1]
73 |                         # print(pred)
74 |                         adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
75 |
76 |                     else:
77 |
78 |                         adv_data = self.attack.perturb(data, label, 'mean', False)
79 |
80 |                         with torch.no_grad():
81 |                             adv_output = model(adv_data, _eval=True)
82 |                         pred = torch.max(adv_output, dim=1)[1]
83 |                         # print(label)
84 |                         # print(pred)
85 |                         adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
86 |
87 |                         pred = torch.max(output, dim=1)[1]
88 |                         # print(pred)
89 |                         std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
90 |
91 |                     t2 = time()
92 |
93 |                     logger.info(f'epoch: {epoch}, iter: {_iter}, lr={opt.param_groups[0]["lr"]}, '
94 |                                 f'spent {time()-begin_time:.2f} s, tr_loss: {loss.item():.3f}')
95 |
96 |                     logger.info(f'standard acc: {std_acc:.3f}%, robustness acc: {adv_acc:.3f}%')
97 |
98 |                     # begin_time = time()
99 |
100 |                     # if va_loader is not None:
101 |                     #     va_acc, va_adv_acc = self.test(model, va_loader, True)
102 |                     #     va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0
103 |
104 |                     #     logger.info('\n' + '='*30 + ' evaluation ' + '='*30)
105 |                     #     logger.info('test acc: %.3f %%, test adv acc: %.3f %%, spent: %.3f' % (
106 |                     #         va_acc, va_adv_acc, time() - begin_time))
107 |                     #     logger.info('='*28 + ' end of evaluation ' + '='*28 + '\n')
108 |
109 |                     begin_time = time()
110 |
111 |                 if _iter % args.n_store_image_step == 0:
112 |                     tv.utils.save_image(torch.cat([data.cpu(), adv_data.cpu()], dim=0),
113 |                                         os.path.join(args.log_folder, f'images_{_iter}.jpg'),
114 |                                         nrow=16)
115 |
116 |                 if _iter % args.n_checkpoint_step == 0:
117 |                     file_name = os.path.join(args.model_folder, f'checkpoint_{_iter}.pth')
118 |                     save_model(model, file_name)
119 |
120 |                 _iter += 1
121 |                 # the scheduler is stepped per training iteration, not per epoch
122 |                 scheduler.step()
123 |
124 |             if va_loader is not None:
125 |                 t1 = time()
126 |                 va_acc, va_adv_acc = self.test(model, va_loader, True, False)
127 |                 va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0
128 |
129 |                 t2 = time()
130 |                 logger.info('\n'+'='*20 +f' evaluation at epoch: {epoch} iteration: {_iter} ' \
131 |                     +'='*20)
132 |                 logger.info(f'test acc: {va_acc:.3f}%, test adv acc: {va_adv_acc:.3f}%, spent: {t2-t1:.3f} s')
133 |                 logger.info('='*28+' end of evaluation '+'='*28+'\n')
134 |
135 |
136 |     def test(self, model, loader, adv_test=False, use_pseudo_label=False):
137 |         # if adv_test is False, adv_acc is returned as -1
138 |
139 |         total_acc = 0.0
140 |         num = 0
141 |         total_adv_acc = 0.0
142 |
143 |         with torch.no_grad():
144 |             for data, label in loader:
145 |                 data, label = tensor2cuda(data), tensor2cuda(label)
146 |
147 |                 output = model(data, _eval=True)
148 |
149 |                 pred = torch.max(output, dim=1)[1]
150 |                 te_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy(), 'sum')
151 |
152 |                 total_acc += te_acc
153 |                 num += output.shape[0]
154 |
155 |                 if adv_test:
156 |                     # attack the predicted label if use_pseudo_label, else the true label
157 |                     with torch.enable_grad():
158 |                         adv_data = self.attack.perturb(data,
159 |                                                        pred if use_pseudo_label else label,
160 |                                                        'mean',
161 |                                                        False)
162 |
163 |                     adv_output = model(adv_data, _eval=True)
164 |
165 |                     adv_pred = torch.max(adv_output, dim=1)[1]
166 |                     adv_acc = evaluate(adv_pred.cpu().numpy(), label.cpu().numpy(), 'sum')
167 |                     total_adv_acc += adv_acc
168 |                 else:
169 |                     total_adv_acc = -num
170 |
171 |         return total_acc / num , total_adv_acc / num
172 |
173 | def main(args):
174 |
175 |     save_folder = '%s_%s' % (args.dataset, args.affix)
176 |
177 |     log_folder = os.path.join(args.log_root, save_folder)
178 |     model_folder = os.path.join(args.model_root, save_folder)
179 |
180 |     makedirs(log_folder)
181 |     makedirs(model_folder)
182 |
183 |     setattr(args, 'log_folder', log_folder)
184 |     setattr(args, 'model_folder', model_folder)
185 |
186 |     logger = create_logger(log_folder, args.todo, 'info')
187 |
188 |     print_args(args, logger)
189 |
190 |     model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
191 |
192 |     attack = FastGradientSignUntargeted(model,
193 |                                         args.epsilon,
194 |                                         args.alpha,
195 |                                         min_val=0,
196 |                                         max_val=1,
197 |                                         max_iters=args.k,
198 |                                         _type=args.perturbation_type)
199 |
200 |     if torch.cuda.is_available():
201 |         model.cuda()
202 |
203 |     trainer = Trainer(args, logger, attack)
204 |
205 |     if args.todo == 'train':
206 |         transform_train = tv.transforms.Compose([
207 |                 tv.transforms.RandomCrop(32, padding=4, fill=0, padding_mode='constant'),
208 |                 tv.transforms.RandomHorizontalFlip(),
209 |                 tv.transforms.ToTensor(),
210 |             ])
211 |         tr_dataset = tv.datasets.CIFAR10(args.data_root,
212 |                                        train=True,
213 |                                        transform=transform_train,
214 |                                        download=True)
215 |
216 |         tr_loader = DataLoader(tr_dataset, batch_size=args.batch_size, shuffle=True, num_workers=4)
217 |
218 |         # evaluation during training
219 |         te_dataset = tv.datasets.CIFAR10(args.data_root,
220 |                                        train=False,
221 |                                        transform=tv.transforms.ToTensor(),
222 |                                        download=True)
223 |
224 |         te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
225 |
226 |         trainer.train(model, tr_loader, te_loader, args.adv_train)
227 |     elif args.todo == 'test':
228 |         te_dataset = tv.datasets.CIFAR10(args.data_root,
229 |                                        train=False,
230 |                                        transform=tv.transforms.ToTensor(),
231 |                                        download=True)
232 |
233 |         te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
234 |
235 |         checkpoint = torch.load(args.load_checkpoint)
236 |         model.load_state_dict(checkpoint)
237 |
238 |         std_acc, adv_acc = trainer.test(model, te_loader, adv_test=True, use_pseudo_label=False)
239 |
240 |         print(f"std acc: {std_acc * 100:.3f}%, adv_acc: {adv_acc * 100:.3f}%")
241 |
242 |     else:
243 |         raise NotImplementedError
244 |
245 |
246 |
247 |
248 | if __name__ == '__main__':
249 |     args = parser()
250 |
251 |     os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
252 |
253 |     main(args)
--------------------------------------------------------------------------------
/cifar-10/src/argument.py:
--------------------------------------------------------------------------------
1 | import argparse
2 |
3 | def parser():
4 |     parser = argparse.ArgumentParser(description='Adversarial training on CIFAR-10')
5 |     parser.add_argument('--todo', choices=['train', 'valid', 'test', 'visualize'], default='train',
6 |         help='what to do: train | valid | test | visualize')
7 |     parser.add_argument('--dataset', default='cifar-10', help='which dataset to use')
8 |     parser.add_argument('--data_root', default='/home/yilin/Data',
9 |         help='the directory to save the dataset')
10 |     parser.add_argument('--log_root', default='log',
11 |         help='the directory to save the logs or other information (e.g. images)')
12 |     parser.add_argument('--model_root', default='checkpoint', help='the directory to save the models')
13 |     parser.add_argument('--load_checkpoint', default='./model/default/model.pth')
14 |     parser.add_argument('--affix', default='default', help='the affix for the save folder')
15 |
16 |     # parameters for generating adversarial examples
17 |     parser.add_argument('--epsilon', '-e', type=float, default=0.0157,
18 |         help='maximum perturbation of adversaries (4/255=0.0157)')
19 |     parser.add_argument('--alpha', '-a', type=float, default=0.00784,
20 |         help='movement multiplier per iteration when generating adversarial examples (2/255=0.00784)')
21 |     parser.add_argument('--k', '-k', type=int, default=10,
22 |         help='maximum number of iterations when generating adversarial examples')
23 |
24 |     parser.add_argument('--batch_size', '-b', type=int, default=128, help='batch size')
25 |     parser.add_argument('--max_epoch', '-m_e', type=int, default=200,
26 |         help='the maximum number of times the model sees each sample')
27 |     parser.add_argument('--learning_rate', '-lr', type=float, default=0.1, help='learning rate')
28 |     parser.add_argument('--momentum', '-m', type=float, default=0.9, help='momentum for optimizer')
29 |     parser.add_argument('--weight_decay', '-w', type=float, default=2e-4,
30 |         help='the coefficient of the l2 penalty on the weights')
31 |
32 |     parser.add_argument('--gpu', '-g', default='0', help='which gpu to use')
33 |     parser.add_argument('--n_eval_step', type=int, default=100,
34 |         help='number of iterations between evaluations')
35 |     parser.add_argument('--n_checkpoint_step', type=int, default=4000,
36 |         help='number of iterations between checkpoints')
37 |     parser.add_argument('--n_store_image_step', type=int, default=4000,
38 |         help='number of iterations between saved adversarial images')
39 |     parser.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf',
40 |         help='the type of the perturbation (linf or l2)')
41 |
42 |     parser.add_argument('--adv_train', action='store_true')
43 |
44 |     return parser.parse_args()
45 |
46 | def print_args(args, logger=None):
47 |     for k, v in vars(args).items():
48 |         if logger is not None:
49 |             logger.info('{:<16} : {}'.format(k, v))
50 |         else:
51 |             print('{:<16} : {}'.format(k, v))
--------------------------------------------------------------------------------
/cifar-10/src/attack/__init__.py:
--------------------------------------------------------------------------------
1 | from .fast_gradient_sign_untargeted import FastGradientSignUntargeted
--------------------------------------------------------------------------------
/cifar-10/src/attack/fast_gradient_sign_untargeted.py:
--------------------------------------------------------------------------------
1 | """
2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks
3 |
4 | original author: Utku Ozbulak - github.com/utkuozbulak
5 | """
6 | import sys
7 | sys.path.append("..")
8 |
9 | import os
10 | import numpy as np
11 |
12 | import torch
13 | from torch import nn
14 | import torch.nn.functional as F
15 |
16 | from src.utils import tensor2cuda
17 |
18 | def project(x, original_x, epsilon, _type='linf'):
19 |
20 |     if _type == 'linf':
21 |         max_x = original_x + epsilon
22 |         min_x = original_x - epsilon
23 |
24 |         x = torch.max(torch.min(x, max_x), min_x)
25 |
26 |     elif _type == 'l2':
27 |         dist = (x - original_x)
28 |
29 |         dist = dist.view(x.shape[0], -1)
30 |
31 |         dist_norm = torch.norm(dist, dim=1, keepdim=True)
32 |
33 |         mask = (dist_norm > epsilon).unsqueeze(2).unsqueeze(3)
34 |
35 |         # dist = F.normalize(dist, p=2, dim=1)
36 |
37 |         dist = dist / dist_norm.clamp(min=1e-12)  # avoid 0/0 -> NaN when x == original_x
38 |
39 |         dist *= epsilon
40 |
41 |         dist = dist.view(x.shape)
42 |
43 |         x = (original_x + dist) * mask.float() + x * (1 - mask.float())
44 |
45 |     else:
46 |         raise NotImplementedError
47 |
48 |     return x
49 |
50 | class FastGradientSignUntargeted():
51 |     """
52 |     Fast gradient sign untargeted adversarial attack, minimizes the initial class activation
53 |     with iterative grad sign updates
54 |     """
55 |     def __init__(self, model, epsilon, alpha, min_val, max_val, max_iters, _type='linf'):
56 |         self.model = model
57 |         # self.model.eval()
58 |
59 |         # Maximum perturbation
60 |         self.epsilon = epsilon
61 |         # Movement multiplier per iteration
62 |         self.alpha = alpha
63 |         # Minimum value of the pixels
64 |         self.min_val = min_val
65 |         # Maximum value of the pixels
66 |         self.max_val = max_val
67 |         # Maximum number of iterations used to generate adversaries
68 |         self.max_iters = max_iters
69 |         # The norm type of the perturbation ('linf' or 'l2')
70 |         self._type = _type
71 |
72 |     def perturb(self, original_images, labels, reduction4loss='mean', random_start=False):
73 |         # original_images: values are within self.min_val and self.max_val
74 |
75 |         # With random_start, the attack begins from a random point close to the original data
76 |         if random_start:
77 |             rand_perturb = torch.FloatTensor(original_images.shape).uniform_(
78 |                 -self.epsilon, self.epsilon)
79 |             rand_perturb = tensor2cuda(rand_perturb)
80 |             x = original_images + rand_perturb
81 |             x.clamp_(self.min_val, self.max_val)
82 |         else:
83 |             x = original_images.clone()
84 |
85 |         x.requires_grad = True
86 |
87 |         # max_x = original_images + self.epsilon
88 |         # min_x = original_images - self.epsilon
89 |
90 |         self.model.eval()
91 |
92 |         with torch.enable_grad():
93 |             for _iter in range(self.max_iters):
94 |                 outputs = self.model(x, _eval=True)
95 |
96 |                 loss = F.cross_entropy(outputs, labels, reduction=reduction4loss)
97 |
98 |                 if reduction4loss == 'none':
99 |                     grad_outputs = tensor2cuda(torch.ones(loss.shape))
100 |
101 |                 else:
102 |                     grad_outputs = None
103 |
104 |                 grads = torch.autograd.grad(loss, x, grad_outputs=grad_outputs,
105 |                         only_inputs=True)[0]
106 |
107 |                 x.data += self.alpha * torch.sign(grads.data)
108 |
109 |                 # the adversaries must stay within epsilon of the original images
110 |                 # under the l_infinity / l2 constraint
111 |                 x = project(x, original_images, self.epsilon, self._type)
112 |                 # the adversaries must also be valid pixel values
113 |                 x.clamp_(self.min_val, self.max_val)
114 |
115 |         self.model.train()
116 |
117 |         return x
118 |
--------------------------------------------------------------------------------
/cifar-10/src/model/__init__.py:
--------------------------------------------------------------------------------
1 | from .model import *
--------------------------------------------------------------------------------
/cifar-10/src/model/madry_model.py:
--------------------------------------------------------------------------------
1 | # code adapted from https://github.com/xternalz/WideResNet-pytorch/blob/master/wideresnet.py
2 | # original author: xternalz
3 |
4 | import math
5 | import torch
6 | import torch.nn as nn
7 | import torch.nn.functional as F
8 |
9 | from src.utils import count_parameters
10 |
11 | class Expression(nn.Module):
12 |     def __init__(self, func):
13 |         super(Expression, self).__init__()
14 |         self.func = func
15 |
16 |     def forward(self, input):
17 |         return self.func(input)
18 |
19 | class Model(nn.Module):
20 |     def __init__(self, i_c=1, n_c=10):
21 |         super(Model, self).__init__()
22 |
23 |         self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
24 |         self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
25 |
26 |         self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
27 |         self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
28 |
29 |
30 |         self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
31 |         self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
32 |         self.fc2 = nn.Linear(1024, n_c)
33 |
34 |
35 |     def forward(self, x_i, _eval=False):
36 |
37 |         if _eval:
38 |             # switch to eval mode
39 |             self.eval()
40 |         else:
41 |             self.train()
42 |
43 |         x_o = self.conv1(x_i)
44 |         x_o = torch.relu(x_o)
45 |         x_o = self.pool1(x_o)
46 |
47 |         x_o = self.conv2(x_o)
48 |         x_o = torch.relu(x_o)
49 |         x_o = self.pool2(x_o)
50 |
51 |         x_o = self.flatten(x_o)
52 |
53 |         x_o = torch.relu(self.fc1(x_o))
54 |
55 |         self.train()
56 |
57 |         return self.fc2(x_o)
58 |
59 | class ChannelPadding(nn.Module):
60 |     def __init__(self, in_planes, out_planes):
61 |         super(ChannelPadding, self).__init__()
62 |
63 |         self.register_buffer("padding",
64 |                              torch.zeros((out_planes - in_planes) // 2).view(1, -1, 1, 1))
65 |
66 |     def forward(self, input):
67 |         assert len(input.size()) == 4, "only support for 4-D tensor for now"
68 |
69 |         padding = self.padding.expand(input.size(0), -1, input.size(2), input.size(3))
70 |
71 |         return torch.cat([padding, input, padding], dim=1)
72 |
73 | class BasicBlock(nn.Module):
74 |     def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
75 |         super(BasicBlock, self).__init__()
76 |         self.bn1 = nn.BatchNorm2d(in_planes)
77 |         self.relu1 = nn.LeakyReLU(0.1, inplace=True)
78 |         self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
79 |                                padding=1, bias=False)
80 |         self.bn2 = nn.BatchNorm2d(out_planes)
81 |         self.relu2 = nn.LeakyReLU(0.1, inplace=True)
82 |         self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
83 |                                padding=1, bias=False)
84 |         self.droprate = dropRate
85 |         self.equalInOut = (in_planes == out_planes)
86 |         # self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
87 |         #                        padding=0, bias=False) or None
88 |         self.poolpadShortcut = nn.Sequential(
89 |             nn.AvgPool2d(kernel_size=stride, stride=stride),
90 |             ChannelPadding(in_planes, out_planes)
91 |         )
92 |     def forward(self, x):
93 |         if not self.equalInOut:
94 |             x = self.relu1(self.bn1(x))
95 |         else:
96 |             out = self.relu1(self.bn1(x))
97 |         out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
98 |         if self.droprate > 0:
99 |             out = F.dropout(out, p=self.droprate, training=self.training)
100 |         out = self.conv2(out)
101 |         # return torch.add(x if self.equalInOut else self.convShortcut(x), out)
102 |         return torch.add(
103 |             x if self.equalInOut else self.poolpadShortcut(x),
104 |             out
105 |         )
106 |
107 | class NetworkBlock(nn.Module):
108 |     def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
109 |         super(NetworkBlock, self).__init__()
110 |         self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
111 |     def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
112 |         layers = []
113 |         for i in range(int(nb_layers)):
114 |             layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
115 |         return nn.Sequential(*layers)
116 |     def forward(self, x):
117 |         return self.layer(x)
118 |
119 | class WideResNet(nn.Module):
120 |     def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
121 |         super(WideResNet, self).__init__()
122 |         nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
123 |         assert((depth - 4) % 6 == 0)
124 |         n = (depth - 4) // 6
125 |         block = BasicBlock
126 |         # 1st conv before any network block
127 |         self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
128 |                                padding=1, bias=False)
129 |         # 1st block
130 |         self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
131 |         # 2nd block
132 |         self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
133 |         # 3rd block
134 |         self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
135 |         # global average pooling and classifier
136 |         self.bn1 = nn.BatchNorm2d(nChannels[3])
137 |         self.relu = nn.LeakyReLU(0.1, inplace=True)
138 |         self.fc = nn.Linear(nChannels[3], num_classes)
139 |         self.nChannels = nChannels[3]
140 |
141 |         for m in self.modules():
142 |             if isinstance(m, nn.Conv2d):
143 |                 n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
144 |                 m.weight.data.normal_(0, math.sqrt(2. / n))
145 |             elif isinstance(m, nn.BatchNorm2d):
146 |                 m.weight.data.fill_(1)
147 |                 m.bias.data.zero_()
148 |             elif isinstance(m, nn.Linear):
149 |                 m.bias.data.zero_()
150 |
151 |     def forward(self, x, _eval=False):
152 |         if _eval:
153 |             # switch to eval mode
154 |             self.eval()
155 |         else:
156 |             self.train()
157 |
158 |         out = self.conv1(x)
159 |         out = self.block1(out)
160 |         out = self.block2(out)
161 |         out = self.block3(out)
162 |         out = self.relu(self.bn1(out))
163 |         out = F.avg_pool2d(out, 8)
164 |         out = out.view(-1, self.nChannels)
165 |
166 |         self.train()
167 |
168 |         return self.fc(out)
169 |
170 |
171 | if __name__ == '__main__':
172 |     i = torch.FloatTensor(4, 3, 32, 32)
173 |
174 |     n = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
175 |
176 |     i = i.cuda()
177 |     n = n.cuda()
178 |
179 |     print(n(i).size())
180 |
181 |     print(count_parameters(n))
182 |
183 |
--------------------------------------------------------------------------------
/cifar-10/src/model/model.py:
--------------------------------------------------------------------------------
1 | # WideResNet code imported from https://github.com/xternalz/WideResNet-pytorch/blob/master/wideresnet.py
2 | # original author: xternalz
3 |
4 | import math
5 | import torch
6 | import torch.nn as nn
7 | import torch.nn.functional as F
8 |
9 | from src.utils import count_parameters
10 |
11 | class Expression(nn.Module):
12 | def __init__(self, func):
13 | super(Expression, self).__init__()
14 | self.func = func
15 |
16 | def forward(self, input):
17 | return self.func(input)
18 |
19 | class Model(nn.Module):
20 | def __init__(self, i_c=1, n_c=10):
21 | super(Model, self).__init__()
22 |
23 | self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
24 | self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
25 |
26 | self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
27 | self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
28 |
29 |
30 | self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
31 | self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
32 | self.fc2 = nn.Linear(1024, n_c)
33 |
34 |
35 | def forward(self, x_i, _eval=False):
36 |
37 | if _eval:
38 | # switch to eval mode
39 | self.eval()
40 | else:
41 | self.train()
42 |
43 | x_o = self.conv1(x_i)
44 | x_o = torch.relu(x_o)
45 | x_o = self.pool1(x_o)
46 |
47 | x_o = self.conv2(x_o)
48 | x_o = torch.relu(x_o)
49 | x_o = self.pool2(x_o)
50 |
51 | x_o = self.flatten(x_o)
52 |
53 | x_o = torch.relu(self.fc1(x_o))
54 |
55 | self.train()
56 |
57 | return self.fc2(x_o)
58 |
59 |
60 |
61 |
62 |
63 | class BasicBlock(nn.Module):
64 | def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
65 | super(BasicBlock, self).__init__()
66 | self.bn1 = nn.BatchNorm2d(in_planes)
67 | self.relu1 = nn.LeakyReLU(0.1, inplace=True)
68 | self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
69 | padding=1, bias=False)
70 | self.bn2 = nn.BatchNorm2d(out_planes)
71 | self.relu2 = nn.LeakyReLU(0.1, inplace=True)
72 | self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
73 | padding=1, bias=False)
74 | self.droprate = dropRate
75 | self.equalInOut = (in_planes == out_planes)
76 | self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
77 | padding=0, bias=False) or None
78 | def forward(self, x):
79 | if not self.equalInOut:
80 | x = self.relu1(self.bn1(x))
81 | else:
82 | out = self.relu1(self.bn1(x))
83 | out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
84 | if self.droprate > 0:
85 | out = F.dropout(out, p=self.droprate, training=self.training)
86 | out = self.conv2(out)
87 | return torch.add(x if self.equalInOut else self.convShortcut(x), out)
88 |
89 | class NetworkBlock(nn.Module):
90 | def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
91 | super(NetworkBlock, self).__init__()
92 | self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
93 | def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
94 | layers = []
95 | for i in range(int(nb_layers)):
96 | layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
97 | return nn.Sequential(*layers)
98 | def forward(self, x):
99 | return self.layer(x)
100 |
101 | class WideResNet(nn.Module):
102 | def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
103 | super(WideResNet, self).__init__()
104 | nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
105 | assert((depth - 4) % 6 == 0)
106 | n = (depth - 4) // 6  # integer number of blocks per group
107 | block = BasicBlock
108 | # 1st conv before any network block
109 | self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
110 | padding=1, bias=False)
111 | # 1st block
112 | self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
113 | # 2nd block
114 | self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
115 | # 3rd block
116 | self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
117 | # global average pooling and classifier
118 | self.bn1 = nn.BatchNorm2d(nChannels[3])
119 | self.relu = nn.LeakyReLU(0.1, inplace=True)
120 | self.fc = nn.Linear(nChannels[3], num_classes)
121 | self.nChannels = nChannels[3]
122 |
123 | for m in self.modules():
124 | if isinstance(m, nn.Conv2d):
125 | n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
126 | m.weight.data.normal_(0, math.sqrt(2. / n))
127 | elif isinstance(m, nn.BatchNorm2d):
128 | m.weight.data.fill_(1)
129 | m.bias.data.zero_()
130 | elif isinstance(m, nn.Linear):
131 | m.bias.data.zero_()
132 |
133 | def forward(self, x, _eval=False):
134 | if _eval:
135 | # switch to eval mode
136 | self.eval()
137 | else:
138 | self.train()
139 |
140 | out = self.conv1(x)
141 | out = self.block1(out)
142 | out = self.block2(out)
143 | out = self.block3(out)
144 | out = self.relu(self.bn1(out))
145 | out = F.avg_pool2d(out, 8)
146 | out = out.view(-1, self.nChannels)
147 |
148 | self.train()
149 |
150 | return self.fc(out)
151 |
152 |
153 | if __name__ == '__main__':
154 | i = torch.FloatTensor(4, 3, 32, 32)
155 |
156 | n = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
157 |
158 | # print(n(i).size())
159 |
160 | print(count_parameters(n))
161 |
162 |
--------------------------------------------------------------------------------
/cifar-10/src/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .utils import *
--------------------------------------------------------------------------------
/cifar-10/src/utils/utils.py:
--------------------------------------------------------------------------------
1 | import os
2 | import json
3 | import logging
4 |
5 | import numpy as np
6 |
7 | import torch
8 |
9 | class LabelDict():
10 | def __init__(self, dataset='cifar-10'):
11 | self.dataset = dataset
12 | if dataset == 'cifar-10':
13 | self.label_dict = {0: 'airplane', 1: 'automobile', 2: 'bird', 3: 'cat',
14 | 4: 'deer', 5: 'dog', 6: 'frog', 7: 'horse',
15 | 8: 'ship', 9: 'truck'}
16 |
17 | self.class_dict = {v: k for k, v in self.label_dict.items()}
18 |
19 | def label2class(self, label):
20 | assert label in self.label_dict, 'the label %d is not in %s' % (label, self.dataset)
21 | return self.label_dict[label]
22 |
23 | def class2label(self, _class):
24 | assert isinstance(_class, str)
25 | assert _class in self.class_dict, 'the class %s is not in %s' % (_class, self.dataset)
26 | return self.class_dict[_class]
27 |
28 | def list2cuda(_list):
29 | array = np.array(_list)
30 | return numpy2cuda(array)
31 |
32 | def numpy2cuda(array):
33 | tensor = torch.from_numpy(array)
34 |
35 | return tensor2cuda(tensor)
36 |
37 | def tensor2cuda(tensor):
38 | if torch.cuda.is_available():
39 | tensor = tensor.cuda()
40 |
41 | return tensor
42 |
43 | def one_hot(ids, n_class):
44 | # ---------------------
45 | # author:ke1th
46 | # source:CSDN
47 | # article:https://blog.csdn.net/u012436149/article/details/77017832
48 | """
49 | ids: (list, ndarray) shape:[batch_size]
50 | out_tensor:FloatTensor shape:[batch_size, depth]
51 | """
52 |
53 | assert len(ids.shape) == 1, 'the ids should be 1-D'
54 | # ids = torch.LongTensor(ids).view(-1,1)
55 |
56 | out_tensor = torch.zeros(len(ids), n_class)
57 |
58 | out_tensor.scatter_(1, ids.cpu().unsqueeze(1), 1.)
59 |
60 | return out_tensor
61 |
62 | def evaluate(_input, _target, method='mean'):
63 | correct = (_input == _target).astype(np.float32)
64 | if method == 'mean':
65 | return correct.mean()
66 | else:
67 | return correct.sum()
68 |
69 |
70 | def create_logger(save_path='', file_type='', level='debug'):
71 |
72 | if level == 'debug':
73 | _level = logging.DEBUG
74 | elif level == 'info':
75 | _level = logging.INFO
76 |
77 | logger = logging.getLogger()
78 | logger.setLevel(_level)
79 |
80 | cs = logging.StreamHandler()
81 | cs.setLevel(_level)
82 | logger.addHandler(cs)
83 |
84 | if save_path != '':
85 | file_name = os.path.join(save_path, file_type + '_log.txt')
86 | fh = logging.FileHandler(file_name, mode='w')
87 | fh.setLevel(_level)
88 |
89 | logger.addHandler(fh)
90 |
91 | return logger
92 |
93 | def makedirs(path):
94 | if not os.path.exists(path):
95 | os.makedirs(path)
96 |
97 | def load_model(model, file_name):
98 | model.load_state_dict(
99 | torch.load(file_name, map_location=lambda storage, loc: storage))
100 |
101 | def save_model(model, file_name):
102 | torch.save(model.state_dict(), file_name)
103 |
104 | def count_parameters(model):
105 | # copy from https://discuss.pytorch.org/t/how-do-i-check-the-number-of-parameters-of-a-model/4325/8
106 | # baldassarre.fe's reply
107 | return sum(p.numel() for p in model.parameters() if p.requires_grad)
--------------------------------------------------------------------------------
/cifar-10/src/visualization/__init__.py:
--------------------------------------------------------------------------------
1 | from .vanilla_backprop import VanillaBackprop
--------------------------------------------------------------------------------
/cifar-10/src/visualization/vanilla_backprop.py:
--------------------------------------------------------------------------------
1 | """
2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-visualizations
3 |
4 | original author: Utku Ozbulak - github.com/utkuozbulak
5 | """
6 |
7 | import sys
8 | sys.path.append("..")
9 |
10 | import torch
11 |
12 | from src.utils import tensor2cuda, one_hot
13 |
14 | class VanillaBackprop():
15 | """
16 | Produces gradients generated with vanilla back propagation from the image
17 | """
18 | def __init__(self, model):
19 | self.model = model
20 |
21 | def generate_gradients(self, input_image, target_class):
22 | # Put model in evaluation mode
23 | self.model.eval()
24 |
25 | x = input_image.clone()
26 |
27 | x.requires_grad = True
28 |
29 | with torch.enable_grad():
30 | # Forward
31 | model_output = self.model(x, _eval=True)  # forward() switches back to train mode unless _eval=True
32 | # Zero grads
33 | self.model.zero_grad()
34 |
35 | grad_outputs = one_hot(target_class, model_output.shape[1])
36 | grad_outputs = tensor2cuda(grad_outputs)
37 |
38 | grad = torch.autograd.grad(model_output, x, grad_outputs=grad_outputs,
39 | only_inputs=True)[0]
40 |
41 | self.model.train()
42 |
43 | return grad
44 |
--------------------------------------------------------------------------------
/cifar-10/train.sh:
--------------------------------------------------------------------------------
1 | python main.py --data_root '.' --affix std
2 | python main.py --data_root '.' -e 0.0157 -p 'linf' --adv_train --affix 'linf'
3 | python main.py --data_root '.' -e 0.314 -p 'l2' --adv_train --affix 'l2'
4 |
--------------------------------------------------------------------------------
/cifar-10/visualize.py:
--------------------------------------------------------------------------------
1 |
2 | import os
3 | import torch
4 | import torchvision as tv
5 | import numpy as np
6 |
7 | from torch.utils.data import DataLoader
8 |
9 | from src.utils import makedirs, tensor2cuda, load_model, LabelDict
10 | from src.argument import parser
11 | from src.visualization import VanillaBackprop
12 | from src.model.madry_model import WideResNet
13 |
14 | import matplotlib.pyplot as plt
15 |
16 | img_folder = 'img'
17 | makedirs(img_folder)
18 | out_num = 5
19 |
20 |
21 | args = parser()
22 |
23 | label_dict = LabelDict(args.dataset)
24 |
25 | te_dataset = tv.datasets.CIFAR10(args.data_root,
26 | train=False,
27 | transform=tv.transforms.ToTensor(),
28 | download=True)
29 |
30 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
31 |
32 |
33 | for data, label in te_loader:
34 |
35 | data, label = tensor2cuda(data), tensor2cuda(label)
36 |
37 |
38 | break
39 |
40 |
41 | model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
42 |
43 | load_model(model, args.load_checkpoint)
44 |
45 | if torch.cuda.is_available():
46 | model.cuda()
47 |
48 | VBP = VanillaBackprop(model)
49 |
50 | grad = VBP.generate_gradients(data, label)
51 |
52 | grad_flat = grad.view(grad.shape[0], -1)
53 | mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3)
54 | std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3)
55 |
56 | mean = mean.repeat(1, 1, data.shape[2], data.shape[3])
57 | std = std.repeat(1, 1, data.shape[2], data.shape[3])
58 |
59 | grad = torch.max(torch.min(grad, mean+3*std), mean-3*std)
60 |
61 | print(grad.min(), grad.max())
62 |
63 | grad -= grad.min()
64 |
65 | grad /= grad.max()
66 |
67 | grad = grad.cpu().numpy().squeeze() # (N, 3, 32, 32)
68 |
69 | grad *= 255.0
70 |
71 | label = label.cpu().numpy()
72 |
73 | data = data.cpu().numpy().squeeze()
74 |
75 | data *= 255.0
76 |
77 | out_list = [data, grad]
78 |
79 | types = ['Original', 'Your Model']
80 |
81 | fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num)
82 |
83 | axs = _axs
84 |
85 | for j, _type in enumerate(types):
86 | axs[j, 0].set_ylabel(_type)
87 |
88 | # if j == 0:
89 | # cmap = 'gray'
90 | # else:
91 | # cmap = 'seismic'
92 |
93 | for i in range(out_num):
94 | axs[j, i].set_xlabel('%s' % label_dict.label2class(label[i]))
95 | img = out_list[j][i]
96 | # print(img)
97 | img = np.transpose(img, (1, 2, 0))
98 |
99 | img = img.astype(np.uint8)
100 | axs[j, i].imshow(img)
101 |
102 | axs[j, i].get_xaxis().set_ticks([])
103 | axs[j, i].get_yaxis().set_ticks([])
104 |
105 | plt.tight_layout()
106 | plt.savefig(os.path.join(img_folder, 'cifar_grad_%s.jpg' % args.affix))
107 |
108 | # types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
109 |
110 |
111 | # model_checkpoints = ['checkpoint/cifar-10_std/checkpoint_76000.pth',
112 | # 'checkpoint/cifar-10_linf/checkpoint_76000.pth',
113 | # 'checkpoint/cifar-10_l2/checkpoint_76000.pth']
114 |
115 |
116 | # out_list = []
117 |
118 | # for checkpoint in model_checkpoints:
119 |
120 | # model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
121 |
122 | # load_model(model, checkpoint)
123 |
124 | # if torch.cuda.is_available():
125 | # model.cuda()
126 |
127 | # VBP = VanillaBackprop(model)
128 |
129 | # grad = VBP.generate_gradients(data, label)
130 |
131 | # grad_flat = grad.view(grad.shape[0], -1)
132 | # mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3)
133 | # std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3)
134 |
135 | # mean = mean.repeat(1, 1, data.shape[2], data.shape[3])
136 | # std = std.repeat(1, 1, data.shape[2], data.shape[3])
137 |
138 | # grad = torch.max(torch.min(grad, mean+3*std), mean-3*std)
139 |
140 | # print(grad.min(), grad.max())
141 |
142 | # grad -= grad.min()
143 |
144 | # grad /= grad.max()
145 |
146 | # grad = grad.cpu().numpy().squeeze() # (N, 28, 28)
147 |
148 | # grad *= 255.0
149 |
150 | # out_list.append(grad)
151 |
152 | # data = data.cpu().numpy().squeeze() # (N, 28, 28)
153 | # data *= 255.0
154 | # label = label.cpu().numpy()
155 |
156 | # out_list.insert(0, data)
157 |
158 | # # normalize the grad
159 | # # length = torch.norm(grad, dim=3)
160 | # # length = torch.norm(length, dim=2)
161 | # # length = length.unsqueeze(2).unsqueeze(2)
162 | # # grad /= (length + 1e-5)
163 |
164 | # out_num = 5
165 |
166 | # fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num)
167 |
168 | # axs = _axs
169 |
170 |
171 | # for j, _type in enumerate(types):
172 | # axs[j, 0].set_ylabel(_type)
173 |
174 | # # if j == 0:
175 | # # cmap = 'gray'
176 | # # else:
177 | # # cmap = 'seismic'
178 |
179 | # for i in range(out_num):
180 |
181 | # data_id = i + 0
182 |
183 | # axs[j, i].set_xlabel('%s' % label_dict.label2class(label[data_id]))
184 |
185 | # img = out_list[j][data_id]
186 | # # print(img)
187 | # img = np.transpose(img, (1, 2, 0))
188 |
189 | # img = img.astype(np.uint8)
190 | # axs[j, i].imshow(img)
191 |
192 | # axs[j, i].get_xaxis().set_ticks([])
193 | # axs[j, i].get_yaxis().set_ticks([])
194 |
195 | # plt.tight_layout()
196 | # plt.savefig(os.path.join(img_folder, 'cifar_grad_%s.jpg' % args.affix))
--------------------------------------------------------------------------------
/cifar-10/visualize_attack.py:
--------------------------------------------------------------------------------
1 |
2 | import os
3 | import torch
4 | import torchvision as tv
5 | import numpy as np
6 |
7 | from torch.utils.data import DataLoader
8 |
9 | from src.utils import makedirs, tensor2cuda, load_model, LabelDict
10 | from src.argument import parser
11 | from src.visualization import VanillaBackprop
12 | from src.attack import FastGradientSignUntargeted
13 | from src.model.madry_model import WideResNet
14 |
15 | import matplotlib.pyplot as plt
16 |
17 | max_epsilon = 4.7
18 |
19 | perturbation_type = 'l2'
20 |
21 | out_num = 5
22 |
23 | img_folder = 'img'
24 | makedirs(img_folder)
25 |
26 | args = parser()
27 |
28 | label_dict = LabelDict(args.dataset)
29 |
30 | te_dataset = tv.datasets.CIFAR10(args.data_root,
31 | train=False,
32 | transform=tv.transforms.ToTensor(),
33 | download=True)
34 |
35 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
36 |
37 |
38 | for data, label in te_loader:
39 |
40 | data, label = tensor2cuda(data), tensor2cuda(label)
41 |
42 |
43 | break
44 |
45 |
46 | adv_list = []
47 | pred_list = []
48 |
49 | with torch.no_grad():
50 |
51 | model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
52 |
53 | load_model(model, args.load_checkpoint)
54 |
55 | if torch.cuda.is_available():
56 | model.cuda()
57 |
58 | attack = FastGradientSignUntargeted(model,
59 | max_epsilon,
60 | args.alpha,
61 | min_val=0,
62 | max_val=1,
63 | max_iters=args.k,
64 | _type=perturbation_type)
65 |
66 |
67 | adv_data = attack.perturb(data, label, 'mean', False)
68 |
69 | output = model(adv_data, _eval=True)
70 | pred = torch.max(output, dim=1)[1]
71 | adv_list.append(adv_data.cpu().numpy().squeeze() * 255.0) # (N, 3, 32, 32)
72 | pred_list.append(pred.cpu().numpy())
73 |
74 | data = data.cpu().numpy().squeeze() # (N, 3, 32, 32)
75 | data *= 255.0
76 | label = label.cpu().numpy()
77 |
78 | adv_list.insert(0, data)
79 |
80 | pred_list.insert(0, label)
81 |
82 |
83 | types = ['Original', 'Your Model']
84 |
85 | fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num)
86 |
87 | axs = _axs
88 |
89 | for j, _type in enumerate(types):
90 | axs[j, 0].set_ylabel(_type)
91 |
92 | for i in range(out_num):
93 | axs[j, i].set_xlabel('%s' % label_dict.label2class(pred_list[j][i]))
94 | img = adv_list[j][i]
95 | # print(img)
96 | img = np.transpose(img, (1, 2, 0))
97 |
98 | img = img.astype(np.uint8)
99 | axs[j, i].imshow(img)
100 |
101 | axs[j, i].get_xaxis().set_ticks([])
102 | axs[j, i].get_yaxis().set_ticks([])
103 |
104 | plt.tight_layout()
105 | plt.savefig(os.path.join(img_folder, 'cifar_large_%s_%s.jpg' % (perturbation_type, args.affix)))
106 | # plt.savefig(os.path.join(img_folder, 'test_%s.jpg' % (args.affix)))
107 |
108 |
109 | # types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
110 |
111 |
112 | # model_checkpoints = ['checkpoint/cifar-10_std/checkpoint_76000.pth',
113 | # 'checkpoint/cifar-10_linf/checkpoint_76000.pth',
114 | # 'checkpoint/cifar-10_l2/checkpoint_76000.pth']
115 |
116 | # adv_list = []
117 | # pred_list = []
118 |
119 | # max_epsilon = 4
120 |
121 | # perturbation_type = 'l2'
122 |
123 | # with torch.no_grad():
124 | # for checkpoint in model_checkpoints:
125 |
126 | # model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
127 |
128 | # load_model(model, checkpoint)
129 |
130 | # if torch.cuda.is_available():
131 | # model.cuda()
132 |
133 | # attack = FastGradientSignUntargeted(model,
134 | # max_epsilon,
135 | # args.alpha,
136 | # min_val=0,
137 | # max_val=1,
138 | # max_iters=args.k,
139 | # _type=perturbation_type)
140 |
141 |
142 | # adv_data = attack.perturb(data, label, 'mean', False)
143 |
144 | # output = model(adv_data, _eval=True)
145 | # pred = torch.max(output, dim=1)[1]
146 | # adv_list.append(adv_data.cpu().numpy().squeeze() * 255.0) # (N, 28, 28)
147 | # pred_list.append(pred.cpu().numpy())
148 |
149 | # data = data.cpu().numpy().squeeze() # (N, 28, 28)
150 | # data *= 255.0
151 | # label = label.cpu().numpy()
152 |
153 | # adv_list.insert(0, data)
154 |
155 | # pred_list.insert(0, label)
156 |
157 | # out_num = 5
158 |
159 | # fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num)
160 |
161 | # axs = _axs
162 |
163 | # for j, _type in enumerate(types):
164 | # axs[j, 0].set_ylabel(_type)
165 |
166 | # for i in range(out_num):
167 | # axs[j, i].set_xlabel('%s' % label_dict.label2class(pred_list[j][i]))
168 | # img = adv_list[j][i]
169 | # # print(img)
170 | # img = np.transpose(img, (1, 2, 0))
171 |
172 | # img = img.astype(np.uint8)
173 | # axs[j, i].imshow(img)
174 |
175 | # axs[j, i].get_xaxis().set_ticks([])
176 | # axs[j, i].get_yaxis().set_ticks([])
177 |
178 | # plt.tight_layout()
179 | # plt.savefig(os.path.join(img_folder, 'cifar_large_%s_%s.jpg' % (perturbation_type, args.affix)))
--------------------------------------------------------------------------------
/mnist/README.md:
--------------------------------------------------------------------------------
1 | # Adversarial Training and Visualization on MNIST
2 |
3 |
4 | ## Results
5 |
6 | ### Learning Curves
7 |
8 | Epsilon is 0.3 for linf training and 1.5 for l2 training.
9 |
10 |
11 |
12 | | Training | Standard Accuracy (train/test) | Robustness Accuracy (train/test) |
13 | | --- | --- | --- |
14 | | Standard Training | 100.00/99.32 | 0.00/0.61 |
15 | | l_inf Training | 100.00/98.96 | 96.88/95.16 |
16 | | l_2 Training | 100.00/99.41 | 100.00/97.48 |
17 |
40 |
41 | Note that in testing mode, the target label used in creating the adversarial example is the most confident prediction of the model, not the ground truth. Therefore, sometimes the testing robustness is higher than training robustness, when the prediction is wrong at first.
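
The effect described in the note can be illustrated with a minimal, self-contained sketch. The `toy_attack` function and all label values here are hypothetical stand-ins chosen for illustration; they are not the repo's attack or its results. The sketch only assumes that an untargeted attack pushes the prediction away from whichever label it is given.

```python
def evaluate(preds, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


def toy_attack(target_labels, num_classes=10):
    """Hypothetical attack that always succeeds: it moves the prediction
    from the label it is given to the next class index."""
    return [(t + 1) % num_classes for t in target_labels]


labels = [3, 5, 7, 1]        # ground truth (toy values)
clean_preds = [3, 5, 6, 1]   # the third sample is misclassified on clean data

# Testing mode: the attack is driven by the model's own predictions.
adv_preds_from_pred = toy_attack(clean_preds)
# For comparison: the attack is driven by the ground-truth labels.
adv_preds_from_true = toy_attack(labels)

standard_acc = evaluate(clean_preds, labels)
robust_acc_pred = evaluate(adv_preds_from_pred, labels)
robust_acc_true = evaluate(adv_preds_from_true, labels)
```

In this toy case the third sample is pushed away from its wrong prediction and happens to land on the true class, so the robustness accuracy measured with predicted labels is higher than the one measured with ground-truth labels.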
42 |
43 | ### Visualization of Gradient with Respect to Input
44 |
45 | 
46 |
47 | ### The Adversarial Example with large epsilon
48 |
49 | The maximum epsilon is set to 4 (l2 norm) in this part.
50 |
51 | 
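
As a rough illustration of what an l2 budget of 4 means, the sketch below projects a perturbation back onto an l2 ball of radius epsilon. It is written in plain Python under the assumption that the constraint is the usual one, `||delta||_2 <= epsilon`; the function name `project_l2` is hypothetical and this is not the code in `fast_gradient_sign_untargeted.py`.

```python
import math


def project_l2(delta, epsilon):
    """Rescale a flat perturbation so that its l2 norm is at most epsilon."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= epsilon:
        # Already inside the ball: leave the perturbation unchanged.
        return list(delta)
    scale = epsilon / norm
    return [d * scale for d in delta]


delta = [3.0, 4.0, 0.0]                 # toy perturbation with l2 norm 5
projected = project_l2(delta, 4.0)      # shrunk onto the ball of radius 4
projected_norm = math.sqrt(sum(d * d for d in projected))
```

The direction of the perturbation is preserved and only its length is reduced, which is why a large l2 budget can change an image substantially while keeping the perturbation spread over many pixels.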
52 |
53 |
54 | ## Requirements:
55 | ```
56 | python >= 3.5
57 | torch == 1.0
58 | torchvision == 0.2.1
59 | numpy >= 1.16.1
60 | matplotlib >= 3.0.2
61 | ```
62 |
63 | ## Execution
64 |
65 | ### Training
66 |
67 | Standard training:
68 |
69 | ```
70 | python main.py --data_root [data directory]
71 | ```
72 |
73 | linf training:
74 |
75 | ```
76 | python main.py --data_root [data directory] -e 0.3 -p 'linf' --adv_train
77 | ```
78 |
79 | l2 training:
80 |
81 | ```
82 | python main.py --data_root [data directory] -e 1.5 -p 'l2' --adv_train
83 | ```
84 |
85 | ### Testing
86 |
87 | Change the `-e` and `-p` settings if you want to do linf testing.
88 | ```
89 | python main.py --todo test --data_root [data directory] -e 0.314 -p 'l2' --load_checkpoint [your_model.pth]
90 | ```
91 |
92 | ### Visualization
93 |
94 | Change the settings in `visualize.py` and `visualize_attack.py` if you want to do linf visualization.
95 |
96 | Visualize the gradient with respect to the input:
97 |
98 | ```
99 | python visualize.py --load_checkpoint [your_model.pth]
100 | ```
101 |
102 | Visualize adversarial examples with a larger epsilon:
103 |
104 | ```
105 | python visualize_attack.py --load_checkpoint [your_model.pth]
106 | ```
107 |
108 |
109 | ## Training Time
110 |
111 | Standard training: 0.64 s / 100 iterations
112 | Adversarial training: 16 s / 100 iterations
113 |
114 | These timings were measured with a batch size of 64 on an NVIDIA GeForce GTX 1080.
115 |
--------------------------------------------------------------------------------
/mnist/img/mnist_grad_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_grad_default.jpg
--------------------------------------------------------------------------------
/mnist/img/mnist_large_l2_.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_large_l2_.jpg
--------------------------------------------------------------------------------
/mnist/img/mnist_learning_curve_l2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_l2.jpg
--------------------------------------------------------------------------------
/mnist/img/mnist_learning_curve_linf.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_linf.jpg
--------------------------------------------------------------------------------
/mnist/img/mnist_learning_curve_std.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_std.jpg
--------------------------------------------------------------------------------
/mnist/main.py:
--------------------------------------------------------------------------------
1 | import os
2 | import torch
3 | import torch.nn as nn
4 | import torch.nn.functional as F
5 | from torch.utils.data import DataLoader
6 |
7 | import torchvision as tv
8 |
9 | from time import time
10 | from src.model import Model
11 | from src.attack import FastGradientSignUntargeted
12 | from src.utils import makedirs, create_logger, tensor2cuda, numpy2cuda, evaluate, save_model
13 |
14 | from src.argument import parser, print_args
15 |
16 | class Trainer():
17 | def __init__(self, args, logger, attack):
18 | self.args = args
19 | self.logger = logger
20 | self.attack = attack
21 |
22 | def standard_train(self, model, tr_loader, va_loader=None):
23 | self.train(model, tr_loader, va_loader, False)
24 |
25 | def adversarial_train(self, model, tr_loader, va_loader=None):
26 | self.train(model, tr_loader, va_loader, True)
27 |
28 | def train(self, model, tr_loader, va_loader=None, adv_train=False):
29 | args = self.args
30 | logger = self.logger
31 |
32 | opt = torch.optim.Adam(model.parameters(), args.learning_rate)
33 |
34 | _iter = 0
35 |
36 | begin_time = time()
37 |
38 | for epoch in range(1, args.max_epoch+1):
39 | for data, label in tr_loader:
40 | data, label = tensor2cuda(data), tensor2cuda(label)
41 |
42 | if adv_train:
43 | # When training, the adversarial example is created from a random
44 | # close point to the original data point. If in evaluation mode,
45 | # just start from the original data point.
46 | adv_data = self.attack.perturb(data, label, 'mean', True)
47 | output = model(adv_data, _eval=False)
48 | else:
49 | output = model(data, _eval=False)
50 |
51 | loss = F.cross_entropy(output, label)
52 |
53 | opt.zero_grad()
54 | loss.backward()
55 | opt.step()
56 |
57 | if _iter % args.n_eval_step == 0:
58 |
59 | if adv_train:
60 | with torch.no_grad():
61 | stand_output = model(data, _eval=True)
62 | pred = torch.max(stand_output, dim=1)[1]
63 |
64 | # print(pred)
65 | std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
66 |
67 | pred = torch.max(output, dim=1)[1]
68 | # print(pred)
69 | adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
70 |
71 | else:
72 | adv_data = self.attack.perturb(data, label, 'mean', False)
73 |
74 | with torch.no_grad():
75 | adv_output = model(adv_data, _eval=True)
76 | pred = torch.max(adv_output, dim=1)[1]
77 | # print(label)
78 | # print(pred)
79 | adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
80 |
81 | pred = torch.max(output, dim=1)[1]
82 | # print(pred)
83 | std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
84 |
85 | # only calculating the training time
86 | logger.info('epoch: %d, iter: %d, spent %.2f s, tr_loss: %.3f' % (
87 | epoch, _iter, time() - begin_time, loss.item()))
88 |
89 | logger.info('standard acc: %.3f %%, robustness acc: %.3f %%' % (
90 | std_acc, adv_acc))
91 |
92 | if va_loader is not None:
93 | va_acc, va_adv_acc = self.test(model, va_loader, True)
94 | va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0
95 |
96 | logger.info('\n' + '='*30 + ' evaluation ' + '='*30)
97 | logger.info('test acc: %.3f %%, test adv acc: %.3f %%' % (
98 | va_acc, va_adv_acc))
99 | logger.info('='*28 + ' end of evaluation ' + '='*28 + '\n')
100 |
101 | begin_time = time()
102 |
103 | if _iter % args.n_store_image_step == 0:
104 | tv.utils.save_image(torch.cat([data.cpu(), adv_data.cpu()], dim=0),
105 | os.path.join(args.log_folder, 'images_%d.jpg' % _iter),
106 | nrow=16)
107 |
108 |
109 | if _iter % args.n_checkpoint_step == 0:
110 | file_name = os.path.join(args.model_folder, 'checkpoint_%d.pth' % _iter)
111 | save_model(model, file_name)
112 |
113 | _iter += 1
114 |
115 | def test(self, model, loader, adv_test=False):
116 | # if adv_test is False, the returned adversarial accuracy is -1
117 |
118 | total_acc = 0.0
119 | num = 0
120 | total_adv_acc = 0.0
121 |
122 | with torch.no_grad():
123 | for data, label in loader:
124 | data, label = tensor2cuda(data), tensor2cuda(label)
125 |
126 | output = model(data, _eval=True)
127 |
128 | pred = torch.max(output, dim=1)[1]
129 | te_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy(), 'sum')
130 |
131 | total_acc += te_acc
132 | num += output.shape[0]
133 |
134 | if adv_test:
135 | # use predicted label as target label
136 | # with torch.enable_grad():
137 | adv_data = self.attack.perturb(data, pred, 'mean', False)
138 |
139 | adv_output = model(adv_data, _eval=True)
140 |
141 | adv_pred = torch.max(adv_output, dim=1)[1]
142 | adv_acc = evaluate(adv_pred.cpu().numpy(), label.cpu().numpy(), 'sum')
143 | total_adv_acc += adv_acc
144 | else:
145 | total_adv_acc = -num
146 |
147 | return total_acc / num , total_adv_acc / num
148 |
149 | def main(args):
150 |
151 | save_folder = '%s_%s' % (args.dataset, args.affix)
152 |
153 | log_folder = os.path.join(args.log_root, save_folder)
154 | model_folder = os.path.join(args.model_root, save_folder)
155 |
156 | makedirs(log_folder)
157 | makedirs(model_folder)
158 |
159 | setattr(args, 'log_folder', log_folder)
160 | setattr(args, 'model_folder', model_folder)
161 |
162 | logger = create_logger(log_folder, args.todo, 'info')
163 |
164 | print_args(args, logger)
165 |
166 | model = Model(i_c=1, n_c=10)
167 |
168 | attack = FastGradientSignUntargeted(model,
169 | args.epsilon,
170 | args.alpha,
171 | min_val=0,
172 | max_val=1,
173 | max_iters=args.k,
174 | _type=args.perturbation_type)
175 |
176 | if torch.cuda.is_available():
177 | model.cuda()
178 |
179 | trainer = Trainer(args, logger, attack)
180 |
181 | if args.todo == 'train':
182 | tr_dataset = tv.datasets.MNIST(args.data_root,
183 | train=True,
184 | transform=tv.transforms.ToTensor(),
185 | download=True)
186 |
187 | tr_loader = DataLoader(tr_dataset, batch_size=args.batch_size, shuffle=True, num_workers=4)
188 |
189 | # evaluation during training
190 | te_dataset = tv.datasets.MNIST(args.data_root,
191 | train=False,
192 | transform=tv.transforms.ToTensor(),
193 | download=True)
194 |
195 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
196 |
197 | trainer.train(model, tr_loader, te_loader, args.adv_train)
198 | elif args.todo == 'test':
199 | pass
200 | else:
201 | raise NotImplementedError
202 |
203 |
204 |
205 |
206 | if __name__ == '__main__':
207 | args = parser()
208 |
209 | os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
210 |
211 | main(args)
--------------------------------------------------------------------------------
/mnist/src/argument.py:
--------------------------------------------------------------------------------
1 | import argparse
2 |
3 | def parser():
4 | parser = argparse.ArgumentParser(description='Adversarial Training')
5 | parser.add_argument('--todo', choices=['train', 'valid', 'test', 'visualize'], default='train',
6 | help='which action to run: train | valid | test | visualize')
7 | parser.add_argument('--dataset', default='mnist', help='which dataset to use')
8 | parser.add_argument('--data_root', default='/home/yilin/Data',
9 | help='the directory to save the dataset')
10 | parser.add_argument('--log_root', default='log',
11 | help='the directory to save the logs and other information (e.g. images)')
12 | parser.add_argument('--model_root', default='checkpoint', help='the directory to save the models')
13 | parser.add_argument('--load_checkpoint', default='./model/default/model.pth')
14 | parser.add_argument('--affix', default='', help='the affix for the save folder')
15 |
16 | # parameters for generating adversarial examples
17 | parser.add_argument('--epsilon', '-e', type=float, default=0.3,
18 | help='maximum perturbation of adversaries')
19 | parser.add_argument('--alpha', '-a', type=float, default=0.01,
20 | help='movement multiplier per iteration when generating adversarial examples')
21 | parser.add_argument('--k', '-k', type=int, default=40,
22 | help='maximum number of iterations when generating adversarial examples')
23 |
24 |
25 |
26 | parser.add_argument('--batch_size', '-b', type=int, default=64, help='batch size')
27 | parser.add_argument('--max_epoch', '-m_e', type=int, default=60,
28 | help='the maximum number of training epochs (passes over the training set)')
29 | parser.add_argument('--learning_rate', '-lr', type=float, default=1e-4, help='learning rate')
30 |
31 | parser.add_argument('--gpu', '-g', default='0', help='which gpu to use')
32 | parser.add_argument('--n_eval_step', type=int, default=100,
33 | help='number of iterations between evaluations')
34 | parser.add_argument('--n_checkpoint_step', type=int, default=2000,
35 | help='number of iterations between saved checkpoints')
36 | parser.add_argument('--n_store_image_step', type=int, default=2000,
37 | help='number of iterations between saved adversarial images')
38 | parser.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf',
39 | help='the type of the perturbation (linf or l2)')
40 |
41 | parser.add_argument('--adv_train', action='store_true')
42 |
43 | return parser.parse_args()
44 |
45 | def print_args(args, logger=None):
46 | for k, v in vars(args).items():
47 | if logger is not None:
48 | logger.info('{:<16} : {}'.format(k, v))
49 | else:
50 | print('{:<16} : {}'.format(k, v))
--------------------------------------------------------------------------------
/mnist/src/attack/__init__.py:
--------------------------------------------------------------------------------
1 | from .fast_gradient_sign_untargeted import FastGradientSignUntargeted
--------------------------------------------------------------------------------
/mnist/src/attack/fast_gradient_sign_untargeted.py:
--------------------------------------------------------------------------------
1 | """
2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks
3 |
4 | original author: Utku Ozbulak - github.com/utkuozbulak
5 | """
6 | import sys
7 | sys.path.append("..")
8 |
9 | import os
10 | import numpy as np
11 |
12 | import torch
13 | from torch import nn
14 | import torch.nn.functional as F
15 |
16 | from utils import tensor2cuda
17 |
18 | def project(x, original_x, epsilon, _type='linf'):
19 |
20 | if _type == 'linf':
21 | max_x = original_x + epsilon
22 | min_x = original_x - epsilon
23 |
24 | x = torch.max(torch.min(x, max_x), min_x)
25 |
26 | elif _type == 'l2':
27 | dist = (x - original_x)
28 |
29 | dist = dist.view(x.shape[0], -1)
30 |
31 | dist_norm = torch.norm(dist, dim=1, keepdim=True)
32 |
33 | mask = (dist_norm > epsilon).unsqueeze(2).unsqueeze(3)
34 |
35 | # dist = F.normalize(dist, p=2, dim=1)
36 |
37 | dist = dist / (dist_norm + 1e-12)  # small constant avoids 0/0 -> NaN for unperturbed samples
38 |
39 | dist *= epsilon
40 |
41 | dist = dist.view(x.shape)
42 |
43 | x = (original_x + dist) * mask.float() + x * (1 - mask.float())
44 |
45 | else:
46 | raise NotImplementedError
47 |
48 | return x
49 |
50 | class FastGradientSignUntargeted():
51 | """
52 | Fast gradient sign untargeted adversarial attack, minimizes the initial class activation
53 | with iterative grad sign updates
54 | """
55 | def __init__(self, model, epsilon, alpha, min_val, max_val, max_iters, _type='linf'):
56 | self.model = model
57 | # self.model.eval()
58 |
59 | # Maximum perturbation
60 | self.epsilon = epsilon
61 | # Movement multiplier per iteration
62 | self.alpha = alpha
63 | # Minimum value of the pixels
64 | self.min_val = min_val
65 | # Maximum value of the pixels
66 | self.max_val = max_val
67 | # Maximum number of iterations used to generate adversaries
68 | self.max_iters = max_iters
69 | # The norm type of the perturbation ('linf' or 'l2')
70 | self._type = _type
71 |
72 | def perturb(self, original_images, labels, reduction4loss='mean', random_start=False):
73 | # original_images: values are within self.min_val and self.max_val
74 |
75 | # Start the adversaries from random points close to the original data
76 | if random_start:
77 | rand_perturb = torch.FloatTensor(original_images.shape).uniform_(
78 | -self.epsilon, self.epsilon)
79 | rand_perturb = tensor2cuda(rand_perturb)
80 | x = original_images + rand_perturb
81 | x.clamp_(self.min_val, self.max_val)
82 | else:
83 | x = original_images.clone()
84 |
85 | x.requires_grad = True
86 |
87 | # max_x = original_images + self.epsilon
88 | # min_x = original_images - self.epsilon
89 |
90 | with torch.enable_grad():
91 | for _iter in range(self.max_iters):
92 | outputs = self.model(x, _eval=True)
93 |
94 | loss = F.cross_entropy(outputs, labels, reduction=reduction4loss)
95 |
96 | if reduction4loss == 'none':
97 | grad_outputs = tensor2cuda(torch.ones(loss.shape))
98 |
99 | else:
100 | grad_outputs = None
101 |
102 | grads = torch.autograd.grad(loss, x, grad_outputs=grad_outputs,
103 | only_inputs=True)[0]
104 |
105 | x.data += self.alpha * torch.sign(grads.data)
106 |
107 | # the adversaries must stay within epsilon of the original images
108 | # under the l_infinity / l2 constraint
109 | x = project(x, original_images, self.epsilon, self._type)
110 | # the adversaries must also be valid pixel values
111 | x.clamp_(self.min_val, self.max_val)
112 |
113 | return x
114 |
--------------------------------------------------------------------------------
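The projection step in `project` above can be illustrated in isolation. The following is a minimal sketch, not the repository's code: the name `project_sketch` and the `tiny` constant are assumptions introduced here, and the L2 branch uses a clamped scale factor instead of the mask used in the original function.

```python
import torch

def project_sketch(x, original_x, epsilon, norm="linf", tiny=1e-12):
    """Project x back into the epsilon-ball around original_x (illustrative sketch)."""
    if norm == "linf":
        # Clip every coordinate to [original_x - epsilon, original_x + epsilon].
        return torch.max(torch.min(x, original_x + epsilon), original_x - epsilon)
    if norm == "l2":
        delta = (x - original_x).view(x.shape[0], -1)
        delta_norm = delta.norm(dim=1, keepdim=True)
        # Factor is 1 inside the ball and about epsilon / ||delta|| outside it;
        # `tiny` (an assumption of this sketch) avoids 0/0 for unperturbed samples.
        factor = torch.clamp(epsilon / (delta_norm + tiny), max=1.0)
        return original_x + (delta * factor).view(x.shape)
    raise NotImplementedError(norm)

torch.manual_seed(0)
clean = torch.rand(4, 1, 28, 28)
noisy = clean + torch.randn_like(clean)

linf = project_sketch(noisy, clean, 0.3, "linf")
l2 = project_sketch(noisy, clean, 0.3, "l2")
same = project_sketch(clean.clone(), clean, 0.3, "l2")  # zero perturbation stays finite
```

Samples that are already inside the ball are left unchanged (up to floating-point rounding), which is the same behavior the mask in the original function is meant to provide.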
/mnist/src/model/__init__.py:
--------------------------------------------------------------------------------
1 | from .model import *
--------------------------------------------------------------------------------
/mnist/src/model/model.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 |
4 | class Expression(nn.Module):
5 | def __init__(self, func):
6 | super(Expression, self).__init__()
7 | self.func = func
8 |
9 | def forward(self, input):
10 | return self.func(input)
11 |
12 | class Model(nn.Module):
13 | def __init__(self, i_c=1, n_c=10):
14 | super(Model, self).__init__()
15 |
16 | self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
17 | self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
18 |
19 | self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
20 | self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
21 |
22 |
23 | self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
24 | self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
25 | self.fc2 = nn.Linear(1024, n_c)
26 |
27 |
28 | def forward(self, x_i, _eval=False):
29 |
30 | if _eval:
31 | # switch to eval mode
32 | self.eval()
33 | else:
34 | self.train()
35 |
36 | x_o = self.conv1(x_i)
37 | x_o = torch.relu(x_o)
38 | x_o = self.pool1(x_o)
39 |
40 | x_o = self.conv2(x_o)
41 | x_o = torch.relu(x_o)
42 | x_o = self.pool2(x_o)
43 |
44 | x_o = self.flatten(x_o)
45 |
46 | x_o = torch.relu(self.fc1(x_o))
47 |
48 | self.train()
49 |
50 | return self.fc2(x_o)
51 |
52 |
53 | if __name__ == '__main__':
54 | i = torch.FloatTensor(4, 1, 28, 28)
55 |
56 | n = Model()
57 |
58 | print(n(i).size())
59 |
60 |
--------------------------------------------------------------------------------
/mnist/src/read_log.py:
--------------------------------------------------------------------------------
1 | import re
2 | import os
3 | from utils import makedirs
4 | import matplotlib.pyplot as plt
5 |
6 | file_name = '../log/mnist_l2_adv/train_log.txt'
7 | affix = 'l2'
8 | title = r'$l_2$ Training'
9 |
10 | img_folder = '../img'
11 | makedirs(img_folder)
12 |
13 | train_iter_list = []
14 | train_acc_list = []
15 | train_rob_list = []
16 |
17 | test_iter_list = []
18 | test_acc_list = []
19 | test_rob_list = []
20 |
21 | with open(file_name, 'r') as f:
22 | lines = f.readlines()
23 |
24 | for line in lines:
25 | splits = re.split('[, =%:\n]+', line)
26 |
27 | if splits[0] == 'epoch':
28 | _iter = int(splits[3])
29 | train_iter_list.append(_iter)
30 |
31 | if splits[0] == 'standard':
32 | train_acc_list.append(float(splits[2]))
33 | train_rob_list.append(float(splits[5]))
34 |
35 | if splits[0] == 'test':
36 | test_iter_list.append(_iter)
37 | test_acc_list.append(float(splits[2]))
38 | test_rob_list.append(float(splits[6]))
39 |
40 |
41 | a_1 = plt.plot(train_iter_list, train_acc_list , color='r', label='train standard accuracy')[0]
42 | a_2 = plt.plot(test_iter_list, test_acc_list , color='r', linestyle='--', label='test standard accuracy')[0]
43 |
44 | b_1 = plt.plot(train_iter_list, train_rob_list , color='b', label='train robust accuracy')[0]
45 | b_2 = plt.plot(test_iter_list, test_rob_list , color='b', linestyle='--', label='test robust accuracy')[0]
46 |
47 | plt.title(title)
48 |
49 | plt.legend(handles=[a_1, a_2, b_1, b_2])
50 |
51 | plt.savefig(os.path.join(img_folder, 'mnist_learning_curve_%s.jpg' % affix))
--------------------------------------------------------------------------------
/mnist/src/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .utils import *
--------------------------------------------------------------------------------
/mnist/src/utils/utils.py:
--------------------------------------------------------------------------------
1 | import os
2 | import json
3 | import logging
4 |
5 | import numpy as np
6 |
7 | import torch
8 |
9 |
10 | def list2cuda(_list):
11 | array = np.array(_list)
12 | return numpy2cuda(array)
13 |
14 | def numpy2cuda(array):
15 | tensor = torch.from_numpy(array)
16 |
17 | return tensor2cuda(tensor)
18 |
19 | def tensor2cuda(tensor):
20 | if torch.cuda.is_available():
21 | tensor = tensor.cuda()
22 |
23 | return tensor
24 |
25 | def one_hot(ids, n_class):
26 | # ---------------------
27 | # author:ke1th
28 | # source:CSDN
29 | # article: https://blog.csdn.net/u012436149/article/details/77017832
30 | """
31 | ids: (list, ndarray) shape:[batch_size]
32 | out_tensor:FloatTensor shape:[batch_size, depth]
33 | """
34 |
35 | assert len(ids.shape) == 1, 'the ids should be 1-D'
36 | # ids = torch.LongTensor(ids).view(-1,1)
37 |
38 | out_tensor = torch.zeros(len(ids), n_class)
39 |
40 | out_tensor.scatter_(1, ids.cpu().unsqueeze(1), 1.)
41 |
42 | return out_tensor
43 |
44 | def evaluate(_input, _target, method='mean'):
45 | correct = (_input == _target).astype(np.float32)
46 | if method == 'mean':
47 | return correct.mean()
48 | else:
49 | return correct.sum()
50 |
51 |
52 | def create_logger(save_path='', file_type='', level='debug'):
53 |
54 | if level == 'debug':
55 | _level = logging.DEBUG
56 | elif level == 'info':
57 | _level = logging.INFO
58 |
59 | logger = logging.getLogger()
60 | logger.setLevel(_level)
61 |
62 | cs = logging.StreamHandler()
63 | cs.setLevel(_level)
64 | logger.addHandler(cs)
65 |
66 | if save_path != '':
67 | file_name = os.path.join(save_path, file_type + '_log.txt')
68 | fh = logging.FileHandler(file_name, mode='w')
69 | fh.setLevel(_level)
70 |
71 | logger.addHandler(fh)
72 |
73 | return logger
74 |
75 | def makedirs(path):
76 | if not os.path.exists(path):
77 | os.makedirs(path)
78 |
79 | def load_model(model, file_name):
80 | model.load_state_dict(
81 | torch.load(file_name, map_location=lambda storage, loc: storage))
82 |
83 | def save_model(model, file_name):
84 | torch.save(model.state_dict(), file_name)
--------------------------------------------------------------------------------
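The `test` method in `main.py` calls `evaluate(..., 'sum')` per batch and divides by the total sample count at the end. A small sketch of why that differs from averaging per-batch means when the last batch is smaller; the toy arrays below are made up for illustration, and `evaluate` is copied in simplified form so the block runs on its own.

```python
import numpy as np

def evaluate(pred, target, method="mean"):
    # Same idea as utils.evaluate: fraction (or count) of matching labels.
    correct = (pred == target).astype(np.float32)
    return correct.mean() if method == "mean" else correct.sum()

# Two hypothetical batches of unequal size.
batches = [
    (np.array([1, 2, 3, 4]), np.array([1, 2, 3, 0])),  # 3 of 4 correct
    (np.array([5, 6]), np.array([5, 6])),              # 2 of 2 correct
]

total_correct = sum(evaluate(p, t, "sum") for p, t in batches)
total_seen = sum(len(t) for _, t in batches)
dataset_acc = total_correct / total_seen  # 5 correct out of 6 samples

# Averaging per-batch means over-weights the small batch: (0.75 + 1.0) / 2.
mean_of_means = np.mean([evaluate(p, t, "mean") for p, t in batches])
```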
/mnist/src/visualization/__init__.py:
--------------------------------------------------------------------------------
1 | from .vanilla_backprop import VanillaBackprop
--------------------------------------------------------------------------------
/mnist/src/visualization/vanilla_backprop.py:
--------------------------------------------------------------------------------
1 | """
2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks
3 |
4 | original author: Utku Ozbulak - github.com/utkuozbulak
5 | """
6 |
7 | import sys
8 | sys.path.append("..")
9 |
10 | import torch
11 |
12 | from utils import tensor2cuda, one_hot
13 |
14 | class VanillaBackprop():
15 | """
16 | Produces gradients generated with vanilla back propagation from the image
17 | """
18 | def __init__(self, model):
19 | self.model = model
20 |
21 | def generate_gradients(self, input_image, target_class):
22 | # Put model in evaluation mode
23 | self.model.eval()
24 |
25 | x = input_image.clone()
26 |
27 | x.requires_grad = True
28 |
29 | # Forward
30 | model_output = self.model(x, _eval=True)
31 | # Zero grads
32 | self.model.zero_grad()
33 |
34 | grad_outputs = one_hot(target_class, model_output.shape[1])
35 | grad_outputs = tensor2cuda(grad_outputs)
36 |
37 | grad = torch.autograd.grad(model_output, x, grad_outputs=grad_outputs,
38 | only_inputs=True)[0]
39 |
40 | self.model.train()
41 |
42 | return grad
43 |
--------------------------------------------------------------------------------
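The one-hot `grad_outputs` trick used by `VanillaBackprop.generate_gradients` can be checked on a toy model. This is a sketch under stated assumptions: `toy_model` is a hypothetical single linear layer standing in for the MNIST classifier, chosen because the gradient of logit `k` with respect to the input is then exactly row `k` of the weight matrix.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the classifier: flatten, then one linear layer.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))

images = torch.rand(3, 1, 28, 28, requires_grad=True)
targets = torch.tensor([1, 4, 7])

logits = toy_model(images)

# A one-hot grad_outputs picks d(logit[target]) / d(image) for each sample.
selector = torch.zeros_like(logits).scatter_(1, targets.unsqueeze(1), 1.0)
saliency = torch.autograd.grad(logits, images, grad_outputs=selector)[0]
```

For a deeper network the result is no longer a weight row, but the call pattern is the same: one backward pass yields the input gradient of each sample's target logit.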
/mnist/visualize.py:
--------------------------------------------------------------------------------
1 |
2 | import os
3 | import torch
4 | import torchvision as tv
5 | import numpy as np
6 |
7 | from torch.utils.data import DataLoader
8 |
9 | from utils import makedirs, tensor2cuda, load_model
10 | from argument import parser
11 | from visualization import VanillaBackprop
12 | from model import Model
13 |
14 | import matplotlib.pyplot as plt
15 |
16 | img_folder = '../img'
17 | makedirs(img_folder)
18 |
19 | args = parser()
20 |
21 |
22 | te_dataset = tv.datasets.MNIST(args.data_root,
23 | train=False,
24 | transform=tv.transforms.ToTensor(),
25 | download=True)
26 |
27 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
28 |
29 |
30 | for data, label in te_loader:
31 |
32 | data, label = tensor2cuda(data), tensor2cuda(label)
33 |
34 |
35 | break
36 |
37 | types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
38 |
39 |
40 | model_checkpoints = ['../checkpoint/mnist_std_train/checkpoint_56000.pth',
41 | '../checkpoint/mnist_adv_train/checkpoint_56000.pth',
42 | '../checkpoint/mnist_l2_adv/checkpoint_56000.pth']
43 |
44 |
45 | out_list = []
46 |
47 | for checkpoint in model_checkpoints:
48 |
49 | model = Model(i_c=1, n_c=10)
50 |
51 | load_model(model, checkpoint)
52 |
53 | if torch.cuda.is_available():
54 | model.cuda()
55 |
56 | VBP = VanillaBackprop(model)
57 |
58 | grad = VBP.generate_gradients(data, label)
59 |
60 | grad_flat = grad.view(grad.shape[0], -1)
61 | mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3)
62 | std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3)
63 |
64 | mean = mean.repeat(1, 1, data.shape[2], data.shape[3])
65 | std = std.repeat(1, 1, data.shape[2], data.shape[3])
66 |
67 | grad = torch.max(torch.min(grad, mean+3*std), mean-3*std)
68 |
69 | print(grad.min(), grad.max())
70 |
71 | grad -= grad.min()
72 |
73 | grad /= grad.max()
74 |
75 | grad = grad.cpu().numpy().squeeze() # (N, 28, 28)
76 |
77 | grad *= 255.0
78 |
79 | out_list.append(grad)
80 |
81 | data = data.cpu().numpy().squeeze() # (N, 28, 28)
82 | data *= 255.0
83 | label = label.cpu().numpy()
84 |
85 | out_list.insert(0, data)
86 |
87 | # normalize the grad
88 | # length = torch.norm(grad, dim=3)
89 | # length = torch.norm(length, dim=2)
90 | # length = length.unsqueeze(2).unsqueeze(2)
91 | # grad /= (length + 1e-5)
92 |
93 | out_num = 5
94 |
95 | fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num)
96 |
97 | axs = _axs
98 |
99 |
100 | for j, _type in enumerate(types):
101 | axs[j, 0].set_ylabel(_type)
102 |
103 | if j == 0:
104 | cmap = 'gray'
105 | else:
106 | cmap = 'seismic'
107 |
108 | for i in range(out_num):
109 | axs[j, i].set_xlabel('%d' % label[i])
110 | axs[j, i].imshow(out_list[j][i], cmap=cmap)
111 |
112 | axs[j, i].get_xaxis().set_ticks([])
113 | axs[j, i].get_yaxis().set_ticks([])
114 |
115 | plt.tight_layout()
116 | plt.savefig(os.path.join(img_folder, 'mnist_grad_%s.jpg' % args.affix))
--------------------------------------------------------------------------------
/mnist/visualize_attack.py:
--------------------------------------------------------------------------------
1 |
2 | import os
3 | import torch
4 | import torchvision as tv
5 | import numpy as np
6 |
7 | from torch.utils.data import DataLoader
8 |
9 | from utils import makedirs, tensor2cuda, load_model
10 | from argument import parser
11 | from visualization import VanillaBackprop
12 | from attack import FastGradientSignUntargeted
13 | from model import Model
14 |
15 | import matplotlib.pyplot as plt
16 |
17 | img_folder = '../img'
18 | makedirs(img_folder)
19 |
20 | args = parser()
21 |
22 |
23 | te_dataset = tv.datasets.MNIST(args.data_root,
24 | train=False,
25 | transform=tv.transforms.ToTensor(),
26 | download=True)
27 |
28 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
29 |
30 |
31 | for data, label in te_loader:
32 |
33 | data, label = tensor2cuda(data), tensor2cuda(label)
34 |
35 |
36 | break
37 |
38 | types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
39 |
40 |
41 | model_checkpoints = ['../checkpoint/mnist_std_train/checkpoint_56000.pth',
42 | '../checkpoint/mnist_adv_train/checkpoint_56000.pth',
43 | '../checkpoint/mnist_l2_adv/checkpoint_56000.pth']
44 |
45 | adv_list = []
46 | pred_list = []
47 |
48 | max_epsilon = 0.8
49 |
50 | perturbation_type = 'linf'
51 |
52 | with torch.no_grad():
53 | for checkpoint in model_checkpoints:
54 |
55 | model = Model(i_c=1, n_c=10)
56 |
57 | load_model(model, checkpoint)
58 |
59 | if torch.cuda.is_available():
60 | model.cuda()
61 |
62 | attack = FastGradientSignUntargeted(model,
63 | max_epsilon,
64 | args.alpha,
65 | min_val=0,
66 | max_val=1,
67 | max_iters=args.k,
68 | _type=perturbation_type)
69 |
70 |
71 | adv_data = attack.perturb(data, label, 'mean', False)
72 |
73 | output = model(adv_data, _eval=True)
74 | pred = torch.max(output, dim=1)[1]
75 | adv_list.append(adv_data.cpu().numpy().squeeze()) # (N, 28, 28)
76 | pred_list.append(pred.cpu().numpy())
77 |
78 | data = data.cpu().numpy().squeeze() # (N, 28, 28)
79 | data *= 255.0
80 | label = label.cpu().numpy()
81 |
82 | adv_list.insert(0, data)
83 |
84 | pred_list.insert(0, label)
85 |
86 | out_num = 5
87 |
88 | fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num)
89 |
90 | axs = _axs
91 |
92 | cmap = 'gray'
93 | for j, _type in enumerate(types):
94 | axs[j, 0].set_ylabel(_type)
95 |
96 | for i in range(out_num):
97 | axs[j, i].set_xlabel('%d' % pred_list[j][i])
98 | axs[j, i].imshow(adv_list[j][i], cmap=cmap)
99 |
100 | axs[j, i].get_xaxis().set_ticks([])
101 | axs[j, i].get_yaxis().set_ticks([])
102 |
103 | plt.tight_layout()
104 | plt.savefig(os.path.join(img_folder, 'mnist_large_%s_%s.jpg' % (perturbation_type, args.affix)))
--------------------------------------------------------------------------------