├── .gitignore
├── README.md
├── cifar-10
    ├── README.md
    ├── img
    │   ├── cifar_grad_default.jpg
    │   ├── cifar_large_l2_default.jpg
    │   ├── cifar_large_linf_default.jpg
    │   ├── cifar_learning_curve_l2.jpg
    │   ├── cifar_learning_curve_linf.jpg
    │   └── cifar_learning_curve_std.jpg
    ├── main.py
    ├── src
    │   ├── argument.py
    │   ├── attack
    │   │   ├── __init__.py
    │   │   └── fast_gradient_sign_untargeted.py
    │   ├── model
    │   │   ├── __init__.py
    │   │   ├── madry_model.py
    │   │   └── model.py
    │   ├── utils
    │   │   ├── __init__.py
    │   │   └── utils.py
    │   └── visualization
    │   │   ├── __init__.py
    │   │   └── vanilla_backprop.py
    ├── train.sh
    ├── visualize.py
    └── visualize_attack.py
└── mnist
    ├── README.md
    ├── img
        ├── mnist_grad_default.jpg
        ├── mnist_large_l2_.jpg
        ├── mnist_learning_curve_l2.jpg
        ├── mnist_learning_curve_linf.jpg
        └── mnist_learning_curve_std.jpg
    ├── main.py
    ├── src
        ├── argument.py
        ├── attack
        │   ├── __init__.py
        │   └── fast_gradient_sign_untargeted.py
        ├── model
        │   ├── __init__.py
        │   └── model.py
        ├── read_log.py
        ├── utils
        │   ├── __init__.py
        │   └── utils.py
        └── visualization
        │   ├── __init__.py
        │   └── vanilla_backprop.py
    ├── visualize.py
    └── visualize_attack.py

/.gitignore:
--------------------------------------------------------------------------------
__pycache__/
log/

# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so
sftp-config.json
read_log.py

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Adversarial Training and Visualization

This repo is a PyTorch 1.0 implementation of adversarial training on MNIST and CIFAR-10. It also reproduces part of the visualization results in [1].


**Note**: Not an official implementation.

## Adversarial Training

[Table: Objective Function, with columns Standard Training and Adversarial Training]

where p in the table is usually 2 or inf.
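
For orientation, the two objectives can be sketched in the usual notation. This is the standard textbook formulation, not a transcription of the repo's table; the symbols $\theta$ (model parameters), $f_\theta$ (classifier), $\mathcal{D}$ (data distribution), $\mathcal{L}$ (loss), $\delta$ (perturbation) and $\epsilon$ (perturbation budget) are assumptions introduced here.

```latex
% Standard training: minimize the expected loss on clean data.
\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \big[\, \mathcal{L}(f_\theta(x),\, y) \,\big]

% Adversarial training: minimize the expected worst-case loss
% over perturbations inside an l_p ball of radius epsilon.
\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[\, \max_{\|\delta\|_p \le \epsilon} \mathcal{L}(f_\theta(x+\delta),\, y) \,\Big]
```

With $p = \infty$ every pixel may move by at most $\epsilon$; with $p = 2$ the Euclidean norm of the whole perturbation is bounded by $\epsilon$.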

The objectives of standard and adversarial training are fundamentally different. In standard training, the classifier minimizes the loss computed on the original training data, while in adversarial training it minimizes the loss on the worst-case perturbation around each original data point.

## Visualization

In [1], the authors find that the features learned by a robust classifier are more human-perceivable. Related results are shown in the mnist and cifar-10 folders.

## Implementation

Parts of the code in this repo are borrowed or modified from [2], [3], [4] and [5].

## References:

[1] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry. *Robustness May Be at Odds with Accuracy*, https://arxiv.org/abs/1805.12152

[2] https://github.com/MadryLab/mnist_challenge

[3] https://github.com/MadryLab/cifar10_challenge

[4] https://github.com/xternalz/WideResNet-pytorch

[5] https://github.com/utkuozbulak/pytorch-cnn-visualizations


## Contact
Yi-Lin Sung, corumlouis123@gmail.com

--------------------------------------------------------------------------------
/cifar-10/README.md:
--------------------------------------------------------------------------------
# Adversarial Training and Visualization on CIFAR-10


## Update
* (2020/8/27)
    1. To match the implementation of [madry_cifar10](https://github.com/MadryLab/cifar10_challenge), the default learning rate is now `0.1`, the model's activation function is `LeakyReLU(0.1)`, and the optimizer is `torch.optim.SGD`.
    2. Add a new experiment in *Quantitative Results* that matches the results in [madry_cifar10](https://github.com/MadryLab/cifar10_challenge).
    3. Add checkpoints for the updated model and delete the old ones.
    4. Update the code structure: move `main.py`, `visualize.py` and `visualize_attack.py` out of the `src` folder.
* (2019/12/14)
    1. Add `madry_model.py`, which contains the same model used in [madry_cifar10](https://github.com/MadryLab/cifar10_challenge), in `src/model`.
    2. Add `count_parameters(model)` in `model.py` and `madry_model.py`; it computes the number of trainable parameters.
    3. Add the flag `use_pseudo_label` to `trainer.test()`. It determines whether the model's prediction is used as the target, and its default value is `False`.
    4. Update the "Experiments" and "Execution" parts of this README.
    5. Provide checkpoints of adversarial training on CIFAR-10.
* (2019/4/18) Change the default alpha from 2 to 2/255, and update the results.

## Results

Note that each experiment was run only once.

### Learning Curves

Epsilon in linf (l2) training is 0.0157 (0.314). [0.0157=4/255, 0.314=80/255]

| Training | Standard Accuracy (train/test) | Robustness Accuracy (train/test) | Madry's Model Standard Accuracy (train/test) | Madry's Model Robustness Accuracy (train/test) |
| :---: | :---: | :---: | :---: | :---: |
| Standard Training | 92.19/87.14 | 0.00/7.85 | - | - |
| l_inf Training | 79.69/78.09 | 61.72/63.8 | -/79.22 | -/55.97 |
| l_2 Training | 89.84/85.39 | 76.56/77.76 | -/85.81 | -/71.87 |

(This note applies only to the results that are not from Madry's Model.) In testing mode, the target label used to create the adversarial example is the model's most confident prediction, not the ground truth. As a result, the test robustness is sometimes higher than the training robustness, in cases where the model's initial prediction is already wrong.
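
The epsilon values quoted for these learning curves are budgets on the 0-255 pixel scale divided by 255, since images are fed to the model in `[0, 1]`. A small sketch of that conversion follows; the helper name `pixel_budget` is hypothetical and is not part of the repo.

```python
def pixel_budget(pixels, scale=255):
    """Convert a budget given on the 0-255 pixel scale to the [0, 1] scale."""
    return pixels / scale


# Values quoted in this README and in the argument defaults.
linf_eps = pixel_budget(4)    # linf training epsilon
l2_eps = pixel_budget(80)     # l2 training epsilon
alpha = pixel_budget(2)       # default step size per attack iteration

print(f"linf epsilon: {linf_eps:.4f}")
print(f"l2 epsilon:   {l2_eps:.3f}")
print(f"alpha:        {alpha:.5f}")
```

The rounded results match the `-e 0.0157`, `-e 0.314` and default `--alpha 0.00784` settings used in the commands of this README.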

The learning rate is changed on a fixed schedule during training:

* `0.1` in iteration `[0, 40000]`
* `0.01` in iteration `[40000, 60000]`
* `0.001` in iteration `[60000, 76000]`

This policy follows https://github.com/MadryLab/cifar10_challenge.


### Quantitative Results

* Defense model, standard accuracy = 86.66% (linf, epsilon=8/255) (trained on PGD attack with 7 steps of size 2)

| Attack | Robust Test Accuracy |
| :---: | :---: |
| PGD with 10 steps of size 2 (cross-entropy) | 48.37% |
| PGD with 20 steps of size 1 (cross-entropy) | 48.04% |

### Visualization of Gradient with Respect to Input

![visualization](https://github.com/louis2889184/adversarial_training/blob/master/cifar-10/img/cifar_grad_default.jpg)

### The Adversarial Example with large epsilon

The maximum epsilon is set to 4.7 (l2 norm) in this part.

![large](https://github.com/louis2889184/adversarial_training/blob/master/cifar-10/img/cifar_large_l2_default.jpg)


## Requirements:
```
python >= 3.5
torch >= 1.0
torchvision >= 0.2.1
numpy >= 1.16.1
matplotlib >= 3.0.2
```

## Execution

### Training

Standard training:

```
python main.py --data_root [data directory]
```

linf training:

```
python main.py --data_root [data directory] -e 0.0157 -p 'linf' --adv_train --affix 'linf'
```

l2 training:

```
python main.py --data_root [data directory] -e 0.314 -p 'l2' --adv_train --affix 'l2'
```

### Testing

Change the settings if you want to do linf testing.
```
python main.py --todo test --data_root [data directory] -e 0.314 -p 'l2' --load_checkpoint [your_model.pth]
```

### Visualization

Change the settings in `visualize.py` and `visualize_attack.py` if you want to do linf visualization.

Visualize the gradient with respect to the input:

```
python visualize.py --load_checkpoint [your_model.pth]
```

Visualize adversarial examples with larger epsilon:

```
python visualize_attack.py --load_checkpoint [your_model.pth]
```

## Checkpoints
### linf
* epsilon=8/255, trained on PGD attack with 7 steps of size 2: [checkpoint](https://drive.google.com/file/d/1-3AfpkLvPje5poY9ZettY05N8kZgFRAV/view?usp=sharing)
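
As a rough sketch of how the checkpoint's setting might map onto the flags defined in `src/argument.py`, the snippet below assembles a training command. It assumes that "size 2" means a step of 2/255 on the `[0, 1]` pixel scale (in line with the default alpha) and that the 7 PGD steps correspond to `-k 7`; the helper name `build_train_command` and the `./data` path are made up for illustration and are not part of the repo.

```python
import shlex


def build_train_command(data_root, eps_pixels=8, step_pixels=2, steps=7, norm="linf"):
    """Assemble a main.py command line; pixel budgets are rescaled to [0, 1]."""
    epsilon = eps_pixels / 255   # maximum perturbation
    alpha = step_pixels / 255    # step size per attack iteration (assumed meaning of "size 2")
    args = [
        "python", "main.py",
        "--data_root", data_root,
        "-e", f"{epsilon:.4f}",
        "-a", f"{alpha:.5f}",
        "-k", str(steps),
        "-p", norm,
        "--adv_train",
        "--affix", norm,
    ]
    # Quote each argument so the string can be pasted into a shell.
    return " ".join(shlex.quote(a) for a in args)


command = build_train_command("./data")
print(command)
```

Only flags that appear in `src/argument.py` are used; whether this reproduces the published checkpoint exactly is not confirmed by the README.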

## Training Time

* Standard training: 78 s / 100 iterations
* Adversarial training: 784 s / 100 iterations

Both were measured with a batch size of 128 on an NVIDIA GeForce GTX 1080.

--------------------------------------------------------------------------------
/cifar-10/img/cifar_grad_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_grad_default.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_large_l2_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_large_l2_default.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_large_linf_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_large_linf_default.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_l2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_l2.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_linf.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_linf.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_std.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_std.jpg


--------------------------------------------------------------------------------
/cifar-10/main.py:
--------------------------------------------------------------------------------
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader

import torchvision as tv

from time import time
from src.model.madry_model import WideResNet
from src.attack import FastGradientSignUntargeted
from src.utils import makedirs, create_logger, tensor2cuda, numpy2cuda, evaluate, save_model

from src.argument import parser, print_args

class Trainer():
    def __init__(self, args, logger, attack):
        self.args = args
        self.logger = logger
        self.attack = attack

    def standard_train(self, model, tr_loader, va_loader=None):
        self.train(model, tr_loader, va_loader, False)

    def adversarial_train(self, model, tr_loader, va_loader=None):
        self.train(model, tr_loader, va_loader, True)

    def train(self, model, tr_loader, va_loader=None, adv_train=False):
        args = self.args
        logger = self.logger

        opt = torch.optim.SGD(model.parameters(), args.learning_rate,
                              weight_decay=args.weight_decay,
                              momentum=args.momentum)
        # Milestones are counted in training iterations, not epochs.
        scheduler = torch.optim.lr_scheduler.MultiStepLR(opt,
                                                         milestones=[40000, 60000],
                                                         gamma=0.1)
        _iter = 0

        begin_time = time()

        for epoch in range(1, args.max_epoch+1):
            for data, label in tr_loader:
                data, label = tensor2cuda(data), tensor2cuda(label)

                if adv_train:
                    # When training, the adversarial example is created from a random
                    # point close to the original data point. In evaluation mode,
                    # the attack starts from the original data point instead.
                    adv_data = self.attack.perturb(data, label, 'mean', True)
                    output = model(adv_data, _eval=False)
                else:
                    output = model(data, _eval=False)

                loss = F.cross_entropy(output, label)

                opt.zero_grad()
                loss.backward()
                opt.step()

                if _iter % args.n_eval_step == 0:
                    t1 = time()

                    if adv_train:
                        # `output` holds the adversarial predictions, so only the
                        # clean forward pass is needed here.
                        with torch.no_grad():
                            stand_output = model(data, _eval=True)
                        pred = torch.max(stand_output, dim=1)[1]

                        # print(pred)
                        std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100

                        pred = torch.max(output, dim=1)[1]
                        # print(pred)
                        adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100

                    else:
                        # `output` holds the clean predictions, so the adversarial
                        # examples are generated here for logging only.
                        adv_data = self.attack.perturb(data, label, 'mean', False)

                        with torch.no_grad():
                            adv_output = model(adv_data, _eval=True)
                        pred = torch.max(adv_output, dim=1)[1]
                        # print(label)
                        # print(pred)
                        adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100

                        pred = torch.max(output, dim=1)[1]
                        # print(pred)
                        std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100

                    t2 = time()

                    logger.info(f'epoch: {epoch}, iter: {_iter}, lr={opt.param_groups[0]["lr"]}, '
                                f'spent {time()-begin_time:.2f} s, tr_loss: {loss.item():.3f}')

                    logger.info(f'standard acc: {std_acc:.3f}%, robustness acc: {adv_acc:.3f}%')

                    # begin_time = time()

                    # if va_loader is not None:
                    #     va_acc, va_adv_acc = self.test(model, va_loader, True)
                    #     va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0

                    #     logger.info('\n' + '='*30 + ' evaluation ' + '='*30)
                    #     logger.info('test acc: %.3f %%, test adv acc: %.3f %%, spent: %.3f' % (
                    #         va_acc, va_adv_acc, time() - begin_time))
                    #     logger.info('='*28 + ' end of evaluation ' + '='*28 + '\n')

                    begin_time = time()

                # In standard training `adv_data` is only created in the evaluation
                # step above, so n_store_image_step should be a multiple of n_eval_step.
                if _iter % args.n_store_image_step == 0:
                    tv.utils.save_image(torch.cat([data.cpu(), adv_data.cpu()], dim=0),
                                        os.path.join(args.log_folder, f'images_{_iter}.jpg'),
                                        nrow=16)

                if _iter % args.n_checkpoint_step == 0:
                    file_name = os.path.join(args.model_folder, f'checkpoint_{_iter}.pth')
                    save_model(model, file_name)

                _iter += 1
                # the scheduler depends on the training iteration
                scheduler.step()

            if va_loader is not None:
                t1 = time()
                va_acc, va_adv_acc = self.test(model, va_loader, True, False)
                va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0

                t2 = time()
                logger.info('\n'+'='*20 +f' evaluation at epoch: {epoch} iteration: {_iter} ' \
                    +'='*20)
                logger.info(f'test acc: {va_acc:.3f}%, test adv acc: {va_adv_acc:.3f}%, spent: {t2-t1:.3f} s')
                logger.info('='*28+' end of evaluation '+'='*28+'\n')


    def test(self, model, loader, adv_test=False, use_pseudo_label=False):
        # If adv_test is False, adv_acc is returned as -1.

        total_acc = 0.0
        num = 0
        total_adv_acc = 0.0

        with torch.no_grad():
            for data, label in loader:
                data, label = tensor2cuda(data), tensor2cuda(label)

                output = model(data, _eval=True)

                pred = torch.max(output, dim=1)[1]
                te_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy(), 'sum')

                total_acc += te_acc
                num += output.shape[0]

                if adv_test:
                    # Use the predicted label as the attack target only when
                    # use_pseudo_label is set; otherwise use the ground truth.
                    with torch.enable_grad():
                        adv_data = self.attack.perturb(data,
                                                       pred if use_pseudo_label else label,
                                                       'mean',
                                                       False)

                    adv_output = model(adv_data, _eval=True)

                    adv_pred = torch.max(adv_output, dim=1)[1]
                    adv_acc = evaluate(adv_pred.cpu().numpy(), label.cpu().numpy(), 'sum')
                    total_adv_acc += adv_acc
                else:
                    total_adv_acc = -num

        return total_acc / num , total_adv_acc / num

def main(args):

    save_folder = '%s_%s' % (args.dataset, args.affix)

    log_folder = os.path.join(args.log_root, save_folder)
    model_folder = os.path.join(args.model_root, save_folder)

    makedirs(log_folder)
    makedirs(model_folder)

    setattr(args, 'log_folder', log_folder)
    setattr(args, 'model_folder', model_folder)

    logger = create_logger(log_folder, args.todo, 'info')

    print_args(args, logger)

    model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)

    attack = FastGradientSignUntargeted(model,
                                        args.epsilon,
                                        args.alpha,
                                        min_val=0,
                                        max_val=1,
                                        max_iters=args.k,
                                        _type=args.perturbation_type)

    if torch.cuda.is_available():
        model.cuda()

    trainer = Trainer(args, logger, attack)

    if args.todo == 'train':
        transform_train = tv.transforms.Compose([
                tv.transforms.RandomCrop(32, padding=4, fill=0, padding_mode='constant'),
                tv.transforms.RandomHorizontalFlip(),
                tv.transforms.ToTensor(),
            ])
        tr_dataset = tv.datasets.CIFAR10(args.data_root,
                                         train=True,
                                         transform=transform_train,
                                         download=True)

        tr_loader = DataLoader(tr_dataset, batch_size=args.batch_size, shuffle=True, num_workers=4)

        # evaluation during training
        te_dataset = tv.datasets.CIFAR10(args.data_root,
                                         train=False,
                                         transform=tv.transforms.ToTensor(),
                                         download=True)

        te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)

        trainer.train(model, tr_loader, te_loader, args.adv_train)
    elif args.todo == 'test':
        te_dataset = tv.datasets.CIFAR10(args.data_root,
                                         train=False,
                                         transform=tv.transforms.ToTensor(),
                                         download=True)

        te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)

        checkpoint = torch.load(args.load_checkpoint)
        model.load_state_dict(checkpoint)

        std_acc, adv_acc = trainer.test(model, te_loader, adv_test=True, use_pseudo_label=False)

        print(f"std acc: {std_acc * 100:.3f}%, adv_acc: {adv_acc * 100:.3f}%")

    else:
        raise NotImplementedError




if __name__ == '__main__':
    args = parser()

    os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu

    main(args)

--------------------------------------------------------------------------------
/cifar-10/src/argument.py:
--------------------------------------------------------------------------------
import argparse

def parser():
    parser = argparse.ArgumentParser(description='Adversarial Training')
    parser.add_argument('--todo', choices=['train', 'valid', 'test', 'visualize'], default='train',
        help='what behavior want to do: train | valid | test | visualize')
    parser.add_argument('--dataset', default='cifar-10', help='use what dataset')
    parser.add_argument('--data_root', default='/home/yilin/Data',
        help='the directory to save the dataset')
    parser.add_argument('--log_root', default='log',
        help='the directory to save the logs or other information (e.g. images)')
    parser.add_argument('--model_root', default='checkpoint', help='the directory to save the models')
    parser.add_argument('--load_checkpoint', default='./model/default/model.pth')
    parser.add_argument('--affix', default='default', help='the affix for the save folder')

    # parameters for generating adversarial examples
    parser.add_argument('--epsilon', '-e', type=float, default=0.0157,
        help='maximum perturbation of adversaries (4/255=0.0157)')
    parser.add_argument('--alpha', '-a', type=float, default=0.00784,
        help='movement multiplier per iteration when generating adversarial examples (2/255=0.00784)')
    parser.add_argument('--k', '-k', type=int, default=10,
        help='maximum iteration when generating adversarial examples')

    parser.add_argument('--batch_size', '-b', type=int, default=128, help='batch size')
    parser.add_argument('--max_epoch', '-m_e', type=int, default=200,
        help='the maximum number of times the model sees each sample')
    parser.add_argument('--learning_rate', '-lr', type=float, default=0.1, help='learning rate')
    parser.add_argument('--momentum', '-m', type=float, default=0.9, help='momentum for optimizer')
    parser.add_argument('--weight_decay', '-w', type=float, default=2e-4,
        help='the parameter of l2 restriction for weights')

    parser.add_argument('--gpu', '-g', default='0', help='which gpu to use')
    parser.add_argument('--n_eval_step', type=int, default=100,
        help='number of iteration per one evaluation')
    parser.add_argument('--n_checkpoint_step', type=int, default=4000,
        help='number of iteration to save a checkpoint')
    parser.add_argument('--n_store_image_step', type=int, default=4000,
        help='number of iteration to save adversaries')
    parser.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf',
        help='the type of the perturbation (linf or l2)')

    parser.add_argument('--adv_train', action='store_true')

    return parser.parse_args()

def print_args(args, logger=None):
    for k, v in vars(args).items():
        if logger is not None:
            logger.info('{:<16} : {}'.format(k, v))
        else:
            print('{:<16} : {}'.format(k, v))

--------------------------------------------------------------------------------
/cifar-10/src/attack/__init__.py:
--------------------------------------------------------------------------------
from .fast_gradient_sign_untargeted import FastGradientSignUntargeted

--------------------------------------------------------------------------------
/cifar-10/src/attack/fast_gradient_sign_untargeted.py:
--------------------------------------------------------------------------------
"""
this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks

original author: Utku Ozbulak - github.com/utkuozbulak
"""
import sys
sys.path.append("..")

import os
import numpy as np

import torch
from torch import nn
import torch.nn.functional as F

from src.utils import tensor2cuda

def project(x, original_x, epsilon, _type='linf'):

    if _type == 'linf':
        max_x = original_x + epsilon
        min_x = original_x - epsilon

        x = torch.max(torch.min(x, max_x), min_x)

    elif _type == 'l2':
        dist = (x - original_x)

        dist = dist.view(x.shape[0], -1)

        dist_norm = torch.norm(dist, dim=1, keepdim=True)

        mask = (dist_norm > epsilon).unsqueeze(2).unsqueeze(3)

        # dist = F.normalize(dist, p=2, dim=1)

        # Guard against a zero norm, which would otherwise produce NaNs
        # that survive the masking below.
        dist = dist / dist_norm.clamp(min=1e-12)

        dist *= epsilon

        dist = dist.view(x.shape)

        x = (original_x + dist) * mask.float() + x * (1 - mask.float())

    else:
        raise NotImplementedError

    return x

class FastGradientSignUntargeted():
    """
        Untargeted iterative fast gradient sign attack: lowers the score of the
        given class by repeatedly stepping along the sign of the loss gradient
    """
    def __init__(self, model, epsilon, alpha, min_val, max_val, max_iters, _type='linf'):
        self.model = model
        # self.model.eval()

        # Maximum perturbation
        self.epsilon = epsilon
        # Movement multiplier per iteration
        self.alpha = alpha
        # Minimum value of the pixels
        self.min_val = min_val
        # Maximum value of the pixels
        self.max_val = max_val
        # Maximum number of iterations used to generate adversaries
        self.max_iters = max_iters
        # The norm type of the perturbation ('linf' or 'l2')
        self._type = _type

    def perturb(self, original_images, labels, reduction4loss='mean', random_start=False):
        # original_images: values are within self.min_val and self.max_val

        # With random_start, the adversaries start from random points close to the original data
        if random_start:
            rand_perturb = torch.FloatTensor(original_images.shape).uniform_(
                -self.epsilon, self.epsilon)
            rand_perturb = tensor2cuda(rand_perturb)
            x = original_images + rand_perturb
            x.clamp_(self.min_val, self.max_val)
        else:
            x = original_images.clone()

        x.requires_grad = True

        # max_x = original_images + self.epsilon
        # min_x = original_images - self.epsilon

        self.model.eval()

        with torch.enable_grad():
            for _iter in range(self.max_iters):
                outputs = self.model(x, _eval=True)

                loss = F.cross_entropy(outputs, labels, reduction=reduction4loss)

                if reduction4loss == 'none':
                    grad_outputs = tensor2cuda(torch.ones(loss.shape))

                else:
                    grad_outputs = None

                grads = torch.autograd.grad(loss, x, grad_outputs=grad_outputs,
                        only_inputs=True)[0]

                x.data += self.alpha * torch.sign(grads.data)

                # the adversaries' pixel values should stay within max_x and min_x
                # due to the l_infinity / l2 restriction
                x = project(x, original_images, self.epsilon, self._type)
                # the adversaries' values should be valid pixel values
                x.clamp_(self.min_val, self.max_val)

        self.model.train()

        return x

--------------------------------------------------------------------------------
/cifar-10/src/model/__init__.py:
--------------------------------------------------------------------------------
from .model import *

--------------------------------------------------------------------------------
/cifar-10/src/model/madry_model.py:
--------------------------------------------------------------------------------
# code is adapted from https://github.com/xternalz/WideResNet-pytorch/blob/master/wideresnet.py
# original author: xternalz

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

from src.utils import count_parameters

class Expression(nn.Module):
    def __init__(self, func):
        super(Expression, self).__init__()
        self.func = func

    def forward(self, input):
        return self.func(input)

class Model(nn.Module):
    def __init__(self, i_c=1, n_c=10):
        super(Model, self).__init__()

        self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
        self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)

        self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
        self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)


        self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
        self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
        self.fc2 = nn.Linear(1024, n_c)


    def forward(self, x_i, _eval=False):

        if _eval:
            # switch to eval mode
            self.eval()
        else:
            self.train()

        x_o = self.conv1(x_i)
        x_o = torch.relu(x_o)
        x_o = self.pool1(x_o)

        x_o = self.conv2(x_o)
        x_o = torch.relu(x_o)
        x_o = self.pool2(x_o)

        x_o = self.flatten(x_o)

        x_o = torch.relu(self.fc1(x_o))

        # the module is always left in training mode after a forward pass
        self.train()

        return self.fc2(x_o)

class ChannelPadding(nn.Module):
    def __init__(self, in_planes, out_planes):
        super(ChannelPadding, self).__init__()

        self.register_buffer("padding",
                             torch.zeros((out_planes - in_planes) // 2).view(1, -1, 1, 1))

    def forward(self, input):
        assert len(input.size()) == 4, "only support for 4-D tensor for now"

        padding = self.padding.expand(input.size(0), -1, input.size(2), input.size(3))

        return torch.cat([padding, input, padding], dim=1)

class BasicBlock(nn.Module):
    def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
        super(BasicBlock, self).__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.relu1 = nn.LeakyReLU(0.1, inplace=True)
        self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_planes)
        self.relu2 = nn.LeakyReLU(0.1, inplace=True)
        self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
                               padding=1, bias=False)
        self.droprate = dropRate
        self.equalInOut = (in_planes == out_planes)
        # self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
        #                        padding=0, bias=False) or None
        self.poolpadShortcut = nn.Sequential(
            nn.AvgPool2d(kernel_size=stride, stride=stride),
            ChannelPadding(in_planes, out_planes)
        )
    def forward(self, x):
        if not self.equalInOut:
            x = self.relu1(self.bn1(x))
        else:
            out = self.relu1(self.bn1(x))
        out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
        if self.droprate > 0:
            out = F.dropout(out, p=self.droprate, training=self.training)
        out = self.conv2(out)
        # return torch.add(x if self.equalInOut else self.convShortcut(x), out)
        return torch.add(
            x if self.equalInOut else self.poolpadShortcut(x),
            out
        )

class NetworkBlock(nn.Module):
    def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
        super(NetworkBlock, self).__init__()
        self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
    def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
        layers = []
        for i in range(int(nb_layers)):
            layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
        return nn.Sequential(*layers)
    def forward(self, x):
        return self.layer(x)

class WideResNet(nn.Module):
    def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
        super(WideResNet, self).__init__()
        nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
        assert((depth - 4) % 6 == 0)
        n = (depth - 4) // 6
        block = BasicBlock
        # 1st conv before any network block
        self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
                               padding=1, bias=False)
        # 1st block
        self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
        # 2nd block
        self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
        # 3rd block
        self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
        # global average pooling and classifier
        self.bn1 = nn.BatchNorm2d(nChannels[3])
        self.relu = nn.LeakyReLU(0.1, inplace=True)
        self.fc = nn.Linear(nChannels[3], num_classes)
        self.nChannels = nChannels[3]

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

    def forward(self, x, _eval=False):
        if _eval:
            # switch to eval mode
            self.eval()
        else:
            self.train()

        out = self.conv1(x)
        out = self.block1(out)
        out = self.block2(out)
        out = self.block3(out)
        out = self.relu(self.bn1(out))
        out = F.avg_pool2d(out, 8)
        out = out.view(-1, self.nChannels)

        # the module is always left in training mode after a forward pass
        self.train()

        return self.fc(out)


if __name__ == '__main__':
    i = torch.FloatTensor(4, 3, 32, 32)

    n = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)

    i = i.cuda()
    n = n.cuda()

    print(n(i).size())

    print(count_parameters(n))


--------------------------------------------------------------------------------
/cifar-10/src/model/model.py:
--------------------------------------------------------------------------------
# code is adapted from https://github.com/xternalz/WideResNet-pytorch/blob/master/wideresnet.py
# original author: xternalz

import math
import torch
import torch.nn as nn
import torch.nn.functional as F

from src.utils import count_parameters

class Expression(nn.Module):
    def __init__(self, func):
        super(Expression, self).__init__()
        self.func = func

    def forward(self, input):
        return self.func(input)

class Model(nn.Module):
    def __init__(self, i_c=1, n_c=10):
        super(Model, self).__init__()

        self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
        self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)

        self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
        self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)


        self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
        self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
        self.fc2 = nn.Linear(1024, n_c)


    def forward(self, x_i, _eval=False):

        if _eval:
            # switch to eval mode
            self.eval()
        else:
            self.train()

        x_o = self.conv1(x_i)
        x_o = torch.relu(x_o)
        x_o = self.pool1(x_o)

        x_o = self.conv2(x_o)
        x_o = torch.relu(x_o)
        x_o = self.pool2(x_o)

        x_o = self.flatten(x_o)

        x_o = torch.relu(self.fc1(x_o))

        # the module is always left in training mode after a forward pass
        self.train()

        return self.fc2(x_o)





class BasicBlock(nn.Module):
    def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
        super(BasicBlock, self).__init__()
        self.bn1 = nn.BatchNorm2d(in_planes)
        self.relu1 = nn.LeakyReLU(0.1, inplace=True)
        self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
                               padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_planes)
        self.relu2 = nn.LeakyReLU(0.1, inplace=True)
        self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
                               padding=1, bias=False)
        self.droprate = dropRate
        self.equalInOut = (in_planes == out_planes)
        self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
                               padding=0, bias=False) or None
    def forward(self, x):
        if not self.equalInOut:
            x = self.relu1(self.bn1(x))
        else:
            out = self.relu1(self.bn1(x))
        out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
        if self.droprate > 0:
            out = F.dropout(out, p=self.droprate, training=self.training)
        out = self.conv2(out)
        return torch.add(x if self.equalInOut else self.convShortcut(x), out)

class NetworkBlock(nn.Module):
    def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
        super(NetworkBlock, self).__init__()
        self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
    def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
        layers = []
        for i in range(int(nb_layers)):
            layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
        return nn.Sequential(*layers)
    def forward(self, x):
        return self.layer(x)

class WideResNet(nn.Module):
    def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
        super(WideResNet, self).__init__()
        nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
        assert((depth - 4) % 6 == 0)
        n = (depth - 4) // 6
        block = BasicBlock
        # 1st conv before any network block
        self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
                               padding=1, bias=False)
        # 1st block
        self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
        # 2nd block
        self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
        # 3rd block
        self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
        # global average pooling and classifier
        self.bn1 = nn.BatchNorm2d(nChannels[3])
        self.relu = nn.LeakyReLU(0.1, inplace=True)
        self.fc = nn.Linear(nChannels[3], num_classes)
        self.nChannels = nChannels[3]

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. 
/ n)) 127 | elif isinstance(m, nn.BatchNorm2d): 128 | m.weight.data.fill_(1) 129 | m.bias.data.zero_() 130 | elif isinstance(m, nn.Linear): 131 | m.bias.data.zero_() 132 | 133 | def forward(self, x, _eval=False): 134 | if _eval: 135 | # switch to eval mode 136 | self.eval() 137 | else: 138 | self.train() 139 | 140 | out = self.conv1(x) 141 | out = self.block1(out) 142 | out = self.block2(out) 143 | out = self.block3(out) 144 | out = self.relu(self.bn1(out)) 145 | out = F.avg_pool2d(out, 8) 146 | out = out.view(-1, self.nChannels) 147 | 148 | self.train() 149 | 150 | return self.fc(out) 151 | 152 | 153 | if __name__ == '__main__': 154 | i = torch.FloatTensor(4, 3, 32, 32) 155 | 156 | n = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0) 157 | 158 | # print(n(i).size()) 159 | 160 | print(count_parameters(n)) 161 | 162 | -------------------------------------------------------------------------------- /cifar-10/src/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .utils import * -------------------------------------------------------------------------------- /cifar-10/src/utils/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import logging 4 | 5 | import numpy as np 6 | 7 | import torch 8 | 9 | class LabelDict(): 10 | def __init__(self, dataset='cifar-10'): 11 | self.dataset = dataset 12 | if dataset == 'cifar-10': 13 | self.label_dict = {0: 'airplane', 1: 'automobile', 2: 'bird', 3: 'cat', 14 | 4: 'deer', 5: 'dog', 6: 'frog', 7: 'horse', 15 | 8: 'ship', 9: 'truck'} 16 | 17 | self.class_dict = {v: k for k, v in self.label_dict.items()} 18 | 19 | def label2class(self, label): 20 | assert label in self.label_dict, 'the label %d is not in %s' % (label, self.dataset) 21 | return self.label_dict[label] 22 | 23 | def class2label(self, _class): 24 | assert isinstance(_class, str) 25 | assert _class in self.class_dict, 'the 
class %s is not in %s' % (_class, self.dataset) 26 | return self.class_dict[_class] 27 | 28 | def list2cuda(_list): 29 | array = np.array(_list) 30 | return numpy2cuda(array) 31 | 32 | def numpy2cuda(array): 33 | tensor = torch.from_numpy(array) 34 | 35 | return tensor2cuda(tensor) 36 | 37 | def tensor2cuda(tensor): 38 | if torch.cuda.is_available(): 39 | tensor = tensor.cuda() 40 | 41 | return tensor 42 | 43 | def one_hot(ids, n_class): 44 | # --------------------- 45 | # author:ke1th 46 | # source:CSDN 47 | # artical:https://blog.csdn.net/u012436149/article/details/77017832 48 | b""" 49 | ids: (list, ndarray) shape:[batch_size] 50 | out_tensor:FloatTensor shape:[batch_size, depth] 51 | """ 52 | 53 | assert len(ids.shape) == 1, 'the ids should be 1-D' 54 | # ids = torch.LongTensor(ids).view(-1,1) 55 | 56 | out_tensor = torch.zeros(len(ids), n_class) 57 | 58 | out_tensor.scatter_(1, ids.cpu().unsqueeze(1), 1.) 59 | 60 | return out_tensor 61 | 62 | def evaluate(_input, _target, method='mean'): 63 | correct = (_input == _target).astype(np.float32) 64 | if method == 'mean': 65 | return correct.mean() 66 | else: 67 | return correct.sum() 68 | 69 | 70 | def create_logger(save_path='', file_type='', level='debug'): 71 | 72 | if level == 'debug': 73 | _level = logging.DEBUG 74 | elif level == 'info': 75 | _level = logging.INFO 76 | 77 | logger = logging.getLogger() 78 | logger.setLevel(_level) 79 | 80 | cs = logging.StreamHandler() 81 | cs.setLevel(_level) 82 | logger.addHandler(cs) 83 | 84 | if save_path != '': 85 | file_name = os.path.join(save_path, file_type + '_log.txt') 86 | fh = logging.FileHandler(file_name, mode='w') 87 | fh.setLevel(_level) 88 | 89 | logger.addHandler(fh) 90 | 91 | return logger 92 | 93 | def makedirs(path): 94 | if not os.path.exists(path): 95 | os.makedirs(path) 96 | 97 | def load_model(model, file_name): 98 | model.load_state_dict( 99 | torch.load(file_name, map_location=lambda storage, loc: storage)) 100 | 101 | def save_model(model, 
file_name): 102 | torch.save(model.state_dict(), file_name) 103 | 104 | def count_parameters(model): 105 | # copy from https://discuss.pytorch.org/t/how-do-i-check-the-number-of-parameters-of-a-model/4325/8 106 | # baldassarre.fe's reply 107 | return sum(p.numel() for p in model.parameters() if p.requires_grad) -------------------------------------------------------------------------------- /cifar-10/src/visualization/__init__.py: -------------------------------------------------------------------------------- 1 | from .vanilla_backprop import VanillaBackprop -------------------------------------------------------------------------------- /cifar-10/src/visualization/vanilla_backprop.py: -------------------------------------------------------------------------------- 1 | """ 2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-visualizations 3 | 4 | original author: Utku Ozbulak - github.com/utkuozbulak 5 | """ 6 | 7 | import sys 8 | sys.path.append("..") 9 | 10 | import torch 11 | 12 | from src.utils import tensor2cuda, one_hot 13 | 14 | class VanillaBackprop(): 15 | """ 16 | Produces gradients generated with vanilla back propagation from the image 17 | """ 18 | def __init__(self, model): 19 | self.model = model 20 | 21 | def generate_gradients(self, input_image, target_class): 22 | # Put model in evaluation mode 23 | self.model.eval() 24 | 25 | x = input_image.clone() 26 | 27 | x.requires_grad = True 28 | 29 | with torch.enable_grad(): 30 | # Forward 31 | model_output = self.model(x) 32 | # Zero grads 33 | self.model.zero_grad() 34 | 35 | grad_outputs = one_hot(target_class, model_output.shape[1]) 36 | grad_outputs = tensor2cuda(grad_outputs) 37 | 38 | grad = torch.autograd.grad(model_output, x, grad_outputs=grad_outputs, 39 | only_inputs=True)[0] 40 | 41 | self.model.train() 42 | 43 | return grad 44 | -------------------------------------------------------------------------------- /cifar-10/train.sh: 
-------------------------------------------------------------------------------- 1 | python main.py --data_root '.' --affix std 2 | python main.py --data_root '.' -e 0.0157 -p 'linf' --adv_train --affix 'linf' 3 | python main.py --data_root '.' -e 0.314 -p 'l2' --adv_train --affix 'l2' 4 | -------------------------------------------------------------------------------- /cifar-10/visualize.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import torch 4 | import torchvision as tv 5 | import numpy as np 6 | 7 | from torch.utils.data import DataLoader 8 | 9 | from src.utils import makedirs, tensor2cuda, load_model, LabelDict 10 | from argument import parser 11 | from src.visualization import VanillaBackprop 12 | from src.model.madry_model import WideResNet 13 | 14 | import matplotlib.pyplot as plt 15 | 16 | img_folder = 'img' 17 | makedirs(img_folder) 18 | out_num = 5 19 | 20 | 21 | args = parser() 22 | 23 | label_dict = LabelDict(args.dataset) 24 | 25 | te_dataset = tv.datasets.CIFAR10(args.data_root, 26 | train=False, 27 | transform=tv.transforms.ToTensor(), 28 | download=True) 29 | 30 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4) 31 | 32 | 33 | for data, label in te_loader: 34 | 35 | data, label = tensor2cuda(data), tensor2cuda(label) 36 | 37 | 38 | break 39 | 40 | 41 | model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0) 42 | 43 | load_model(model, args.load_checkpoint) 44 | 45 | if torch.cuda.is_available(): 46 | model.cuda() 47 | 48 | VBP = VanillaBackprop(model) 49 | 50 | grad = VBP.generate_gradients(data, label) 51 | 52 | grad_flat = grad.view(grad.shape[0], -1) 53 | mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3) 54 | std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3) 55 | 56 | mean = mean.repeat(1, 1, data.shape[2], data.shape[3]) 57 | std = std.repeat(1, 1, data.shape[2], 
data.shape[3]) 58 | 59 | grad = torch.max(torch.min(grad, mean+3*std), mean-3*std) 60 | 61 | print(grad.min(), grad.max()) 62 | 63 | grad -= grad.min() 64 | 65 | grad /= grad.max() 66 | 67 | grad = grad.cpu().numpy().squeeze() # (N, 28, 28) 68 | 69 | grad *= 255.0 70 | 71 | label = label.cpu().numpy() 72 | 73 | data = data.cpu().numpy().squeeze() 74 | 75 | data *= 255.0 76 | 77 | out_list = [data, grad] 78 | 79 | types = ['Original', 'Your Model'] 80 | 81 | fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num) 82 | 83 | axs = _axs 84 | 85 | for j, _type in enumerate(types): 86 | axs[j, 0].set_ylabel(_type) 87 | 88 | # if j == 0: 89 | # cmap = 'gray' 90 | # else: 91 | # cmap = 'seismic' 92 | 93 | for i in range(out_num): 94 | axs[j, i].set_xlabel('%s' % label_dict.label2class(label[i])) 95 | img = out_list[j][i] 96 | # print(img) 97 | img = np.transpose(img, (1, 2, 0)) 98 | 99 | img = img.astype(np.uint8) 100 | axs[j, i].imshow(img) 101 | 102 | axs[j, i].get_xaxis().set_ticks([]) 103 | axs[j, i].get_yaxis().set_ticks([]) 104 | 105 | plt.tight_layout() 106 | plt.savefig(os.path.join(img_folder, 'cifar_grad_%s.jpg' % args.affix)) 107 | 108 | # types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained'] 109 | 110 | 111 | # model_checkpoints = ['checkpoint/cifar-10_std/checkpoint_76000.pth', 112 | # 'checkpoint/cifar-10_linf/checkpoint_76000.pth', 113 | # 'checkpoint/cifar-10_l2/checkpoint_76000.pth'] 114 | 115 | 116 | # out_list = [] 117 | 118 | # for checkpoint in model_checkpoints: 119 | 120 | # model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0) 121 | 122 | # load_model(model, checkpoint) 123 | 124 | # if torch.cuda.is_available(): 125 | # model.cuda() 126 | 127 | # VBP = VanillaBackprop(model) 128 | 129 | # grad = VBP.generate_gradients(data, label) 130 | 131 | # grad_flat = grad.view(grad.shape[0], -1) 132 | # mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3) 133 | # std = grad_flat.std(1, 
keepdim=True).unsqueeze(2).unsqueeze(3) 134 | 135 | # mean = mean.repeat(1, 1, data.shape[2], data.shape[3]) 136 | # std = std.repeat(1, 1, data.shape[2], data.shape[3]) 137 | 138 | # grad = torch.max(torch.min(grad, mean+3*std), mean-3*std) 139 | 140 | # print(grad.min(), grad.max()) 141 | 142 | # grad -= grad.min() 143 | 144 | # grad /= grad.max() 145 | 146 | # grad = grad.cpu().numpy().squeeze() # (N, 28, 28) 147 | 148 | # grad *= 255.0 149 | 150 | # out_list.append(grad) 151 | 152 | # data = data.cpu().numpy().squeeze() # (N, 28, 28) 153 | # data *= 255.0 154 | # label = label.cpu().numpy() 155 | 156 | # out_list.insert(0, data) 157 | 158 | # # normalize the grad 159 | # # length = torch.norm(grad, dim=3) 160 | # # length = torch.norm(length, dim=2) 161 | # # length = length.unsqueeze(2).unsqueeze(2) 162 | # # grad /= (length + 1e-5) 163 | 164 | # out_num = 5 165 | 166 | # fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num) 167 | 168 | # axs = _axs 169 | 170 | 171 | # for j, _type in enumerate(types): 172 | # axs[j, 0].set_ylabel(_type) 173 | 174 | # # if j == 0: 175 | # # cmap = 'gray' 176 | # # else: 177 | # # cmap = 'seismic' 178 | 179 | # for i in range(out_num): 180 | 181 | # data_id = i + 0 182 | 183 | # axs[j, i].set_xlabel('%s' % label_dict.label2class(label[data_id])) 184 | 185 | # img = out_list[j][data_id] 186 | # # print(img) 187 | # img = np.transpose(img, (1, 2, 0)) 188 | 189 | # img = img.astype(np.uint8) 190 | # axs[j, i].imshow(img) 191 | 192 | # axs[j, i].get_xaxis().set_ticks([]) 193 | # axs[j, i].get_yaxis().set_ticks([]) 194 | 195 | # plt.tight_layout() 196 | # plt.savefig(os.path.join(img_folder, 'cifar_grad_%s.jpg' % args.affix)) -------------------------------------------------------------------------------- /cifar-10/visualize_attack.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import torch 4 | import torchvision as tv 5 | import numpy as np 6 | 7 | from 
torch.utils.data import DataLoader 8 | 9 | from src.utils import makedirs, tensor2cuda, load_model, LabelDict 10 | from argument import parser 11 | from src.visualization import VanillaBackprop 12 | from src.attack import FastGradientSignUntargeted 13 | from src.model.madry_model import WideResNet 14 | 15 | import matplotlib.pyplot as plt 16 | 17 | max_epsilon = 4.7 18 | 19 | perturbation_type = 'l2' 20 | 21 | out_num = 5 22 | 23 | img_folder = 'img' 24 | makedirs(img_folder) 25 | 26 | args = parser() 27 | 28 | label_dict = LabelDict(args.dataset) 29 | 30 | te_dataset = tv.datasets.CIFAR10(args.data_root, 31 | train=False, 32 | transform=tv.transforms.ToTensor(), 33 | download=True) 34 | 35 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4) 36 | 37 | 38 | for data, label in te_loader: 39 | 40 | data, label = tensor2cuda(data), tensor2cuda(label) 41 | 42 | 43 | break 44 | 45 | 46 | adv_list = [] 47 | pred_list = [] 48 | 49 | with torch.no_grad(): 50 | 51 | model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0) 52 | 53 | load_model(model, args.load_checkpoint) 54 | 55 | if torch.cuda.is_available(): 56 | model.cuda() 57 | 58 | attack = FastGradientSignUntargeted(model, 59 | max_epsilon, 60 | args.alpha, 61 | min_val=0, 62 | max_val=1, 63 | max_iters=args.k, 64 | _type=perturbation_type) 65 | 66 | 67 | adv_data = attack.perturb(data, label, 'mean', False) 68 | 69 | output = model(adv_data, _eval=True) 70 | pred = torch.max(output, dim=1)[1] 71 | adv_list.append(adv_data.cpu().numpy().squeeze() * 255.0) # (N, 28, 28) 72 | pred_list.append(pred.cpu().numpy()) 73 | 74 | data = data.cpu().numpy().squeeze() # (N, 28, 28) 75 | data *= 255.0 76 | label = label.cpu().numpy() 77 | 78 | adv_list.insert(0, data) 79 | 80 | pred_list.insert(0, label) 81 | 82 | 83 | types = ['Original', 'Your Model'] 84 | 85 | fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num) 86 | 87 | axs = _axs 88 | 89 | for j, _type in 
enumerate(types): 90 | axs[j, 0].set_ylabel(_type) 91 | 92 | for i in range(out_num): 93 | axs[j, i].set_xlabel('%s' % label_dict.label2class(pred_list[j][i])) 94 | img = adv_list[j][i] 95 | # print(img) 96 | img = np.transpose(img, (1, 2, 0)) 97 | 98 | img = img.astype(np.uint8) 99 | axs[j, i].imshow(img) 100 | 101 | axs[j, i].get_xaxis().set_ticks([]) 102 | axs[j, i].get_yaxis().set_ticks([]) 103 | 104 | plt.tight_layout() 105 | plt.savefig(os.path.join(img_folder, 'cifar_large_%s_%s.jpg' % (perturbation_type, args.affix))) 106 | # plt.savefig(os.path.join(img_folder, 'test_%s.jpg' % (args.affix))) 107 | 108 | 109 | # types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained'] 110 | 111 | 112 | # model_checkpoints = ['checkpoint/cifar-10_std/checkpoint_76000.pth', 113 | # 'checkpoint/cifar-10_linf/checkpoint_76000.pth', 114 | # 'checkpoint/cifar-10_l2/checkpoint_76000.pth'] 115 | 116 | # adv_list = [] 117 | # pred_list = [] 118 | 119 | # max_epsilon = 4 120 | 121 | # perturbation_type = 'l2' 122 | 123 | # with torch.no_grad(): 124 | # for checkpoint in model_checkpoints: 125 | 126 | # model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0) 127 | 128 | # load_model(model, checkpoint) 129 | 130 | # if torch.cuda.is_available(): 131 | # model.cuda() 132 | 133 | # attack = FastGradientSignUntargeted(model, 134 | # max_epsilon, 135 | # args.alpha, 136 | # min_val=0, 137 | # max_val=1, 138 | # max_iters=args.k, 139 | # _type=perturbation_type) 140 | 141 | 142 | # adv_data = attack.perturb(data, label, 'mean', False) 143 | 144 | # output = model(adv_data, _eval=True) 145 | # pred = torch.max(output, dim=1)[1] 146 | # adv_list.append(adv_data.cpu().numpy().squeeze() * 255.0) # (N, 28, 28) 147 | # pred_list.append(pred.cpu().numpy()) 148 | 149 | # data = data.cpu().numpy().squeeze() # (N, 28, 28) 150 | # data *= 255.0 151 | # label = label.cpu().numpy() 152 | 153 | # adv_list.insert(0, data) 154 | 155 | # pred_list.insert(0, label) 156 
| 157 | # out_num = 5 158 | 159 | # fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num) 160 | 161 | # axs = _axs 162 | 163 | # for j, _type in enumerate(types): 164 | # axs[j, 0].set_ylabel(_type) 165 | 166 | # for i in range(out_num): 167 | # axs[j, i].set_xlabel('%s' % label_dict.label2class(pred_list[j][i])) 168 | # img = adv_list[j][i] 169 | # # print(img) 170 | # img = np.transpose(img, (1, 2, 0)) 171 | 172 | # img = img.astype(np.uint8) 173 | # axs[j, i].imshow(img) 174 | 175 | # axs[j, i].get_xaxis().set_ticks([]) 176 | # axs[j, i].get_yaxis().set_ticks([]) 177 | 178 | # plt.tight_layout() 179 | # plt.savefig(os.path.join(img_folder, 'cifar_large_%s_%s.jpg' % (perturbation_type, args.affix))) -------------------------------------------------------------------------------- /mnist/README.md: -------------------------------------------------------------------------------- 1 | # Adversarial Training and Visualization on MNIST 2 | 3 | 4 | ## Results 5 | 6 | ### Learning Curves 7 | 8 | Epsilon in linf (l2) training is 0.3 (1.5). 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 |
| Training | Standard Accuracy (train/test) | Robustness Accuracy (train/test) |
| --- | --- | --- |
| Standard | 100.00/99.32 | 0.00/0.61 |
| l_inf | 100.00/98.96 | 96.88/95.16 |
| l_2 | 100.00/99.41 | 100.00/97.48 |
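The test-set robustness accuracy is computed by attacking the model's own prediction rather than the ground-truth label (`Trainer.test` in `main.py` passes `pred` to `attack.perturb`). The toy sketch below is not the repository's code: `predict`, `attack`, and `robust_accuracy` are hypothetical stand-ins on a 1-D input, meant only to illustrate how that choice can make the measured robustness look higher when a sample is misclassified to begin with.

```python
def predict(x):
    # Toy 1-D classifier (an assumption for illustration): class 1 if x > 0, else class 0.
    return 1 if x > 0 else 0


def attack(x, target, epsilon):
    # One-step untargeted attack on the toy model: move x by epsilon
    # away from the region the model assigns to `target`.
    return x - epsilon if target == 1 else x + epsilon


def robust_accuracy(xs, labels, epsilon, use_predicted_label):
    # Accuracy is always measured against the ground truth; only the
    # label handed to the attack changes.
    correct = 0
    for x, y in zip(xs, labels):
        target = predict(x) if use_predicted_label else y
        x_adv = attack(x, target, epsilon)
        correct += int(predict(x_adv) == y)
    return correct / len(xs)


xs = [0.1, -0.5, 0.8]   # the first sample is misclassified before any attack
labels = [0, 0, 1]

# Attacking the predicted label pushes the first sample back to its true class.
print(robust_accuracy(xs, labels, 0.3, use_predicted_label=True))
# Attacking the ground-truth label leaves the first sample misclassified.
print(robust_accuracy(xs, labels, 0.3, use_predicted_label=False))
```

In this toy setting the first call counts all three samples as correct while the second counts only two, which mirrors how the test numbers above can exceed the training ones.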
40 | 41 | Note that at test time the label fed to the attack is the model's most confident prediction, not the ground truth. As a result, the test robustness can occasionally exceed the training robustness: when the initial prediction is wrong, pushing the input away from it may move it to the correct class. 42 | 43 | ### Visualization of Gradient with Respect to Input 44 | 45 | ![visualization](https://github.com/louis2889184/adversarial_training/blob/master/mnist/img/mnist_grad_default.jpg) 46 | 47 | ### Adversarial Examples with Large Epsilon 48 | 49 | The maximum epsilon is set to 4 (l2 norm) for these examples. 50 | 51 | ![large](https://github.com/louis2889184/adversarial_training/blob/master/mnist/img/mnist_large_l2_.jpg) 52 | 53 | 54 | ## Requirements: 55 | ``` 56 | python >= 3.5 57 | torch == 1.0 58 | torchvision == 0.2.1 59 | numpy >= 1.16.1 60 | matplotlib >= 3.0.2 61 | ``` 62 | 63 | ## Execution 64 | 65 | ### Training 66 | 67 | Standard training:
68 | 69 | ``` 70 | python main.py --data_root [data directory] 71 | ``` 72 | 73 | linf training:
74 | 75 | ``` 76 | python main.py --data_root [data directory] -e 0.3 -p 'linf' --adv_train 77 | ``` 78 | 79 | l2 training:
80 | 81 | ``` 82 | python main.py --data_root [data directory] -e 1.5 -p 'l2' --adv_train 83 | ``` 84 | 85 | ### Testing 86 | 87 | The command below runs l2 testing; change `-e` and `-p` if you want linf testing. 88 | ``` 89 | python main.py --todo test --data_root [data directory] -e 0.314 -p 'l2' --load_checkpoint [your_model.pth] 90 | ``` 91 | 92 | ### Visualization 93 | 94 | Change the settings in `visualize.py` and `visualize_attack.py` if you want linf visualization. 95 | 96 | Visualize the gradient with respect to the input:
97 | 98 | ``` 99 | python visualize.py --load_checkpoint [your_model.pth] 100 | ``` 101 | 102 | Visualize adversarial examples with a larger epsilon:
103 | 104 | ``` 105 | python visualize_attack.py --load_checkpoint [your_model.pth] 106 | ``` 107 | 108 | 109 | ## Training Time 110 | 111 | Standard training: 0.64 s / 100 iterations
112 | Adversarial training: 16 s / 100 iterations

113 | 114 | Timings were measured with a batch size of 64 on an NVIDIA GeForce GTX 1080. 115 | -------------------------------------------------------------------------------- /mnist/img/mnist_grad_default.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_grad_default.jpg -------------------------------------------------------------------------------- /mnist/img/mnist_large_l2_.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_large_l2_.jpg -------------------------------------------------------------------------------- /mnist/img/mnist_learning_curve_l2.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_l2.jpg -------------------------------------------------------------------------------- /mnist/img/mnist_learning_curve_linf.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_linf.jpg -------------------------------------------------------------------------------- /mnist/img/mnist_learning_curve_std.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_std.jpg -------------------------------------------------------------------------------- /mnist/main.py: -------------------------------------------------------------------------------- 1 | 
import os 2 | import torch 3 | import torch.nn as nn 4 | import torch.nn.functional as F 5 | from torch.utils.data import DataLoader 6 | 7 | import torchvision as tv 8 | 9 | from time import time 10 | from model import Model 11 | from attack import FastGradientSignUntargeted 12 | from utils import makedirs, create_logger, tensor2cuda, numpy2cuda, evaluate, save_model 13 | 14 | from argument import parser, print_args 15 | 16 | class Trainer(): 17 | def __init__(self, args, logger, attack): 18 | self.args = args 19 | self.logger = logger 20 | self.attack = attack 21 | 22 | def standard_train(self, model, tr_loader, va_loader=None): 23 | self.train(model, tr_loader, va_loader, False) 24 | 25 | def adversarial_train(self, model, tr_loader, va_loader=None): 26 | self.train(model, tr_loader, va_loader, True) 27 | 28 | def train(self, model, tr_loader, va_loader=None, adv_train=False): 29 | args = self.args 30 | logger = self.logger 31 | 32 | opt = torch.optim.Adam(model.parameters(), args.learning_rate) 33 | 34 | _iter = 0 35 | 36 | begin_time = time() 37 | 38 | for epoch in range(1, args.max_epoch+1): 39 | for data, label in tr_loader: 40 | data, label = tensor2cuda(data), tensor2cuda(label) 41 | 42 | if adv_train: 43 | # When training, the adversarial example is created from a random 44 | # close point to the original data point. If in evaluation mode, 45 | # just start from the original data point. 
46 | adv_data = self.attack.perturb(data, label, 'mean', True) 47 | output = model(adv_data, _eval=False) 48 | else: 49 | output = model(data, _eval=False) 50 | 51 | loss = F.cross_entropy(output, label) 52 | 53 | opt.zero_grad() 54 | loss.backward() 55 | opt.step() 56 | 57 | if _iter % args.n_eval_step == 0: 58 | 59 | if adv_train: 60 | with torch.no_grad(): 61 | stand_output = model(data, _eval=True) 62 | pred = torch.max(stand_output, dim=1)[1] 63 | 64 | # print(pred) 65 | std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100 66 | 67 | pred = torch.max(output, dim=1)[1] 68 | # print(pred) 69 | adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100 70 | 71 | else: 72 | adv_data = self.attack.perturb(data, label, 'mean', False) 73 | 74 | with torch.no_grad(): 75 | adv_output = model(adv_data, _eval=True) 76 | pred = torch.max(adv_output, dim=1)[1] 77 | # print(label) 78 | # print(pred) 79 | adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100 80 | 81 | pred = torch.max(output, dim=1)[1] 82 | # print(pred) 83 | std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100 84 | 85 | # only calculating the training time 86 | logger.info('epoch: %d, iter: %d, spent %.2f s, tr_loss: %.3f' % ( 87 | epoch, _iter, time() - begin_time, loss.item())) 88 | 89 | logger.info('standard acc: %.3f %%, robustness acc: %.3f %%' % ( 90 | std_acc, adv_acc)) 91 | 92 | if va_loader is not None: 93 | va_acc, va_adv_acc = self.test(model, va_loader, True) 94 | va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0 95 | 96 | logger.info('\n' + '='*30 + ' evaluation ' + '='*30) 97 | logger.info('test acc: %.3f %%, test adv acc: %.3f %%' % ( 98 | va_acc, va_adv_acc)) 99 | logger.info('='*28 + ' end of evaluation ' + '='*28 + '\n') 100 | 101 | begin_time = time() 102 | 103 | if _iter % args.n_store_image_step == 0: 104 | tv.utils.save_image(torch.cat([data.cpu(), adv_data.cpu()], dim=0), 105 | os.path.join(args.log_folder, 'images_%d.jpg' % _iter), 
106 | nrow=16) 107 | 108 | 109 | if _iter % args.n_checkpoint_step == 0: 110 | file_name = os.path.join(args.model_folder, 'checkpoint_%d.pth' % _iter) 111 | save_model(model, file_name) 112 | 113 | _iter += 1 114 | 115 | def test(self, model, loader, adv_test=False): 116 | # adv_test is False, return adv_acc as -1 117 | 118 | total_acc = 0.0 119 | num = 0 120 | total_adv_acc = 0.0 121 | 122 | with torch.no_grad(): 123 | for data, label in loader: 124 | data, label = tensor2cuda(data), tensor2cuda(label) 125 | 126 | output = model(data, _eval=True) 127 | 128 | pred = torch.max(output, dim=1)[1] 129 | te_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy(), 'sum') 130 | 131 | total_acc += te_acc 132 | num += output.shape[0] 133 | 134 | if adv_test: 135 | # use predicted label as target label 136 | # with torch.enable_grad(): 137 | adv_data = self.attack.perturb(data, pred, 'mean', False) 138 | 139 | adv_output = model(adv_data, _eval=True) 140 | 141 | adv_pred = torch.max(adv_output, dim=1)[1] 142 | adv_acc = evaluate(adv_pred.cpu().numpy(), label.cpu().numpy(), 'sum') 143 | total_adv_acc += adv_acc 144 | else: 145 | total_adv_acc = -num 146 | 147 | return total_acc / num , total_adv_acc / num 148 | 149 | def main(args): 150 | 151 | save_folder = '%s_%s' % (args.dataset, args.affix) 152 | 153 | log_folder = os.path.join(args.log_root, save_folder) 154 | model_folder = os.path.join(args.model_root, save_folder) 155 | 156 | makedirs(log_folder) 157 | makedirs(model_folder) 158 | 159 | setattr(args, 'log_folder', log_folder) 160 | setattr(args, 'model_folder', model_folder) 161 | 162 | logger = create_logger(log_folder, args.todo, 'info') 163 | 164 | print_args(args, logger) 165 | 166 | model = Model(i_c=1, n_c=10) 167 | 168 | attack = FastGradientSignUntargeted(model, 169 | args.epsilon, 170 | args.alpha, 171 | min_val=0, 172 | max_val=1, 173 | max_iters=args.k, 174 | _type=args.perturbation_type) 175 | 176 | if torch.cuda.is_available(): 177 | model.cuda() 178 | 
179 | trainer = Trainer(args, logger, attack) 180 | 181 | if args.todo == 'train': 182 | tr_dataset = tv.datasets.MNIST(args.data_root, 183 | train=True, 184 | transform=tv.transforms.ToTensor(), 185 | download=True) 186 | 187 | tr_loader = DataLoader(tr_dataset, batch_size=args.batch_size, shuffle=True, num_workers=4) 188 | 189 | # evaluation during training 190 | te_dataset = tv.datasets.MNIST(args.data_root, 191 | train=False, 192 | transform=tv.transforms.ToTensor(), 193 | download=True) 194 | 195 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4) 196 | 197 | trainer.train(model, tr_loader, te_loader, args.adv_train) 198 | elif args.todo == 'test': 199 | pass 200 | else: 201 | raise NotImplementedError 202 | 203 | 204 | 205 | 206 | if __name__ == '__main__': 207 | args = parser() 208 | 209 | os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu 210 | 211 | main(args) -------------------------------------------------------------------------------- /mnist/src/argument.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | 3 | def parser(): 4 | parser = argparse.ArgumentParser(description='Adversarial training on MNIST') 5 | parser.add_argument('--todo', choices=['train', 'valid', 'test', 'visualize'], default='train', 6 | help='which task to run: train | valid | test | visualize') 7 | parser.add_argument('--dataset', default='mnist', help='which dataset to use') 8 | parser.add_argument('--data_root', default='/home/yilin/Data', 9 | help='the directory to save the dataset') 10 | parser.add_argument('--log_root', default='log', 11 | help='the directory to save the logs or other information (e.g. 
images)') 12 | parser.add_argument('--model_root', default='checkpoint', help='the directory to save the models') 13 | parser.add_argument('--load_checkpoint', default='./model/default/model.pth') 14 | parser.add_argument('--affix', default='', help='the affix for the save folder') 15 | 16 | # parameters for generating adversarial examples 17 | parser.add_argument('--epsilon', '-e', type=float, default=0.3, 18 | help='maximum perturbation of adversaries') 19 | parser.add_argument('--alpha', '-a', type=float, default=0.01, 20 | help='movement multiplier per iteration when generating adversarial examples') 21 | parser.add_argument('--k', '-k', type=int, default=40, 22 | help='maximum number of iterations when generating adversarial examples') 23 | 24 | 25 | 26 | parser.add_argument('--batch_size', '-b', type=int, default=64, help='batch size') 27 | parser.add_argument('--max_epoch', '-m_e', type=int, default=60, 28 | help='the maximum number of times the model sees each sample (epochs)') 29 | parser.add_argument('--learning_rate', '-lr', type=float, default=1e-4, help='learning rate') 30 | 31 | parser.add_argument('--gpu', '-g', default='0', help='which gpu to use') 32 | parser.add_argument('--n_eval_step', type=int, default=100, 33 | help='number of iterations between evaluations') 34 | parser.add_argument('--n_checkpoint_step', type=int, default=2000, 35 | help='number of iterations between checkpoints') 36 | parser.add_argument('--n_store_image_step', type=int, default=2000, 37 | help='number of iterations between saving adversarial images') 38 | parser.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf', 39 | help='the type of the perturbation (linf or l2)') 40 | 41 | parser.add_argument('--adv_train', action='store_true') 42 | 43 | return parser.parse_args() 44 | 45 | def print_args(args, logger=None): 46 | for k, v in vars(args).items(): 47 | if logger is not None: 48 | logger.info('{:<16} : {}'.format(k, v)) 49 | else: 50 | print('{:<16} : {}'.format(k, v)) 
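The attack flags defined in `argument.py` interact: with the defaults, `k` steps of size `alpha` can move a pixel by at most `alpha * k = 0.4`, which is larger than `epsilon = 0.3`, so an l_inf attack is able to reach the boundary of the epsilon ball before projection. The following is a minimal sketch of that check, assuming only the four attack flags shown above; `build_attack_parser` and `max_linf_reach` are hypothetical helper names, not part of the repository.

```python
import argparse


def build_attack_parser():
    # Illustrative subset of the flags in argument.py (same names and defaults).
    p = argparse.ArgumentParser(description='attack flags (illustrative subset)')
    p.add_argument('--epsilon', '-e', type=float, default=0.3)
    p.add_argument('--alpha', '-a', type=float, default=0.01)
    p.add_argument('--k', '-k', type=int, default=40)
    p.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf')
    return p


def max_linf_reach(alpha, k):
    # Each signed-gradient step moves every pixel by at most alpha,
    # so k steps move a pixel by at most alpha * k.
    return alpha * k


# Parse an explicit argument list instead of sys.argv so the sketch is reproducible.
args = build_attack_parser().parse_args(['-p', 'linf'])
reach = max_linf_reach(args.alpha, args.k)
can_reach_boundary = reach >= args.epsilon
```

If `alpha * k` were smaller than `epsilon`, the projection step would never become active for an attack started at the original image, and the effective perturbation budget would be `alpha * k` rather than `epsilon`.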
-------------------------------------------------------------------------------- /mnist/src/attack/__init__.py: -------------------------------------------------------------------------------- 1 | from .fast_gradient_sign_untargeted import FastGradientSignUntargeted -------------------------------------------------------------------------------- /mnist/src/attack/fast_gradient_sign_untargeted.py: -------------------------------------------------------------------------------- 1 | """ 2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks 3 | 4 | original author: Utku Ozbulak - github.com/utkuozbulak 5 | """ 6 | import sys 7 | sys.path.append("..") 8 | 9 | import os 10 | import numpy as np 11 | 12 | import torch 13 | from torch import nn 14 | import torch.nn.functional as F 15 | 16 | from utils import tensor2cuda 17 | 18 | def project(x, original_x, epsilon, _type='linf'): 19 | 20 | if _type == 'linf': 21 | max_x = original_x + epsilon 22 | min_x = original_x - epsilon 23 | 24 | x = torch.max(torch.min(x, max_x), min_x) 25 | 26 | elif _type == 'l2': 27 | dist = (x - original_x) 28 | 29 | dist = dist.view(x.shape[0], -1) 30 | 31 | dist_norm = torch.norm(dist, dim=1, keepdim=True) 32 | 33 | mask = (dist_norm > epsilon).unsqueeze(2).unsqueeze(3) 34 | 35 | # dist = F.normalize(dist, p=2, dim=1) 36 | 37 | dist = dist / dist_norm.clamp(min=1e-12) # avoid 0/0 = NaN when x equals original_x 38 | 39 | dist *= epsilon 40 | 41 | dist = dist.view(x.shape) 42 | 43 | x = (original_x + dist) * mask.float() + x * (1 - mask.float()) 44 | 45 | else: 46 | raise NotImplementedError 47 | 48 | return x 49 | 50 | class FastGradientSignUntargeted(): 51 | """ 52 | Fast gradient sign untargeted adversarial attack, minimizes the initial class activation 53 | with iterative grad sign updates 54 | """ 55 | def __init__(self, model, epsilon, alpha, min_val, max_val, max_iters, _type='linf'): 56 | self.model = model 57 | # self.model.eval() 58 | 59 | # Maximum perturbation 60 | self.epsilon = epsilon 61 | # Movement
multiplier per iteration 62 | self.alpha = alpha 63 | # Minimum value of the pixels 64 | self.min_val = min_val 65 | # Maximum value of the pixels 66 | self.max_val = max_val 67 | # Maximum number of iterations used to generate adversaries 68 | self.max_iters = max_iters 69 | # Norm type of the perturbation ('linf' or 'l2') 70 | self._type = _type 71 | 72 | def perturb(self, original_images, labels, reduction4loss='mean', random_start=False): 73 | # original_images: values are within self.min_val and self.max_val 74 | 75 | # Optionally start the adversaries from random points close to the original data 76 | if random_start: 77 | rand_perturb = torch.FloatTensor(original_images.shape).uniform_( 78 | -self.epsilon, self.epsilon) 79 | rand_perturb = tensor2cuda(rand_perturb) 80 | x = original_images + rand_perturb 81 | x.clamp_(self.min_val, self.max_val) 82 | else: 83 | x = original_images.clone() 84 | 85 | x.requires_grad = True 86 | 87 | # max_x = original_images + self.epsilon 88 | # min_x = original_images - self.epsilon 89 | 90 | with torch.enable_grad(): 91 | for _iter in range(self.max_iters): 92 | outputs = self.model(x, _eval=True) 93 | 94 | loss = F.cross_entropy(outputs, labels, reduction=reduction4loss) 95 | 96 | if reduction4loss == 'none': 97 | grad_outputs = tensor2cuda(torch.ones(loss.shape)) 98 | 99 | else: 100 | grad_outputs = None 101 | 102 | grads = torch.autograd.grad(loss, x, grad_outputs=grad_outputs, 103 | only_inputs=True)[0] 104 | 105 | x.data += self.alpha * torch.sign(grads.data) 106 | 107 | # keep the adversaries inside the epsilon ball around the original 108 | # images (l_infinity / l2 restriction) 109 | x = project(x, original_images, self.epsilon, self._type) 110 | # the adversaries' values should be valid pixel values 111 | x.clamp_(self.min_val, self.max_val) 112 | 113 | return x 114 | -------------------------------------------------------------------------------- /mnist/src/model/__init__.py: 
-------------------------------------------------------------------------------- 1 | from .model import * -------------------------------------------------------------------------------- /mnist/src/model/model.py: -------------------------------------------------------------------------------- 1 | import torch 2 | import torch.nn as nn 3 | 4 | class Expression(nn.Module): 5 | def __init__(self, func): 6 | super(Expression, self).__init__() 7 | self.func = func 8 | 9 | def forward(self, input): 10 | return self.func(input) 11 | 12 | class Model(nn.Module): 13 | def __init__(self, i_c=1, n_c=10): 14 | super(Model, self).__init__() 15 | 16 | self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True) 17 | self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0) 18 | 19 | self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True) 20 | self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0) 21 | 22 | 23 | self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1)) 24 | self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True) 25 | self.fc2 = nn.Linear(1024, n_c) 26 | 27 | 28 | def forward(self, x_i, _eval=False): 29 | 30 | if _eval: 31 | # switch to eval mode 32 | self.eval() 33 | else: 34 | self.train() 35 | 36 | x_o = self.conv1(x_i) 37 | x_o = torch.relu(x_o) 38 | x_o = self.pool1(x_o) 39 | 40 | x_o = self.conv2(x_o) 41 | x_o = torch.relu(x_o) 42 | x_o = self.pool2(x_o) 43 | 44 | x_o = self.flatten(x_o) 45 | 46 | x_o = torch.relu(self.fc1(x_o)) 47 | 48 | self.train() 49 | 50 | return self.fc2(x_o) 51 | 52 | 53 | if __name__ == '__main__': 54 | i = torch.FloatTensor(4, 1, 28, 28) 55 | 56 | n = Model() 57 | 58 | print(n(i).size()) 59 | 60 | -------------------------------------------------------------------------------- /mnist/src/read_log.py: -------------------------------------------------------------------------------- 1 | import re 2 | import os 3 | from utils import makedirs 4 | import matplotlib.pyplot as plt 5 | 6 | file_name = 
'../log/mnist_l2_adv/train_log.txt' 7 | affix = 'l2' 8 | title = r'$l_2$ Training' 9 | 10 | img_folder = '../img' 11 | makedirs(img_folder) 12 | 13 | train_iter_list = [] 14 | train_acc_list = [] 15 | train_rob_list = [] 16 | 17 | test_iter_list = [] 18 | test_acc_list = [] 19 | test_rob_list = [] 20 | 21 | with open(file_name, 'r') as f: 22 | lines = f.readlines() 23 | 24 | for line in lines: 25 | splits = re.split('[, =%:\n]+', line) 26 | 27 | if splits[0] == 'epoch': 28 | _iter = int(splits[3]) 29 | train_iter_list.append(_iter) 30 | 31 | if splits[0] == 'standard': 32 | train_acc_list.append(float(splits[2])) 33 | train_rob_list.append(float(splits[5])) 34 | 35 | if splits[0] == 'test': 36 | test_iter_list.append(_iter) 37 | test_acc_list.append(float(splits[2])) 38 | test_rob_list.append(float(splits[6])) 39 | 40 | 41 | a_1 = plt.plot(train_iter_list, train_acc_list , color='r', label='train standard accuracy')[0] 42 | a_2 = plt.plot(test_iter_list, test_acc_list , color='r', linestyle='--', label='test standard accuracy')[0] 43 | 44 | b_1 = plt.plot(train_iter_list, train_rob_list , color='b', label='train robust accuracy')[0] 45 | b_2 = plt.plot(test_iter_list, test_rob_list , color='b', linestyle='--', label='test robust accuracy')[0] 46 | 47 | plt.title(title) 48 | 49 | plt.legend(handles=[a_1, a_2, b_1, b_2]) 50 | 51 | plt.savefig(os.path.join(img_folder, 'mnist_learning_curve_%s.jpg' % affix)) -------------------------------------------------------------------------------- /mnist/src/utils/__init__.py: -------------------------------------------------------------------------------- 1 | from .utils import * -------------------------------------------------------------------------------- /mnist/src/utils/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import json 3 | import logging 4 | 5 | import numpy as np 6 | 7 | import torch 8 | 9 | 10 | def list2cuda(_list): 11 | array = np.array(_list) 12 | 
return numpy2cuda(array) 13 | 14 | def numpy2cuda(array): 15 | tensor = torch.from_numpy(array) 16 | 17 | return tensor2cuda(tensor) 18 | 19 | def tensor2cuda(tensor): 20 | if torch.cuda.is_available(): 21 | tensor = tensor.cuda() 22 | 23 | return tensor 24 | 25 | def one_hot(ids, n_class): 26 | # --------------------- 27 | # author: ke1th 28 | # source: CSDN 29 | # article: https://blog.csdn.net/u012436149/article/details/77017832 30 | """ 31 | ids: 1-D LongTensor, shape: [batch_size] 32 | out_tensor: FloatTensor, shape: [batch_size, n_class] 33 | """ 34 | 35 | assert len(ids.shape) == 1, 'the ids should be 1-D' 36 | # ids = torch.LongTensor(ids).view(-1,1) 37 | 38 | out_tensor = torch.zeros(len(ids), n_class) 39 | 40 | out_tensor.scatter_(1, ids.cpu().unsqueeze(1), 1.) 41 | 42 | return out_tensor 43 | 44 | def evaluate(_input, _target, method='mean'): 45 | correct = (_input == _target).astype(np.float32) 46 | if method == 'mean': 47 | return correct.mean() 48 | else: 49 | return correct.sum() 50 | 51 | 52 | def create_logger(save_path='', file_type='', level='debug'): 53 | 54 | if level == 'debug': 55 | _level = logging.DEBUG 56 | elif level == 'info': 57 | _level = logging.INFO 58 | 59 | logger = logging.getLogger() 60 | logger.setLevel(_level) 61 | 62 | cs = logging.StreamHandler() 63 | cs.setLevel(_level) 64 | logger.addHandler(cs) 65 | 66 | if save_path != '': 67 | file_name = os.path.join(save_path, file_type + '_log.txt') 68 | fh = logging.FileHandler(file_name, mode='w') 69 | fh.setLevel(_level) 70 | 71 | logger.addHandler(fh) 72 | 73 | return logger 74 | 75 | def makedirs(path): 76 | if not os.path.exists(path): 77 | os.makedirs(path) 78 | 79 | def load_model(model, file_name): 80 | model.load_state_dict( 81 | torch.load(file_name, map_location=lambda storage, loc: storage)) 82 | 83 | def save_model(model, file_name): 84 | torch.save(model.state_dict(), file_name) -------------------------------------------------------------------------------- 
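`evaluate` offers a `'sum'` mode, and `test()` in `main.py` uses it to add up correct predictions per batch and divide by the total sample count at the end. A plausible reason is that the last batch of a loader can be smaller than the others, so averaging per-batch means would weight its samples more heavily. The sketch below illustrates the difference in plain Python; `count_correct` and `dataset_accuracy` are hypothetical names used only for this illustration.

```python
def count_correct(preds, labels):
    # Number of positions where the prediction equals the label ('sum' mode).
    return sum(1 for p, y in zip(preds, labels) if p == y)


def dataset_accuracy(batches):
    # batches: iterable of (preds, labels) pairs, possibly of unequal size.
    total_correct = 0
    total_seen = 0
    for preds, labels in batches:
        total_correct += count_correct(preds, labels)
        total_seen += len(labels)
    return total_correct / total_seen


batches = [([1, 2, 3, 4], [1, 2, 3, 0]),  # 3 of 4 correct
           ([7, 7], [7, 1])]              # 1 of 2 correct

acc = dataset_accuracy(batches)        # weights every sample equally
mean_of_means = (3 / 4 + 1 / 2) / 2    # weights every batch equally
```

Here the two numbers differ (4/6 versus 0.625), which is why accumulating sums is the safer choice when batch sizes are not all equal.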
/mnist/src/visualization/__init__.py: -------------------------------------------------------------------------------- 1 | from .vanilla_backprop import VanillaBackprop -------------------------------------------------------------------------------- /mnist/src/visualization/vanilla_backprop.py: -------------------------------------------------------------------------------- 1 | """ 2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks 3 | 4 | original author: Utku Ozbulak - github.com/utkuozbulak 5 | """ 6 | 7 | import sys 8 | sys.path.append("..") 9 | 10 | import torch 11 | 12 | from utils import tensor2cuda, one_hot 13 | 14 | class VanillaBackprop(): 15 | """ 16 | Produces gradients generated with vanilla back propagation from the image 17 | """ 18 | def __init__(self, model): 19 | self.model = model 20 | 21 | def generate_gradients(self, input_image, target_class): 22 | # Put model in evaluation mode 23 | self.model.eval() 24 | 25 | x = input_image.clone() 26 | 27 | x.requires_grad = True 28 | 29 | # Forward 30 | model_output = self.model(x) 31 | # Zero grads 32 | self.model.zero_grad() 33 | 34 | grad_outputs = one_hot(target_class, model_output.shape[1]) 35 | grad_outputs = tensor2cuda(grad_outputs) 36 | 37 | grad = torch.autograd.grad(model_output, x, grad_outputs=grad_outputs, 38 | only_inputs=True)[0] 39 | 40 | self.model.train() 41 | 42 | return grad 43 | -------------------------------------------------------------------------------- /mnist/visualize.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import torch 4 | import torchvision as tv 5 | import numpy as np 6 | 7 | from torch.utils.data import DataLoader 8 | 9 | from utils import makedirs, tensor2cuda, load_model 10 | from argument import parser 11 | from visualization import VanillaBackprop 12 | from model import Model 13 | 14 | import matplotlib.pyplot as plt 15 | 16 | img_folder = '../img' 17 | 
makedirs(img_folder) 18 | 19 | args = parser() 20 | 21 | 22 | te_dataset = tv.datasets.MNIST(args.data_root, 23 | train=False, 24 | transform=tv.transforms.ToTensor(), 25 | download=True) 26 | 27 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4) 28 | 29 | 30 | for data, label in te_loader: 31 | 32 | data, label = tensor2cuda(data), tensor2cuda(label) 33 | 34 | 35 | break 36 | 37 | types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained'] 38 | 39 | 40 | model_checkpoints = ['../checkpoint/mnist_std_train/checkpoint_56000.pth', 41 | '../checkpoint/mnist_adv_train/checkpoint_56000.pth', 42 | '../checkpoint/mnist_l2_adv/checkpoint_56000.pth'] 43 | 44 | 45 | out_list = [] 46 | 47 | for checkpoint in model_checkpoints: 48 | 49 | model = Model(i_c=1, n_c=10) 50 | 51 | load_model(model, checkpoint) 52 | 53 | if torch.cuda.is_available(): 54 | model.cuda() 55 | 56 | VBP = VanillaBackprop(model) 57 | 58 | grad = VBP.generate_gradients(data, label) 59 | 60 | grad_flat = grad.view(grad.shape[0], -1) 61 | mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3) 62 | std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3) 63 | 64 | mean = mean.repeat(1, 1, data.shape[2], data.shape[3]) 65 | std = std.repeat(1, 1, data.shape[2], data.shape[3]) 66 | 67 | grad = torch.max(torch.min(grad, mean+3*std), mean-3*std) 68 | 69 | print(grad.min(), grad.max()) 70 | 71 | grad -= grad.min() 72 | 73 | grad /= grad.max() 74 | 75 | grad = grad.cpu().numpy().squeeze() # (N, 28, 28) 76 | 77 | grad *= 255.0 78 | 79 | out_list.append(grad) 80 | 81 | data = data.cpu().numpy().squeeze() # (N, 28, 28) 82 | data *= 255.0 83 | label = label.cpu().numpy() 84 | 85 | out_list.insert(0, data) 86 | 87 | # normalize the grad 88 | # length = torch.norm(grad, dim=3) 89 | # length = torch.norm(length, dim=2) 90 | # length = length.unsqueeze(2).unsqueeze(2) 91 | # grad /= (length + 1e-5) 92 | 93 | out_num = 5 94 | 95 | fig, _axs = 
plt.subplots(nrows=len(out_list), ncols=out_num) 96 | 97 | axs = _axs 98 | 99 | 100 | for j, _type in enumerate(types): 101 | axs[j, 0].set_ylabel(_type) 102 | 103 | if j == 0: 104 | cmap = 'gray' 105 | else: 106 | cmap = 'seismic' 107 | 108 | for i in range(out_num): 109 | axs[j, i].set_xlabel('%d' % label[i]) 110 | axs[j, i].imshow(out_list[j][i], cmap=cmap) 111 | 112 | axs[j, i].get_xaxis().set_ticks([]) 113 | axs[j, i].get_yaxis().set_ticks([]) 114 | 115 | plt.tight_layout() 116 | plt.savefig(os.path.join(img_folder, 'mnist_grad_%s.jpg' % args.affix)) -------------------------------------------------------------------------------- /mnist/visualize_attack.py: -------------------------------------------------------------------------------- 1 | 2 | import os 3 | import torch 4 | import torchvision as tv 5 | import numpy as np 6 | 7 | from torch.utils.data import DataLoader 8 | 9 | from utils import makedirs, tensor2cuda, load_model 10 | from argument import parser 11 | from visualization import VanillaBackprop 12 | from attack import FastGradientSignUntargeted 13 | from model import Model 14 | 15 | import matplotlib.pyplot as plt 16 | 17 | img_folder = '../img' 18 | makedirs(img_folder) 19 | 20 | args = parser() 21 | 22 | 23 | te_dataset = tv.datasets.MNIST(args.data_root, 24 | train=False, 25 | transform=tv.transforms.ToTensor(), 26 | download=True) 27 | 28 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4) 29 | 30 | 31 | for data, label in te_loader: 32 | 33 | data, label = tensor2cuda(data), tensor2cuda(label) 34 | 35 | 36 | break 37 | 38 | types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained'] 39 | 40 | 41 | model_checkpoints = ['../checkpoint/mnist_std_train/checkpoint_56000.pth', 42 | '../checkpoint/mnist_adv_train/checkpoint_56000.pth', 43 | '../checkpoint/mnist_l2_adv/checkpoint_56000.pth'] 44 | 45 | adv_list = [] 46 | pred_list = [] 47 | 48 | max_epsilon = 0.8 49 | 50 | perturbation_type = 
'linf' 51 | 52 | with torch.no_grad(): 53 | for checkpoint in model_checkpoints: 54 | 55 | model = Model(i_c=1, n_c=10) 56 | 57 | load_model(model, checkpoint) 58 | 59 | if torch.cuda.is_available(): 60 | model.cuda() 61 | 62 | attack = FastGradientSignUntargeted(model, 63 | max_epsilon, 64 | args.alpha, 65 | min_val=0, 66 | max_val=1, 67 | max_iters=args.k, 68 | _type=perturbation_type) 69 | 70 | 71 | adv_data = attack.perturb(data, label, 'mean', False) 72 | 73 | output = model(adv_data, _eval=True) 74 | pred = torch.max(output, dim=1)[1] 75 | adv_list.append(adv_data.cpu().numpy().squeeze()) # (N, 28, 28) 76 | pred_list.append(pred.cpu().numpy()) 77 | 78 | data = data.cpu().numpy().squeeze() # (N, 28, 28) 79 | data *= 255.0 80 | label = label.cpu().numpy() 81 | 82 | adv_list.insert(0, data) 83 | 84 | pred_list.insert(0, label) 85 | 86 | out_num = 5 87 | 88 | fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num) 89 | 90 | axs = _axs 91 | 92 | cmap = 'gray' 93 | for j, _type in enumerate(types): 94 | axs[j, 0].set_ylabel(_type) 95 | 96 | for i in range(out_num): 97 | axs[j, i].set_xlabel('%d' % pred_list[j][i]) 98 | axs[j, i].imshow(adv_list[j][i], cmap=cmap) 99 | 100 | axs[j, i].get_xaxis().set_ticks([]) 101 | axs[j, i].get_yaxis().set_ticks([]) 102 | 103 | plt.tight_layout() 104 | plt.savefig(os.path.join(img_folder, 'mnist_large_%s_%s.jpg' % (perturbation_type, args.affix))) --------------------------------------------------------------------------------
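The attack used by `visualize_attack.py` relies on the projection step in `mnist/src/attack/fast_gradient_sign_untargeted.py`, which pulls each adversarial example back into the epsilon ball around its original image. As a rough illustration of the two cases (`linf` and `l2`), here is a dependency-free sketch on flat lists of floats. It is not the repository's tensor implementation; `project_linf` and `project_l2` are hypothetical names, and the inputs stand for a single flattened image.

```python
import math


def project_linf(x, original_x, epsilon):
    # Clip every coordinate to [original - epsilon, original + epsilon].
    return [min(max(a, b - epsilon), b + epsilon) for a, b in zip(x, original_x)]


def project_l2(x, original_x, epsilon):
    # Rescale the perturbation only if its Euclidean norm exceeds epsilon.
    delta = [a - b for a, b in zip(x, original_x)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= epsilon:
        # Also covers norm == 0, so no division by zero can occur.
        return list(x)
    scale = epsilon / norm
    return [b + d * scale for b, d in zip(original_x, delta)]


inside = project_l2([0.1, 0.0], [0.0, 0.0], 1.0)    # already inside the ball
outside = project_l2([3.0, 4.0], [0.0, 0.0], 1.0)   # norm 5, rescaled to norm 1
clipped = project_linf([0.9, 0.1], [0.5, 0.5], 0.3)
```

The l_inf case clips each pixel independently, while the l2 case keeps the direction of the perturbation and only shortens it, which is why the two perturbation types tend to produce visually different adversarial images.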