├── .gitignore
├── README.md
├── cifar-10
│   ├── README.md
│   ├── img
│   │   ├── cifar_grad_default.jpg
│   │   ├── cifar_large_l2_default.jpg
│   │   ├── cifar_large_linf_default.jpg
│   │   ├── cifar_learning_curve_l2.jpg
│   │   ├── cifar_learning_curve_linf.jpg
│   │   └── cifar_learning_curve_std.jpg
│   ├── main.py
│   ├── src
│   │   ├── argument.py
│   │   ├── attack
│   │   │   ├── __init__.py
│   │   │   └── fast_gradient_sign_untargeted.py
│   │   ├── model
│   │   │   ├── __init__.py
│   │   │   ├── madry_model.py
│   │   │   └── model.py
│   │   ├── utils
│   │   │   ├── __init__.py
│   │   │   └── utils.py
│   │   └── visualization
│   │       ├── __init__.py
│   │       └── vanilla_backprop.py
│   ├── train.sh
│   ├── visualize.py
│   └── visualize_attack.py
└── mnist
    ├── README.md
    ├── img
    │   ├── mnist_grad_default.jpg
    │   ├── mnist_large_l2_.jpg
    │   ├── mnist_learning_curve_l2.jpg
    │   ├── mnist_learning_curve_linf.jpg
    │   └── mnist_learning_curve_std.jpg
    ├── main.py
    ├── src
    │   ├── argument.py
    │   ├── attack
    │   │   ├── __init__.py
    │   │   └── fast_gradient_sign_untargeted.py
    │   ├── model
    │   │   ├── __init__.py
    │   │   └── model.py
    │   ├── read_log.py
    │   ├── utils
    │   │   ├── __init__.py
    │   │   └── utils.py
    │   └── visualization
    │       ├── __init__.py
    │       └── vanilla_backprop.py
    ├── visualize.py
    └── visualize_attack.py


/.gitignore:
--------------------------------------------------------------------------------
 1 | __pycache__/
 2 | log/
 3 | 
 4 | # Compiled source #
 5 | ###################
 6 | *.com
 7 | *.class
 8 | *.dll
 9 | *.exe
10 | *.o
11 | *.so
12 | sftp-config.json
13 | read_log.py
14 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Adversarial Training and Visualization
 2 | 
 3 | This repo is a PyTorch 1.0 implementation of adversarial training on MNIST and CIFAR-10. It also reproduces part of the visualization results in [1]. <br/><br/>
 4 | 
 5 | **Note**: Not an official implementation.
 6 | 
 7 | ## Adversarial Training
 8 | 
 9 | <table align="center">
10 |     <tbody> 
11 |     <tr> 
12 |         <th colspan="2"> Objective Function  </th>
13 |     </tr>
14 |     <tr>        
15 |         <td width="50%" align="center"> Standard Training </td>
16 |         <td width="50%" align="center"> Adversarial Training </td>
17 |     </tr>
18 |     <tr>
19 |         <td width="50%" align="center"> <img src="https://latex.codecogs.com/gif.latex?\min&space;\textrm{E}_{(x,&space;y)&space;\in&space;Dataset}[L(x,&space;y;&space;\theta)]" title="\min \textrm{E}_{(x, y) \in Dataset}[L(x, y; \theta)]"> </td>
20 |         <td width="50%" align="center"> <img src="https://latex.codecogs.com/gif.latex?\min&space;\textrm{E}_{(x,&space;y)&space;\in&space;Dataset}[\max_{{\left&space;\|&space;\delta&space;\right&space;\|}_p&space;<&space;\epsilon}&space;L(x&plus;\delta,&space;y;&space;\theta)]" title="\min \textrm{E}_{(x, y) \in Dataset}[\max_{{\left \| \delta \right \|}_p < \epsilon} L(x+\delta, y; \theta)]"> </td>
21 |     </tr>
22 |     </tbody>
23 | </table>
24 | 
25 | where p in the table is usually 2 or inf. <br/><br/>
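
The choice of p determines how a perturbed input is pulled back into the allowed region around the original point. The following is a minimal pure-Python sketch, assuming flat lists of floats in place of image tensors; `project_linf` and `project_l2` are illustrative names, not functions from this repo.

```python
import math


def project_linf(x, original_x, epsilon):
    """Clip each coordinate of x into [original - epsilon, original + epsilon]."""
    return [min(max(xi, oi - epsilon), oi + epsilon) for xi, oi in zip(x, original_x)]


def project_l2(x, original_x, epsilon):
    """Rescale x - original_x so that its Euclidean norm is at most epsilon."""
    delta = [xi - oi for xi, oi in zip(x, original_x)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= epsilon:
        return list(x)  # already inside the ball (this also covers norm == 0)
    scale = epsilon / norm
    return [oi + d * scale for oi, d in zip(original_x, delta)]


original = [0.5, 0.5, 0.5]      # hypothetical clean input
perturbed = [0.9, 0.5, 0.2]     # hypothetical point outside both balls
linf_point = project_linf(perturbed, original, 0.1)
l2_point = project_l2(perturbed, original, 0.1)
```

The linf projection clips every coordinate independently, while the l2 projection shrinks the whole perturbation vector along its own direction.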
26 | 
27 | The objectives of standard and adversarial training are fundamentally different. In standard training, the classifier minimizes the loss computed on the original training data, while in adversarial training it minimizes the worst-case loss over perturbations of norm less than epsilon around each original data point.
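
To make the difference concrete, it helps to look at a case where the inner maximization has a closed form. This is a minimal sketch, assuming a linear classifier with logistic loss and labels in {-1, +1}; that simplification is chosen only for illustration, since the repo trains deep networks and approximates the inner maximum iteratively. For such a linear model, the worst-case perturbation with linf norm at most epsilon is `-y * epsilon * sign(w)`.

```python
import math


def logistic_loss(w, x, y):
    """Logistic loss log(1 + exp(-y * <w, x>)) for a label y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return math.log1p(math.exp(-margin))


def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def worst_case_linf(w, x, y, epsilon):
    """Closed-form loss maximizer over ||delta||_inf <= epsilon (linear model only)."""
    return [xi - y * epsilon * sign(wi) for wi, xi in zip(w, x)]


w = [2.0, -1.0, 0.5]   # hypothetical weights
x = [0.3, 0.2, 0.9]    # hypothetical input
y = 1
epsilon = 0.1

standard_loss = logistic_loss(w, x, y)
adversarial_loss = logistic_loss(w, worst_case_linf(w, x, y, epsilon), y)
```

Standard training would minimize `standard_loss`, whereas adversarial training would minimize `adversarial_loss`, which is never smaller.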
28 | 
29 | ## Visualization
30 | 
31 | In [1], the authors find that the features learned by a robust classifier are more human-perceivable. Related results are shown in the mnist and cifar-10 folders.
32 | 
33 | ## Implementation
34 | 
35 | Part of the code in this repo is borrowed or modified from [2], [3], [4] and [5].
36 | 
37 | ## References:
38 | 
39 | [1] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, A. Madry. *Robustness May Be at Odds with Accuracy*, https://arxiv.org/abs/1805.12152
40 | 
41 | [2] https://github.com/MadryLab/mnist_challenge
42 | 
43 | [3] https://github.com/MadryLab/cifar10_challenge
44 | 
45 | [4] https://github.com/xternalz/WideResNet-pytorch
46 | 
47 | [5] https://github.com/utkuozbulak/pytorch-cnn-visualizations
48 | 
49 | 
50 | ## Contact 
51 | Yi-Lin Sung, corumlouis123@gmail.com
52 | 


--------------------------------------------------------------------------------
/cifar-10/README.md:
--------------------------------------------------------------------------------
  1 | # Adversarial Training and Visualization on CIFAR-10
  2 | 
  3 | 
  4 | ## Update
  5 | * (2020/8/27)
  6 | 1. To match the implementation of [madry_cifar10](https://github.com/MadryLab/cifar10_challenge), the default learning rate is now `0.1`, the model's activation function is `LeakyReLU(0.1)`, and the optimizer is `torch.optim.SGD`.
  7 | 2. Add a new experiment in *Quantitative Results* that matches the results in [madry_cifar10](https://github.com/MadryLab/cifar10_challenge).
  8 | 3. Add checkpoints for the updated model and delete the old ones.
  9 | 4. Update the code structure: move `main.py`, `visualize.py` and `visualize_attack.py` out of the `src` folder.
 10 | * (2019/12/14) 
 11 | 1. Add `madry_model.py`, which contains the same model used in [madry_cifar10](https://github.com/MadryLab/cifar10_challenge), to `src/model`. 
 12 | 2. Add `count_parameters(model)` to `model.py` and `madry_model.py`; it computes the number of trainable parameters.
 13 | 3. Add the flag `use_pseudo_label` to `trainer.test()`. It determines whether the model's prediction is used as the attack target; the default value is `False`.
 14 | 4. Update the "Experiments" and "Execution" parts of this README. 
 15 | 5. Checkpoints of adversarial training in cifar-10 are provided.
 16 | * (2019/4/18) Change the default alpha from 2 to 2/255, and update the results.
 17 | 
 18 | ## Results
 19 | 
 20 | Note that each experiment was run only once.
 21 | 
 22 | ### Learning Curves
 23 | 
 24 | Epsilon in linf (l2) training is 0.0157 (0.314). [0.0157=4/255, 0.314=80/255]
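
The bracketed values are budgets on the 0-255 pixel scale divided by 255, because the inputs are scaled to [0, 1]. A quick check of the arithmetic, where the helper name `to_unit_scale` is illustrative and not part of the repo:

```python
def to_unit_scale(pixel_budget):
    """Convert a budget on the 0-255 pixel scale to the [0, 1] input scale."""
    return pixel_budget / 255


linf_epsilon = round(to_unit_scale(4), 4)   # epsilon used for linf training
l2_epsilon = round(to_unit_scale(80), 3)    # epsilon used for l2 training
alpha = round(to_unit_scale(2), 5)          # default step size in src/argument.py
```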
 25 | 
 26 | <table border=0 width="50px" >
 27 |     <tbody> 
 28 |     <tr>    
 29 |         <th colspan="2" align="center"> <strong>Standard Training</strong> </th>
 30 |         <th colspan="2" align="center"> <strong>l_inf Training</strong> </th>
 31 |         <th colspan="2" align="center"> <strong>l_2 Training</strong></th>
 32 |     </tr>
 33 |     <tr>
 34 |         <th colspan="2" align="center"> <img src="https://github.com/louis2889184/adversarial_training/blob/master/cifar-10/img/cifar_learning_curve_std.jpg"> </th>
 35 |         <th colspan="2" align="center"> <img src="https://github.com/louis2889184/adversarial_training/blob/master/cifar-10/img/cifar_learning_curve_linf.jpg"> </th>
 36 |         <th colspan="2" align="center"> <img src="https://github.com/louis2889184/adversarial_training/blob/master/cifar-10/img/cifar_learning_curve_l2.jpg"> </th>
 37 |     </tr>
 38 |     <tr>
 39 |         <th colspan="1" align="center"> <strong>Standard Accuracy</strong> <br/> (train/test) </th>
 40 |         <th colspan="1" align="center"> <strong>Robustness Accuracy</strong> <br/> (train/test) </th>
 41 |         <th colspan="1" align="center"> <strong>Standard Accuracy</strong> <br/> (train/test) </th>
 42 |         <th colspan="1" align="center"> <strong>Robustness Accuracy</strong> <br/> (train/test) </th>
 43 |         <th colspan="1" align="center"> <strong>Standard Accuracy</strong> <br/> (train/test) </th>
 44 |         <th colspan="1" align="center"> <strong>Robustness Accuracy</strong> <br/> (train/test) </th>
 45 |     </tr>
 46 |     <tr>
 47 |         <th colspan="1" align="center"> 92.19/87.14 </th>
 48 |         <th colspan="1" align="center"> 0.00/7.85 </th>
 49 |         <th colspan="1" align="center"> 79.69/78.09 </th>
 50 |         <th colspan="1" align="center"> 61.72/63.8 </th>
 51 |         <th colspan="1" align="center"> 89.84/85.39 </th>
 52 |         <th colspan="1" align="center"> 76.56/77.76 </th>
 53 |     </tr>
 54 |     <tr>
 55 |         <th colspan="1" align="center"> <strong>Madry's Model <br/>Standard Accuracy</strong> <br/> (train/test) </th>
 56 |         <th colspan="1" align="center"> <strong>Madry's Model <br/>Robustness Accuracy</strong> <br/> (train/test) </th>
 57 |         <th colspan="1" align="center"> <strong>Madry's Model <br/>Standard Accuracy</strong> <br/> (train/test) </th>
 58 |         <th colspan="1" align="center"> <strong>Madry's Model <br/>Robustness Accuracy</strong> <br/> (train/test) </th>
 59 |         <th colspan="1" align="center"> <strong>Madry's Model <br/>Standard Accuracy</strong> <br/> (train/test) </th>
 60 |         <th colspan="1" align="center"> <strong>Madry's Model <br/>Robustness Accuracy</strong> <br/> (train/test) </th>
 61 |     </tr>
 62 |     <tr>
 63 |         <th colspan="1" align="center"> - </th>
 64 |         <th colspan="1" align="center"> - </th>
 65 |         <th colspan="1" align="center"> -/79.22 </th>
 66 |         <th colspan="1" align="center"> -/55.97 </th>
 67 |         <th colspan="1" align="center"> -/85.81 </th>
 68 |         <th colspan="1" align="center"> -/71.87 </th>
 69 |     </tr>
 70 |     </tbody>
 71 | </table>
 72 | 
 73 | (This note applies only to the results that are not from Madry's model.) In testing mode, the target label used to create the adversarial example is the model's most confident prediction, not the ground truth. Therefore, the testing robustness is sometimes higher than the training robustness, which can happen when the initial prediction is wrong. <br/>
 74 | 
 75 | The learning rate is changed during training as follows: <br/>
 76 | 
 77 | * `0.1` in iterations `[0, 40000)`
 78 | * `0.01` in iterations `[40000, 60000)`
 79 | * `0.001` in iterations `[60000, 76000]`
 80 | 
 81 | This schedule follows https://github.com/MadryLab/cifar10_challenge.
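
The schedule can be written as a step function of the iteration count. This is a minimal pure-Python sketch, assuming each milestone multiplies the rate by 0.1, which is what `main.py` requests from `MultiStepLR` with `milestones=[40000, 60000]` and `gamma=0.1`; `lr_at` is an illustrative helper, not part of the repo.

```python
from bisect import bisect_right

BASE_LR = 0.1
MILESTONES = [40000, 60000]
GAMMA = 0.1


def lr_at(iteration):
    """Learning rate in effect at a given 0-indexed training iteration."""
    # bisect_right counts how many milestones have been reached so far.
    return BASE_LR * GAMMA ** bisect_right(MILESTONES, iteration)


for it in (0, 39999, 40000, 60000):
    print(it, lr_at(it))
```

Because the scheduler in `main.py` is stepped once per iteration rather than once per epoch, the milestones are iteration counts.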
 82 | 
 83 | 
 84 | ### Quantitative Results
 85 | 
 86 | * Defense model, standard accuracy = 86.66% (linf, epsilon=8/255), trained against a PGD attack with 7 steps of size 2
 87 | 
 88 | |   Attack                                      | Robust Test Accuracy  |
 89 | |   :---:                                       |  :---:                |
 90 | |   PGD with 10 steps of size 2 (cross-entropy) |    48.37%             |
 91 | |   PGD with 20 steps of size 1 (cross-entropy) |    48.04%             |
 92 | 
 93 | ### Visualization of Gradient with Respect to Input
 94 | 
 95 | ![visualization](https://github.com/louis2889184/adversarial_training/blob/master/cifar-10/img/cifar_grad_default.jpg)
 96 | 
 97 | ### The Adversarial Example with large epsilon
 98 | 
 99 | The maximum epsilon is set to 4.7 (l2 norm) in this part.
100 | 
101 | ![large](https://github.com/louis2889184/adversarial_training/blob/master/cifar-10/img/cifar_large_l2_default.jpg)
102 | 
103 | 
104 | ## Requirements:
105 | ```
106 | python >= 3.5
107 | torch >= 1.0
108 | torchvision >= 0.2.1
109 | numpy >= 1.16.1
110 | matplotlib >= 3.0.2
111 | ```
112 | 
113 | ## Execution
114 | 
115 | ### Training
116 | 
117 | Standard training: <br/>
118 | 
119 | ```
120 | python main.py --data_root [data directory]
121 | ```
122 | 
123 | linf training: <br/>
124 | 
125 | ```
126 | python main.py --data_root [data directory] -e 0.0157 -p 'linf' --adv_train --affix 'linf'
127 | ```
128 | 
129 | l2 training: <br/>
130 | 
131 | ```
132 | python main.py --data_root [data directory] -e 0.314 -p 'l2' --adv_train --affix 'l2'
133 | ```
134 | 
135 | ### Testing
136 | 
137 | Change the settings (`-e` and `-p`) if you want to do linf testing.
138 | ```
139 | python main.py --todo test --data_root [data directory] -e 0.314 -p 'l2' --load_checkpoint [your_model.pth]
140 | ```
141 | 
142 | ### Visualization
143 | 
144 | Change the settings in `visualize.py` and `visualize_attack.py` if you want to do linf visualization.
145 | 
146 | Visualize the gradient with respect to the input: <br/>
147 | 
148 | ```
149 | python visualize.py --load_checkpoint [your_model.pth]
150 | ```
151 | 
152 | Visualize adversarial examples with a larger epsilon: <br/>
153 | 
154 | ```
155 | python visualize_attack.py --load_checkpoint [your_model.pth]
156 | ```
157 | 
158 | ## Checkpoints
159 | ### linf
160 | * epsilon=8/255, train on PGD attack with 7 steps of size 2: [checkpoint](https://drive.google.com/file/d/1-3AfpkLvPje5poY9ZettY05N8kZgFRAV/view?usp=sharing) <br/>
161 | 
162 | ## Training Time
163 | 
164 | Standard training: 78 s / 100 iterations <br/>
165 | Adversarial training: 784 s / 100 iterations <br/> <br/>
166 | 
167 | These timings were measured with a batch size of 128 on an NVIDIA GeForce GTX 1080.
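
For a rough sense of scale, the per-100-iteration timings can be extrapolated to a full run. This is back-of-the-envelope arithmetic, assuming a 76000-iteration run (the end of the learning-rate schedule) and constant throughput; it is an estimate, not a measured result.

```python
STANDARD_S_PER_100 = 78
ADVERSARIAL_S_PER_100 = 784
TOTAL_ITERATIONS = 76000  # assumption: end of the learning-rate schedule


def estimated_hours(seconds_per_100_iters, iterations=TOTAL_ITERATIONS):
    """Extrapolate wall-clock hours from a per-100-iteration timing."""
    return seconds_per_100_iters * (iterations / 100) / 3600


slowdown = ADVERSARIAL_S_PER_100 / STANDARD_S_PER_100
standard_hours = estimated_hours(STANDARD_S_PER_100)
adversarial_hours = estimated_hours(ADVERSARIAL_S_PER_100)
print(f"slowdown: {slowdown:.1f}x")
print(f"standard: {standard_hours:.1f} h, adversarial: {adversarial_hours:.1f} h")
```

The roughly tenfold slowdown is plausibly explained by the attack itself, which runs `k` extra forward and backward passes per training step (the default `k` in `src/argument.py` is 10).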
168 | 


--------------------------------------------------------------------------------
/cifar-10/img/cifar_grad_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_grad_default.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_large_l2_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_large_l2_default.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_large_linf_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_large_linf_default.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_l2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_l2.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_linf.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_linf.jpg


--------------------------------------------------------------------------------
/cifar-10/img/cifar_learning_curve_std.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/cifar-10/img/cifar_learning_curve_std.jpg


--------------------------------------------------------------------------------
/cifar-10/main.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import torch
  3 | import torch.nn as nn
  4 | import torch.nn.functional as F
  5 | from torch.utils.data import DataLoader
  6 | 
  7 | import torchvision as tv
  8 | 
  9 | from time import time
 10 | from src.model.madry_model import WideResNet
 11 | from src.attack import FastGradientSignUntargeted
 12 | from src.utils import makedirs, create_logger, tensor2cuda, numpy2cuda, evaluate, save_model
 13 | 
 14 | from src.argument import parser, print_args
 15 | 
 16 | class Trainer():
 17 |     def __init__(self, args, logger, attack):
 18 |         self.args = args
 19 |         self.logger = logger
 20 |         self.attack = attack
 21 | 
 22 |     def standard_train(self, model, tr_loader, va_loader=None):
 23 |         self.train(model, tr_loader, va_loader, False)
 24 | 
 25 |     def adversarial_train(self, model, tr_loader, va_loader=None):
 26 |         self.train(model, tr_loader, va_loader, True)
 27 | 
 28 |     def train(self, model, tr_loader, va_loader=None, adv_train=False):
 29 |         args = self.args
 30 |         logger = self.logger
 31 | 
 32 |         opt = torch.optim.SGD(model.parameters(), args.learning_rate, 
 33 |                               weight_decay=args.weight_decay,
 34 |                               momentum=args.momentum)
 35 |         scheduler = torch.optim.lr_scheduler.MultiStepLR(opt, 
 36 |                                                          milestones=[40000, 60000], 
 37 |                                                          gamma=0.1)
 38 |         _iter = 0
 39 | 
 40 |         begin_time = time()
 41 | 
 42 |         for epoch in range(1, args.max_epoch+1):
 43 |             for data, label in tr_loader:
 44 |                 data, label = tensor2cuda(data), tensor2cuda(label)
 45 | 
 46 |                 if adv_train:
 47 |                     # When training, the adversarial example is created from a random 
 48 |                     # close point to the original data point. If in evaluation mode, 
 49 |                     # just start from the original data point.
 50 |                     adv_data = self.attack.perturb(data, label, 'mean', True)
 51 |                     output = model(adv_data, _eval=False)
 52 |                 else:
 53 |                     output = model(data, _eval=False)
 54 | 
 55 |                 loss = F.cross_entropy(output, label)
 56 | 
 57 |                 opt.zero_grad()
 58 |                 loss.backward()
 59 |                 opt.step()
 60 | 
 61 |                 if _iter % args.n_eval_step == 0:
 62 |                     t1 = time()
 63 | 
 64 |                     if adv_train:
 65 |                         with torch.no_grad():
 66 |                             stand_output = model(data, _eval=True)
 67 |                         pred = torch.max(stand_output, dim=1)[1]
 68 | 
 69 |                         # print(pred)
 70 |                         std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 71 | 
 72 |                         pred = torch.max(output, dim=1)[1]
 73 |                         # print(pred)
 74 |                         adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 75 | 
 76 |                     else:
 77 |                         
 78 |                         adv_data = self.attack.perturb(data, label, 'mean', False)
 79 | 
 80 |                         with torch.no_grad():
 81 |                             adv_output = model(adv_data, _eval=True)
 82 |                         pred = torch.max(adv_output, dim=1)[1]
 83 |                         # print(label)
 84 |                         # print(pred)
 85 |                         adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 86 | 
 87 |                         pred = torch.max(output, dim=1)[1]
 88 |                         # print(pred)
 89 |                         std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 90 | 
 91 |                     t2 = time()
 92 | 
 93 |                     logger.info(f'epoch: {epoch}, iter: {_iter}, lr={opt.param_groups[0]["lr"]}, '
 94 |                                 f'spent {time()-begin_time:.2f} s, tr_loss: {loss.item():.3f}')
 95 | 
 96 |                     logger.info(f'standard acc: {std_acc:.3f}%, robustness acc: {adv_acc:.3f}%')
 97 | 
 98 |                     # begin_time = time()
 99 | 
100 |                     # if va_loader is not None:
101 |                     #     va_acc, va_adv_acc = self.test(model, va_loader, True)
102 |                     #     va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0
103 | 
104 |                     #     logger.info('\n' + '='*30 + ' evaluation ' + '='*30)
105 |                     #     logger.info('test acc: %.3f %%, test adv acc: %.3f %%, spent: %.3f' % (
106 |                     #         va_acc, va_adv_acc, time() - begin_time))
107 |                     #     logger.info('='*28 + ' end of evaluation ' + '='*28 + '\n')
108 | 
109 |                     begin_time = time()
110 | 
111 |                 if _iter % args.n_store_image_step == 0:
112 |                     tv.utils.save_image(torch.cat([data.cpu(), adv_data.cpu()], dim=0), 
113 |                                         os.path.join(args.log_folder, f'images_{_iter}.jpg'), 
114 |                                         nrow=16)
115 | 
116 |                 if _iter % args.n_checkpoint_step == 0:
117 |                     file_name = os.path.join(args.model_folder, f'checkpoint_{_iter}.pth')
118 |                     save_model(model, file_name)
119 | 
120 |                 _iter += 1
121 |                 # the scheduler is stepped once per training iteration, not per epoch
122 |                 scheduler.step()
123 | 
124 |             if va_loader is not None:
125 |                 t1 = time()
126 |                 va_acc, va_adv_acc = self.test(model, va_loader, True, False)
127 |                 va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0
128 | 
129 |                 t2 = time()
130 |                 logger.info('\n'+'='*20 +f' evaluation at epoch: {epoch} iteration: {_iter} ' \
131 |                     +'='*20)
132 |                 logger.info(f'test acc: {va_acc:.3f}%, test adv acc: {va_adv_acc:.3f}%, spent: {t2-t1:.3f} s')
133 |                 logger.info('='*28+' end of evaluation '+'='*28+'\n')
134 | 
135 | 
136 |     def test(self, model, loader, adv_test=False, use_pseudo_label=False):
137 |         # If adv_test is False, the returned adversarial accuracy is -1.
138 | 
139 |         total_acc = 0.0
140 |         num = 0
141 |         total_adv_acc = 0.0
142 | 
143 |         with torch.no_grad():
144 |             for data, label in loader:
145 |                 data, label = tensor2cuda(data), tensor2cuda(label)
146 | 
147 |                 output = model(data, _eval=True)
148 | 
149 |                 pred = torch.max(output, dim=1)[1]
150 |                 te_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy(), 'sum')
151 |                 
152 |                 total_acc += te_acc
153 |                 num += output.shape[0]
154 | 
155 |                 if adv_test:
156 |                     # use predicted label as target label
157 |                     with torch.enable_grad():
158 |                         adv_data = self.attack.perturb(data, 
159 |                                                        pred if use_pseudo_label else label, 
160 |                                                        'mean', 
161 |                                                        False)
162 | 
163 |                     adv_output = model(adv_data, _eval=True)
164 | 
165 |                     adv_pred = torch.max(adv_output, dim=1)[1]
166 |                     adv_acc = evaluate(adv_pred.cpu().numpy(), label.cpu().numpy(), 'sum')
167 |                     total_adv_acc += adv_acc
168 |                 else:
169 |                     total_adv_acc = -num
170 | 
171 |         return total_acc / num , total_adv_acc / num
172 | 
173 | def main(args):
174 | 
175 |     save_folder = '%s_%s' % (args.dataset, args.affix)
176 | 
177 |     log_folder = os.path.join(args.log_root, save_folder)
178 |     model_folder = os.path.join(args.model_root, save_folder)
179 | 
180 |     makedirs(log_folder)
181 |     makedirs(model_folder)
182 | 
183 |     setattr(args, 'log_folder', log_folder)
184 |     setattr(args, 'model_folder', model_folder)
185 | 
186 |     logger = create_logger(log_folder, args.todo, 'info')
187 | 
188 |     print_args(args, logger)
189 | 
190 |     model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
191 | 
192 |     attack = FastGradientSignUntargeted(model, 
193 |                                         args.epsilon, 
194 |                                         args.alpha, 
195 |                                         min_val=0, 
196 |                                         max_val=1, 
197 |                                         max_iters=args.k, 
198 |                                         _type=args.perturbation_type)
199 | 
200 |     if torch.cuda.is_available():
201 |         model.cuda()
202 | 
203 |     trainer = Trainer(args, logger, attack)
204 | 
205 |     if args.todo == 'train':
206 |         transform_train = tv.transforms.Compose([
207 |                 tv.transforms.RandomCrop(32, padding=4, fill=0, padding_mode='constant'),
208 |                 tv.transforms.RandomHorizontalFlip(),
209 |                 tv.transforms.ToTensor(),
210 |             ])
211 |         tr_dataset = tv.datasets.CIFAR10(args.data_root, 
212 |                                        train=True, 
213 |                                        transform=transform_train, 
214 |                                        download=True)
215 | 
216 |         tr_loader = DataLoader(tr_dataset, batch_size=args.batch_size, shuffle=True, num_workers=4)
217 | 
218 |         # evaluation during training
219 |         te_dataset = tv.datasets.CIFAR10(args.data_root, 
220 |                                        train=False, 
221 |                                        transform=tv.transforms.ToTensor(), 
222 |                                        download=True)
223 | 
224 |         te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
225 | 
226 |         trainer.train(model, tr_loader, te_loader, args.adv_train)
227 |     elif args.todo == 'test':
228 |         te_dataset = tv.datasets.CIFAR10(args.data_root, 
229 |                                        train=False, 
230 |                                        transform=tv.transforms.ToTensor(), 
231 |                                        download=True)
232 | 
233 |         te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
234 | 
235 |         checkpoint = torch.load(args.load_checkpoint)
236 |         model.load_state_dict(checkpoint)
237 | 
238 |         std_acc, adv_acc = trainer.test(model, te_loader, adv_test=True, use_pseudo_label=False)
239 | 
240 |         print(f"std acc: {std_acc * 100:.3f}%, adv_acc: {adv_acc * 100:.3f}%")
241 | 
242 |     else:
243 |         raise NotImplementedError
244 |     
245 | 
246 | 
247 | 
248 | if __name__ == '__main__':
249 |     args = parser()
250 | 
251 |     os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
252 | 
253 |     main(args)


--------------------------------------------------------------------------------
/cifar-10/src/argument.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | 
 3 | def parser():
 4 |     parser = argparse.ArgumentParser(description='Adversarial Training on CIFAR-10')
 5 |     parser.add_argument('--todo', choices=['train', 'valid', 'test', 'visualize'], default='train',
 6 |         help='which action to run: train | valid | test | visualize')
 7 |     parser.add_argument('--dataset', default='cifar-10', help='use what dataset')
 8 |     parser.add_argument('--data_root', default='/home/yilin/Data', 
 9 |         help='the directory to save the dataset')
10 |     parser.add_argument('--log_root', default='log', 
11 |         help='the directory to save the logs or other information (e.g. images)')
12 |     parser.add_argument('--model_root', default='checkpoint', help='the directory to save the models')
13 |     parser.add_argument('--load_checkpoint', default='./model/default/model.pth')
14 |     parser.add_argument('--affix', default='default', help='the affix for the save folder')
15 | 
16 |     # parameters for generating adversarial examples
17 |     parser.add_argument('--epsilon', '-e', type=float, default=0.0157, 
18 |         help='maximum perturbation of adversaries (4/255=0.0157)')
19 |     parser.add_argument('--alpha', '-a', type=float, default=0.00784, 
20 |         help='movement multiplier per iteration when generating adversarial examples (2/255=0.00784)')
21 |     parser.add_argument('--k', '-k', type=int, default=10, 
22 |         help='maximum iteration when generating adversarial examples')
23 | 
24 |     parser.add_argument('--batch_size', '-b', type=int, default=128, help='batch size')
25 |     parser.add_argument('--max_epoch', '-m_e', type=int, default=200, 
26 |         help='the maximum number of training epochs')
27 |     parser.add_argument('--learning_rate', '-lr', type=float, default=0.1, help='learning rate')
28 |     parser.add_argument('--momentum', '-m', type=float, default=0.9, help='momentum for optimizer')
29 |     parser.add_argument('--weight_decay', '-w', type=float, default=2e-4, 
30 |         help='the parameter of l2 restriction for weights')
31 | 
32 |     parser.add_argument('--gpu', '-g', default='0', help='which gpu to use')
33 |     parser.add_argument('--n_eval_step', type=int, default=100, 
34 |         help='number of iteration per one evaluation')
35 |     parser.add_argument('--n_checkpoint_step', type=int, default=4000, 
36 |         help='number of iteration to save a checkpoint')
37 |     parser.add_argument('--n_store_image_step', type=int, default=4000, 
38 |         help='number of iteration to save adversaries')
39 |     parser.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf', 
40 |         help='the type of the perturbation (linf or l2)')
41 |     
42 |     parser.add_argument('--adv_train', action='store_true')
43 | 
44 |     return parser.parse_args()
45 | 
46 | def print_args(args, logger=None):
47 |     for k, v in vars(args).items():
48 |         if logger is not None:
49 |             logger.info('{:<16} : {}'.format(k, v))
50 |         else:
51 |             print('{:<16} : {}'.format(k, v))


--------------------------------------------------------------------------------
/cifar-10/src/attack/__init__.py:
--------------------------------------------------------------------------------
1 | from .fast_gradient_sign_untargeted import FastGradientSignUntargeted


--------------------------------------------------------------------------------
/cifar-10/src/attack/fast_gradient_sign_untargeted.py:
--------------------------------------------------------------------------------
  1 | """
  2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks
  3 | 
  4 | original author: Utku Ozbulak - github.com/utkuozbulak
  5 | """
  6 | import sys
  7 | sys.path.append("..")
  8 | 
  9 | import os
 10 | import numpy as np
 11 | 
 12 | import torch
 13 | from torch import nn
 14 | import torch.nn.functional as F
 15 | 
 16 | from src.utils import tensor2cuda
 17 | 
 18 | def project(x, original_x, epsilon, _type='linf'):
 19 | 
 20 |     if _type == 'linf':
 21 |         max_x = original_x + epsilon
 22 |         min_x = original_x - epsilon
 23 | 
 24 |         x = torch.max(torch.min(x, max_x), min_x)
 25 | 
 26 |     elif _type == 'l2':
 27 |         dist = (x - original_x)
 28 | 
 29 |         dist = dist.view(x.shape[0], -1)
 30 | 
 31 |         dist_norm = torch.norm(dist, dim=1, keepdim=True)
 32 | 
 33 |         mask = (dist_norm > epsilon).unsqueeze(2).unsqueeze(3)
 34 | 
 35 |         # dist = F.normalize(dist, p=2, dim=1)
 36 | 
 37 |         dist = dist / torch.clamp(dist_norm, min=1e-12)  # avoid 0/0 = NaN when x == original_x
 38 | 
 39 |         dist *= epsilon
 40 | 
 41 |         dist = dist.view(x.shape)
 42 | 
 43 |         x = (original_x + dist) * mask.float() + x * (1 - mask.float())
 44 | 
 45 |     else:
 46 |         raise NotImplementedError
 47 | 
 48 |     return x
 49 | 
 50 | class FastGradientSignUntargeted():
 51 |     """
 52 |         Untargeted iterative gradient-sign attack: increases the cross-entropy loss
 53 |         for the given labels with iterative grad sign updates
 54 |     """
 55 |     def __init__(self, model, epsilon, alpha, min_val, max_val, max_iters, _type='linf'):
 56 |         self.model = model
 57 |         # self.model.eval()
 58 | 
 59 |         # Maximum perturbation
 60 |         self.epsilon = epsilon
 61 |         # Movement multiplier per iteration
 62 |         self.alpha = alpha
 63 |         # Minimum value of the pixels
 64 |         self.min_val = min_val
 65 |         # Maximum value of the pixels
 66 |         self.max_val = max_val
 67 |         # Maximum number of iterations used to generate adversaries
 68 |         self.max_iters = max_iters
 69 |         # Norm used for the epsilon constraint: 'linf' or 'l2'
 70 |         self._type = _type
 71 |         
 72 |     def perturb(self, original_images, labels, reduction4loss='mean', random_start=False):
 73 |         # original_images: values are within self.min_val and self.max_val
 74 | 
 75 |         # Optionally start from random points close to the original data
 76 |         if random_start:
 77 |             rand_perturb = torch.FloatTensor(original_images.shape).uniform_(
 78 |                 -self.epsilon, self.epsilon)
 79 |             rand_perturb = tensor2cuda(rand_perturb)
 80 |             x = original_images + rand_perturb
 81 |             x.clamp_(self.min_val, self.max_val)
 82 |         else:
 83 |             x = original_images.clone()
 84 | 
 85 |         x.requires_grad = True 
 86 | 
 87 |         # max_x = original_images + self.epsilon
 88 |         # min_x = original_images - self.epsilon
 89 | 
 90 |         self.model.eval()
 91 | 
 92 |         with torch.enable_grad():
 93 |             for _iter in range(self.max_iters):
 94 |                 outputs = self.model(x, _eval=True)
 95 | 
 96 |                 loss = F.cross_entropy(outputs, labels, reduction=reduction4loss)
 97 | 
 98 |                 if reduction4loss == 'none':
 99 |                     grad_outputs = tensor2cuda(torch.ones(loss.shape))
100 |                     
101 |                 else:
102 |                     grad_outputs = None
103 | 
104 |                 grads = torch.autograd.grad(loss, x, grad_outputs=grad_outputs, 
105 |                         only_inputs=True)[0]
106 | 
107 |                 x.data += self.alpha * torch.sign(grads.data) 
108 | 
109 |                 # project the adversaries back into the epsilon-ball around
110 |                 # the original images (l_infinity or l2 constraint)
111 |                 x = project(x, original_images, self.epsilon, self._type)
112 |                 # keep the adversaries within the valid pixel range
113 |                 x.clamp_(self.min_val, self.max_val)
114 | 
115 |         self.model.train()
116 | 
117 |         return x
118 | 
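
The two branches of `project` above can be illustrated without PyTorch. The following is a minimal sketch on plain Python lists, not the repository's implementation; the helper names `project_linf` and `project_l2` are hypothetical and exist only for this example. It assumes a single flattened image rather than a batch.

```python
import math

def project_linf(x, original, epsilon):
    """Clip every coordinate of x into [original - epsilon, original + epsilon]."""
    return [max(min(xi, oi + epsilon), oi - epsilon) for xi, oi in zip(x, original)]

def project_l2(x, original, epsilon):
    """Pull x back onto the L2 ball of radius epsilon around original if it lies outside."""
    delta = [xi - oi for xi, oi in zip(x, original)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= epsilon:
        # Already inside the ball (this also covers norm == 0): leave x unchanged.
        return list(x)
    scale = epsilon / norm
    return [oi + d * scale for oi, d in zip(original, delta)]

# Coordinates outside the band are clipped, the third one is left alone.
print(project_linf([0.5, -0.3, 0.05], [0.0, 0.0, 0.0], 0.1))
# A perturbation of length 5 is rescaled to length 1.
print(project_l2([3.0, 4.0], [0.0, 0.0], 1.0))
```

Handling the inside-the-ball case with an early return plays the same role as the `mask` in the tensor version, and it sidesteps the division by a zero norm.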


--------------------------------------------------------------------------------
/cifar-10/src/model/__init__.py:
--------------------------------------------------------------------------------
1 | from .model import *


--------------------------------------------------------------------------------
/cifar-10/src/model/madry_model.py:
--------------------------------------------------------------------------------
  1 | # code adapted from https://github.com/xternalz/WideResNet-pytorch/blob/master/wideresnet.py
  2 | # original author: xternalz
  3 | 
  4 | import math
  5 | import torch
  6 | import torch.nn as nn
  7 | import torch.nn.functional as F
  8 | 
  9 | from src.utils import count_parameters
 10 | 
 11 | class Expression(nn.Module):
 12 |     def __init__(self, func):
 13 |         super(Expression, self).__init__()
 14 |         self.func = func
 15 |     
 16 |     def forward(self, input):
 17 |         return self.func(input)
 18 | 
 19 | class Model(nn.Module):
 20 |     def __init__(self, i_c=1, n_c=10):
 21 |         super(Model, self).__init__()
 22 | 
 23 |         self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
 24 |         self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
 25 | 
 26 |         self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
 27 |         self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
 28 | 
 29 | 
 30 |         self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
 31 |         self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
 32 |         self.fc2 = nn.Linear(1024, n_c)
 33 | 
 34 | 
 35 |     def forward(self, x_i, _eval=False):
 36 | 
 37 |         if _eval:
 38 |             # switch to eval mode
 39 |             self.eval()
 40 |         else:
 41 |             self.train()
 42 |             
 43 |         x_o = self.conv1(x_i)
 44 |         x_o = torch.relu(x_o)
 45 |         x_o = self.pool1(x_o)
 46 | 
 47 |         x_o = self.conv2(x_o)
 48 |         x_o = torch.relu(x_o)
 49 |         x_o = self.pool2(x_o)
 50 | 
 51 |         x_o = self.flatten(x_o)
 52 | 
 53 |         x_o = torch.relu(self.fc1(x_o))
 54 | 
 55 |         self.train()
 56 | 
 57 |         return self.fc2(x_o)
 58 | 
 59 | class ChannelPadding(nn.Module):
 60 |     def __init__(self, in_planes, out_planes):
 61 |         super(ChannelPadding, self).__init__()
 62 | 
 63 |         self.register_buffer("padding", 
 64 |                              torch.zeros((out_planes - in_planes) // 2).view(1, -1, 1, 1))
 65 | 
 66 |     def forward(self, input):
 67 |         assert len(input.size()) == 4, "only support for 4-D tensor for now"
 68 | 
 69 |         padding = self.padding.expand(input.size(0), -1, input.size(2), input.size(3))
 70 | 
 71 |         return torch.cat([padding, input, padding], dim=1)
 72 | 
 73 | class BasicBlock(nn.Module):
 74 |     def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
 75 |         super(BasicBlock, self).__init__()
 76 |         self.bn1 = nn.BatchNorm2d(in_planes)
 77 |         self.relu1 = nn.LeakyReLU(0.1, inplace=True)
 78 |         self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
 79 |                                padding=1, bias=False)
 80 |         self.bn2 = nn.BatchNorm2d(out_planes)
 81 |         self.relu2 = nn.LeakyReLU(0.1, inplace=True)
 82 |         self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
 83 |                                padding=1, bias=False)
 84 |         self.droprate = dropRate
 85 |         self.equalInOut = (in_planes == out_planes)
 86 |         # self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
 87 |         #                        padding=0, bias=False) or None
 88 |         self.poolpadShortcut = nn.Sequential(
 89 |             nn.AvgPool2d(kernel_size=stride, stride=stride),
 90 |             ChannelPadding(in_planes, out_planes)
 91 |         )
 92 |     def forward(self, x):
 93 |         if not self.equalInOut:
 94 |             x = self.relu1(self.bn1(x))
 95 |         else:
 96 |             out = self.relu1(self.bn1(x))
 97 |         out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
 98 |         if self.droprate > 0:
 99 |             out = F.dropout(out, p=self.droprate, training=self.training)
100 |         out = self.conv2(out)
101 |         # return torch.add(x if self.equalInOut else self.convShortcut(x), out)
102 |         return torch.add(
103 |             x if self.equalInOut else self.poolpadShortcut(x),
104 |             out
105 |         )
106 | 
107 | class NetworkBlock(nn.Module):
108 |     def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
109 |         super(NetworkBlock, self).__init__()
110 |         self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
111 |     def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
112 |         layers = []
113 |         for i in range(int(nb_layers)):
114 |             layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
115 |         return nn.Sequential(*layers)
116 |     def forward(self, x):
117 |         return self.layer(x)
118 | 
119 | class WideResNet(nn.Module):
120 |     def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
121 |         super(WideResNet, self).__init__()
122 |         nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
123 |         assert((depth - 4) % 6 == 0)
124 |         n = (depth - 4) // 6  # number of residual blocks per group
125 |         block = BasicBlock
126 |         # 1st conv before any network block
127 |         self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
128 |                                padding=1, bias=False)
129 |         # 1st block
130 |         self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
131 |         # 2nd block
132 |         self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
133 |         # 3rd block
134 |         self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
135 |         # global average pooling and classifier
136 |         self.bn1 = nn.BatchNorm2d(nChannels[3])
137 |         self.relu = nn.LeakyReLU(0.1, inplace=True)
138 |         self.fc = nn.Linear(nChannels[3], num_classes)
139 |         self.nChannels = nChannels[3]
140 | 
141 |         for m in self.modules():
142 |             if isinstance(m, nn.Conv2d):
143 |                 n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
144 |                 m.weight.data.normal_(0, math.sqrt(2. / n))
145 |             elif isinstance(m, nn.BatchNorm2d):
146 |                 m.weight.data.fill_(1)
147 |                 m.bias.data.zero_()
148 |             elif isinstance(m, nn.Linear):
149 |                 m.bias.data.zero_()
150 | 
151 |     def forward(self, x, _eval=False):
152 |         if _eval:
153 |             # switch to eval mode
154 |             self.eval()
155 |         else:
156 |             self.train()
157 | 
158 |         out = self.conv1(x)
159 |         out = self.block1(out)
160 |         out = self.block2(out)
161 |         out = self.block3(out)
162 |         out = self.relu(self.bn1(out))
163 |         out = F.avg_pool2d(out, 8)
164 |         out = out.view(-1, self.nChannels)
165 | 
166 |         self.train()
167 | 
168 |         return self.fc(out)
169 | 
170 | 
171 | if __name__ == '__main__':
172 |     i = torch.FloatTensor(4, 3, 32, 32)
173 | 
174 |     n = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
175 | 
176 |     i = i.cuda()
177 |     n = n.cuda()
178 | 
179 |     print(n(i).size())
180 | 
181 |     print(count_parameters(n))
182 | 
183 | 


--------------------------------------------------------------------------------
/cifar-10/src/model/model.py:
--------------------------------------------------------------------------------
  1 | # code adapted from https://github.com/xternalz/WideResNet-pytorch/blob/master/wideresnet.py
  2 | # original author: xternalz
  3 | 
  4 | import math
  5 | import torch
  6 | import torch.nn as nn
  7 | import torch.nn.functional as F
  8 | 
  9 | from src.utils import count_parameters
 10 | 
 11 | class Expression(nn.Module):
 12 |     def __init__(self, func):
 13 |         super(Expression, self).__init__()
 14 |         self.func = func
 15 |     
 16 |     def forward(self, input):
 17 |         return self.func(input)
 18 | 
 19 | class Model(nn.Module):
 20 |     def __init__(self, i_c=1, n_c=10):
 21 |         super(Model, self).__init__()
 22 | 
 23 |         self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
 24 |         self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
 25 | 
 26 |         self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
 27 |         self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
 28 | 
 29 | 
 30 |         self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
 31 |         self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
 32 |         self.fc2 = nn.Linear(1024, n_c)
 33 | 
 34 | 
 35 |     def forward(self, x_i, _eval=False):
 36 | 
 37 |         if _eval:
 38 |             # switch to eval mode
 39 |             self.eval()
 40 |         else:
 41 |             self.train()
 42 |             
 43 |         x_o = self.conv1(x_i)
 44 |         x_o = torch.relu(x_o)
 45 |         x_o = self.pool1(x_o)
 46 | 
 47 |         x_o = self.conv2(x_o)
 48 |         x_o = torch.relu(x_o)
 49 |         x_o = self.pool2(x_o)
 50 | 
 51 |         x_o = self.flatten(x_o)
 52 | 
 53 |         x_o = torch.relu(self.fc1(x_o))
 54 | 
 55 |         self.train()
 56 | 
 57 |         return self.fc2(x_o)
 58 | 
 59 | 
 60 | 
 61 | 
 62 | 
 63 | class BasicBlock(nn.Module):
 64 |     def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
 65 |         super(BasicBlock, self).__init__()
 66 |         self.bn1 = nn.BatchNorm2d(in_planes)
 67 |         self.relu1 = nn.LeakyReLU(0.1, inplace=True)
 68 |         self.conv1 = nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride,
 69 |                                padding=1, bias=False)
 70 |         self.bn2 = nn.BatchNorm2d(out_planes)
 71 |         self.relu2 = nn.LeakyReLU(0.1, inplace=True)
 72 |         self.conv2 = nn.Conv2d(out_planes, out_planes, kernel_size=3, stride=1,
 73 |                                padding=1, bias=False)
 74 |         self.droprate = dropRate
 75 |         self.equalInOut = (in_planes == out_planes)
 76 |         self.convShortcut = (not self.equalInOut) and nn.Conv2d(in_planes, out_planes, kernel_size=1, stride=stride,
 77 |                                padding=0, bias=False) or None
 78 |     def forward(self, x):
 79 |         if not self.equalInOut:
 80 |             x = self.relu1(self.bn1(x))
 81 |         else:
 82 |             out = self.relu1(self.bn1(x))
 83 |         out = self.relu2(self.bn2(self.conv1(out if self.equalInOut else x)))
 84 |         if self.droprate > 0:
 85 |             out = F.dropout(out, p=self.droprate, training=self.training)
 86 |         out = self.conv2(out)
 87 |         return torch.add(x if self.equalInOut else self.convShortcut(x), out)
 88 | 
 89 | class NetworkBlock(nn.Module):
 90 |     def __init__(self, nb_layers, in_planes, out_planes, block, stride, dropRate=0.0):
 91 |         super(NetworkBlock, self).__init__()
 92 |         self.layer = self._make_layer(block, in_planes, out_planes, nb_layers, stride, dropRate)
 93 |     def _make_layer(self, block, in_planes, out_planes, nb_layers, stride, dropRate):
 94 |         layers = []
 95 |         for i in range(int(nb_layers)):
 96 |             layers.append(block(i == 0 and in_planes or out_planes, out_planes, i == 0 and stride or 1, dropRate))
 97 |         return nn.Sequential(*layers)
 98 |     def forward(self, x):
 99 |         return self.layer(x)
100 | 
101 | class WideResNet(nn.Module):
102 |     def __init__(self, depth, num_classes, widen_factor=1, dropRate=0.0):
103 |         super(WideResNet, self).__init__()
104 |         nChannels = [16, 16*widen_factor, 32*widen_factor, 64*widen_factor]
105 |         assert((depth - 4) % 6 == 0)
106 |         n = (depth - 4) // 6  # number of residual blocks per group
107 |         block = BasicBlock
108 |         # 1st conv before any network block
109 |         self.conv1 = nn.Conv2d(3, nChannels[0], kernel_size=3, stride=1,
110 |                                padding=1, bias=False)
111 |         # 1st block
112 |         self.block1 = NetworkBlock(n, nChannels[0], nChannels[1], block, 1, dropRate)
113 |         # 2nd block
114 |         self.block2 = NetworkBlock(n, nChannels[1], nChannels[2], block, 2, dropRate)
115 |         # 3rd block
116 |         self.block3 = NetworkBlock(n, nChannels[2], nChannels[3], block, 2, dropRate)
117 |         # global average pooling and classifier
118 |         self.bn1 = nn.BatchNorm2d(nChannels[3])
119 |         self.relu = nn.LeakyReLU(0.1, inplace=True)
120 |         self.fc = nn.Linear(nChannels[3], num_classes)
121 |         self.nChannels = nChannels[3]
122 | 
123 |         for m in self.modules():
124 |             if isinstance(m, nn.Conv2d):
125 |                 n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
126 |                 m.weight.data.normal_(0, math.sqrt(2. / n))
127 |             elif isinstance(m, nn.BatchNorm2d):
128 |                 m.weight.data.fill_(1)
129 |                 m.bias.data.zero_()
130 |             elif isinstance(m, nn.Linear):
131 |                 m.bias.data.zero_()
132 | 
133 |     def forward(self, x, _eval=False):
134 |         if _eval:
135 |             # switch to eval mode
136 |             self.eval()
137 |         else:
138 |             self.train()
139 | 
140 |         out = self.conv1(x)
141 |         out = self.block1(out)
142 |         out = self.block2(out)
143 |         out = self.block3(out)
144 |         out = self.relu(self.bn1(out))
145 |         out = F.avg_pool2d(out, 8)
146 |         out = out.view(-1, self.nChannels)
147 | 
148 |         self.train()
149 | 
150 |         return self.fc(out)
151 | 
152 | 
153 | if __name__ == '__main__':
154 |     i = torch.FloatTensor(4, 3, 32, 32)
155 | 
156 |     n = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
157 | 
158 |     # print(n(i).size())
159 | 
160 |     print(count_parameters(n))
161 | 
162 | 
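
The layout arithmetic in `WideResNet.__init__` can be worked through for the `depth=34, widen_factor=10` configuration used in this repository. This is a small sketch of that arithmetic only; the function name `wide_resnet_layout` is hypothetical, and the 32x32 input size is an assumption based on CIFAR-10.

```python
def wide_resnet_layout(depth, widen_factor, input_size=32):
    """Return (blocks per group, channel widths, final feature-map size)."""
    if (depth - 4) % 6 != 0:
        raise ValueError("depth must satisfy (depth - 4) % 6 == 0")
    blocks_per_group = (depth - 4) // 6
    channels = [16, 16 * widen_factor, 32 * widen_factor, 64 * widen_factor]
    # The three groups use strides 1, 2 and 2, so the feature map shrinks by 4x.
    final_size = input_size
    for stride in (1, 2, 2):
        final_size //= stride
    return blocks_per_group, channels, final_size

blocks, channels, final_size = wide_resnet_layout(depth=34, widen_factor=10)
print(blocks, channels, final_size)  # 5 [16, 160, 320, 640] 8
```

The final size of 8 is why `forward` calls `F.avg_pool2d(out, 8)`: on 32x32 inputs this pools each channel down to a single value before the linear classifier.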


--------------------------------------------------------------------------------
/cifar-10/src/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .utils import *


--------------------------------------------------------------------------------
/cifar-10/src/utils/utils.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import json
  3 | import logging
  4 | 
  5 | import numpy as np
  6 | 
  7 | import torch
  8 | 
  9 | class LabelDict():
 10 |     def __init__(self, dataset='cifar-10'):
 11 |         self.dataset = dataset
 12 |         if dataset == 'cifar-10':
 13 |             self.label_dict = {0: 'airplane', 1: 'automobile', 2: 'bird', 3: 'cat', 
 14 |                          4: 'deer',     5: 'dog',        6: 'frog', 7: 'horse',
 15 |                          8: 'ship',     9: 'truck'}
 16 | 
 17 |         self.class_dict = {v: k for k, v in self.label_dict.items()}
 18 | 
 19 |     def label2class(self, label):
 20 |         assert label in self.label_dict, 'the label %d is not in %s' % (label, self.dataset)
 21 |         return self.label_dict[label]
 22 | 
 23 |     def class2label(self, _class):
 24 |         assert isinstance(_class, str)
 25 |         assert _class in self.class_dict, 'the class %s is not in %s' % (_class, self.dataset)
 26 |         return self.class_dict[_class]
 27 | 
 28 | def list2cuda(_list):
 29 |     array = np.array(_list)
 30 |     return numpy2cuda(array)
 31 | 
 32 | def numpy2cuda(array):
 33 |     tensor = torch.from_numpy(array)
 34 | 
 35 |     return tensor2cuda(tensor)
 36 | 
 37 | def tensor2cuda(tensor):
 38 |     if torch.cuda.is_available():
 39 |         tensor = tensor.cuda()
 40 | 
 41 |     return tensor
 42 | 
 43 | def one_hot(ids, n_class):
 44 |     # --------------------- 
 45 |     # author: ke1th
 46 |     # source: CSDN
 47 |     # article: https://blog.csdn.net/u012436149/article/details/77017832
 48 |     """
 49 |     ids: LongTensor, shape [batch_size]
 50 |     returns: FloatTensor, shape [batch_size, n_class]
 51 |     """
 52 | 
 53 |     assert len(ids.shape) == 1, 'the ids should be 1-D'
 54 |     # ids = torch.LongTensor(ids).view(-1,1) 
 55 | 
 56 |     out_tensor = torch.zeros(len(ids), n_class)
 57 | 
 58 |     out_tensor.scatter_(1, ids.cpu().unsqueeze(1), 1.)
 59 | 
 60 |     return out_tensor
 61 |     
 62 | def evaluate(_input, _target, method='mean'):
 63 |     correct = (_input == _target).astype(np.float32)
 64 |     if method == 'mean':
 65 |         return correct.mean()
 66 |     else:
 67 |         return correct.sum()
 68 | 
 69 | 
 70 | def create_logger(save_path='', file_type='', level='debug'):
 71 | 
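 72 |     # map the level name to a logging constant; unknown names fail loudly
 73 |     levels = {'debug': logging.DEBUG, 'info': logging.INFO}
 74 |     if level not in levels:
 75 |         raise ValueError('unsupported log level: %s' % level)
 76 |     _level = levels[level]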
 77 |     logger = logging.getLogger()
 78 |     logger.setLevel(_level)
 79 | 
 80 |     cs = logging.StreamHandler()
 81 |     cs.setLevel(_level)
 82 |     logger.addHandler(cs)
 83 | 
 84 |     if save_path != '':
 85 |         file_name = os.path.join(save_path, file_type + '_log.txt')
 86 |         fh = logging.FileHandler(file_name, mode='w')
 87 |         fh.setLevel(_level)
 88 | 
 89 |         logger.addHandler(fh)
 90 | 
 91 |     return logger
 92 | 
 93 | def makedirs(path):
 94 |     if not os.path.exists(path):
 95 |         os.makedirs(path)
 96 | 
 97 | def load_model(model, file_name):
 98 |     model.load_state_dict(
 99 |             torch.load(file_name, map_location=lambda storage, loc: storage))
100 | 
101 | def save_model(model, file_name):
102 |     torch.save(model.state_dict(), file_name)
103 | 
104 | def count_parameters(model):
105 |     # copy from https://discuss.pytorch.org/t/how-do-i-check-the-number-of-parameters-of-a-model/4325/8
106 |     # baldassarre.fe's reply
107 |     return sum(p.numel() for p in model.parameters() if p.requires_grad)


--------------------------------------------------------------------------------
/cifar-10/src/visualization/__init__.py:
--------------------------------------------------------------------------------
1 | from .vanilla_backprop import VanillaBackprop


--------------------------------------------------------------------------------
/cifar-10/src/visualization/vanilla_backprop.py:
--------------------------------------------------------------------------------
 1 | """
 2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-visualizations
 3 | 
 4 | original author: Utku Ozbulak - github.com/utkuozbulak
 5 | """
 6 | 
 7 | import sys
 8 | sys.path.append("..")
 9 | 
10 | import torch
11 | 
12 | from src.utils import tensor2cuda, one_hot
13 | 
14 | class VanillaBackprop():
15 |     """
16 |         Produces gradients generated with vanilla back propagation from the image
17 |     """
18 |     def __init__(self, model):
19 |         self.model = model
20 | 
21 |     def generate_gradients(self, input_image, target_class):
22 |         # Put model in evaluation mode
23 |         self.model.eval()
24 | 
25 |         x = input_image.clone()
26 | 
27 |         x.requires_grad = True
28 | 
29 |         with torch.enable_grad():
30 |             # Forward; pass _eval=True because the model's forward() switches to train mode otherwise
31 |             model_output = self.model(x, _eval=True)
32 |             # Zero grads
33 |             self.model.zero_grad()
34 |             
35 |             grad_outputs = one_hot(target_class, model_output.shape[1])
36 |             grad_outputs = tensor2cuda(grad_outputs)
37 | 
38 |             grad = torch.autograd.grad(model_output, x, grad_outputs=grad_outputs, 
39 |                         only_inputs=True)[0]
40 | 
41 |             self.model.train()
42 | 
43 |         return grad
44 | 


--------------------------------------------------------------------------------
/cifar-10/train.sh:
--------------------------------------------------------------------------------
1 | python main.py --data_root '.' --affix std
2 | python main.py --data_root '.' -e 0.0157 -p 'linf' --adv_train --affix 'linf'
3 | python main.py --data_root '.' -e 0.314 -p 'l2' --adv_train --affix 'l2'
4 | 


--------------------------------------------------------------------------------
/cifar-10/visualize.py:
--------------------------------------------------------------------------------
  1 | 
  2 | import os
  3 | import torch
  4 | import torchvision as tv
  5 | import numpy as np
  6 | 
  7 | from torch.utils.data import DataLoader
  8 | 
  9 | from src.utils import makedirs, tensor2cuda, load_model, LabelDict
 10 | from src.argument import parser
 11 | from src.visualization import VanillaBackprop
 12 | from src.model.madry_model import WideResNet
 13 | 
 14 | import matplotlib.pyplot as plt 
 15 | 
 16 | img_folder = 'img'
 17 | makedirs(img_folder)
 18 | out_num = 5
 19 | 
 20 | 
 21 | args = parser()
 22 | 
 23 | label_dict = LabelDict(args.dataset)
 24 | 
 25 | te_dataset = tv.datasets.CIFAR10(args.data_root, 
 26 |                                train=False, 
 27 |                                transform=tv.transforms.ToTensor(), 
 28 |                                download=True)
 29 | 
 30 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
 31 | 
 32 | 
 33 | for data, label in te_loader:
 34 | 
 35 |     data, label = tensor2cuda(data), tensor2cuda(label)
 36 | 
 37 | 
 38 |     break
 39 | 
 40 | 
 41 | model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
 42 | 
 43 | load_model(model, args.load_checkpoint)
 44 | 
 45 | if torch.cuda.is_available():
 46 |     model.cuda()
 47 | 
 48 | VBP = VanillaBackprop(model)
 49 | 
 50 | grad = VBP.generate_gradients(data, label)
 51 | 
 52 | grad_flat = grad.view(grad.shape[0], -1)
 53 | mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3)
 54 | std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3)
 55 | 
 56 | mean = mean.repeat(1, 1, data.shape[2], data.shape[3])
 57 | std = std.repeat(1, 1, data.shape[2], data.shape[3])
 58 | 
 59 | grad = torch.max(torch.min(grad, mean+3*std), mean-3*std)
 60 | 
 61 | print(grad.min(), grad.max())
 62 | 
 63 | grad -= grad.min()
 64 | 
 65 | grad /= grad.max()
 66 | 
 67 | grad = grad.cpu().numpy().squeeze()  # (N, 3, 32, 32)
 68 | 
 69 | grad *= 255.0
 70 | 
 71 | label = label.cpu().numpy()
 72 | 
 73 | data = data.cpu().numpy().squeeze()
 74 | 
 75 | data *= 255.0
 76 | 
 77 | out_list = [data, grad]
 78 | 
 79 | types = ['Original', 'Your Model']
 80 | 
 81 | fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num)
 82 | 
 83 | axs = _axs
 84 | 
 85 | for j, _type in enumerate(types):
 86 |     axs[j, 0].set_ylabel(_type)
 87 | 
 88 |     # if j == 0:
 89 |     #     cmap = 'gray'
 90 |     # else:
 91 |     #     cmap = 'seismic'
 92 | 
 93 |     for i in range(out_num):
 94 |         axs[j, i].set_xlabel('%s' % label_dict.label2class(label[i]))
 95 |         img = out_list[j][i]
 96 |         # print(img)
 97 |         img = np.transpose(img, (1, 2, 0))
 98 | 
 99 |         img = img.astype(np.uint8)
100 |         axs[j, i].imshow(img)
101 | 
102 |         axs[j, i].get_xaxis().set_ticks([])
103 |         axs[j, i].get_yaxis().set_ticks([])
104 | 
105 | plt.tight_layout()
106 | plt.savefig(os.path.join(img_folder, 'cifar_grad_%s.jpg' % args.affix))
107 | 
108 | # types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
109 | 
110 | 
111 | # model_checkpoints = ['checkpoint/cifar-10_std/checkpoint_76000.pth',
112 | #                      'checkpoint/cifar-10_linf/checkpoint_76000.pth', 
113 | #                      'checkpoint/cifar-10_l2/checkpoint_76000.pth']
114 | 
115 | 
116 | # out_list = []
117 | 
118 | # for checkpoint in model_checkpoints:
119 | 
120 | #     model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
121 | 
122 | #     load_model(model, checkpoint)
123 | 
124 | #     if torch.cuda.is_available():
125 | #         model.cuda()
126 | 
127 | #     VBP = VanillaBackprop(model)
128 | 
129 | #     grad = VBP.generate_gradients(data, label)
130 | 
131 | #     grad_flat = grad.view(grad.shape[0], -1)
132 | #     mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3)
133 | #     std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3)
134 | 
135 | #     mean = mean.repeat(1, 1, data.shape[2], data.shape[3])
136 | #     std = std.repeat(1, 1, data.shape[2], data.shape[3])
137 | 
138 | #     grad = torch.max(torch.min(grad, mean+3*std), mean-3*std)
139 | 
140 | #     print(grad.min(), grad.max())
141 | 
142 | #     grad -= grad.min()
143 | 
144 | #     grad /= grad.max()
145 | 
146 | #     grad = grad.cpu().numpy().squeeze()  # (N, 28, 28)
147 | 
148 | #     grad *= 255.0
149 | 
150 | #     out_list.append(grad)
151 | 
152 | # data = data.cpu().numpy().squeeze()  # (N, 28, 28)
153 | # data *= 255.0
154 | # label = label.cpu().numpy()
155 | 
156 | # out_list.insert(0, data)
157 | 
158 | # # normalize the grad
159 | # # length = torch.norm(grad, dim=3)
160 | # # length = torch.norm(length, dim=2)
161 | # # length = length.unsqueeze(2).unsqueeze(2)
162 | # # grad /= (length + 1e-5)
163 | 
164 | # out_num = 5
165 | 
166 | # fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num)
167 | 
168 | # axs = _axs
169 | 
170 | 
171 | # for j, _type in enumerate(types):
172 | #     axs[j, 0].set_ylabel(_type)
173 | 
174 | #     # if j == 0:
175 | #     #     cmap = 'gray'
176 | #     # else:
177 | #     #     cmap = 'seismic'
178 | 
179 | #     for i in range(out_num):
180 | 
181 | #         data_id = i + 0
182 | 
183 | #         axs[j, i].set_xlabel('%s' % label_dict.label2class(label[data_id]))
184 |         
185 | #         img = out_list[j][data_id]
186 | #         # print(img)
187 | #         img = np.transpose(img, (1, 2, 0))
188 | 
189 | #         img = img.astype(np.uint8)
190 | #         axs[j, i].imshow(img)
191 | 
192 | #         axs[j, i].get_xaxis().set_ticks([])
193 | #         axs[j, i].get_yaxis().set_ticks([])
194 | 
195 | # plt.tight_layout()
196 | # plt.savefig(os.path.join(img_folder, 'cifar_grad_%s.jpg' % args.affix))
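
The gradient post-processing in this script (clip to mean ± 3 standard deviations, min-max rescale, convert to 8-bit pixels) can be sketched on a plain list of values. This is an illustrative simplification, not the script's code: the function name `grads_to_pixels` is hypothetical, and it treats one flattened gradient map on its own, whereas the script clips per image but rescales with the minimum and maximum of the whole batch.

```python
import statistics

def grads_to_pixels(grads, n_std=3.0):
    """Map raw gradient values to integers in [0, 255] for display."""
    mean = statistics.mean(grads)
    std = statistics.stdev(grads)  # sample std (n - 1), like torch.std's default
    lo, hi = mean - n_std * std, mean + n_std * std
    # Clip outliers so a few extreme values do not wash out the image.
    clipped = [max(min(g, hi), lo) for g in grads]
    low = min(clipped)
    span = max(clipped) - low
    if span == 0:
        # Constant gradient: nothing to rescale.
        return [0 for _ in clipped]
    # Truncate like astype(np.uint8) does for non-negative values.
    return [int(255.0 * ((c - low) / span)) for c in clipped]

print(grads_to_pixels([-1.0, 0.0, 1.0]))  # [0, 127, 255]
```

Clipping before rescaling matters because the min-max step is driven entirely by the extremes; without it, one large gradient value would compress everything else into a narrow band of gray.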


--------------------------------------------------------------------------------
/cifar-10/visualize_attack.py:
--------------------------------------------------------------------------------
  1 | 
  2 | import os
  3 | import torch
  4 | import torchvision as tv
  5 | import numpy as np
  6 | 
  7 | from torch.utils.data import DataLoader
  8 | 
  9 | from src.utils import makedirs, tensor2cuda, load_model, LabelDict
 10 | from src.argument import parser
 11 | from src.visualization import VanillaBackprop
 12 | from src.attack import FastGradientSignUntargeted
 13 | from src.model.madry_model import WideResNet
 14 | 
 15 | import matplotlib.pyplot as plt 
 16 | 
 17 | max_epsilon = 4.7
 18 | 
 19 | perturbation_type = 'l2'
 20 | 
 21 | out_num = 5
 22 | 
 23 | img_folder = 'img'
 24 | makedirs(img_folder)
 25 | 
 26 | args = parser()
 27 | 
 28 | label_dict = LabelDict(args.dataset)
 29 | 
 30 | te_dataset = tv.datasets.CIFAR10(args.data_root, 
 31 |                                train=False, 
 32 |                                transform=tv.transforms.ToTensor(), 
 33 |                                download=True)
 34 | 
 35 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
 36 | 
 37 | 
 38 | for data, label in te_loader:
 39 | 
 40 |     data, label = tensor2cuda(data), tensor2cuda(label)
 41 | 
 42 | 
 43 |     break
 44 | 
 45 | 
 46 | adv_list = []
 47 | pred_list = []
 48 | 
 49 | with torch.no_grad():
 50 | 
 51 |     model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
 52 | 
 53 |     load_model(model, args.load_checkpoint)
 54 | 
 55 |     if torch.cuda.is_available():
 56 |         model.cuda()
 57 | 
 58 |     attack = FastGradientSignUntargeted(model, 
 59 |                                         max_epsilon, 
 60 |                                         args.alpha, 
 61 |                                         min_val=0, 
 62 |                                         max_val=1, 
 63 |                                         max_iters=args.k, 
 64 |                                         _type=perturbation_type)
 65 | 
 66 |    
 67 |     adv_data = attack.perturb(data, label, 'mean', False)
 68 | 
 69 |     output = model(adv_data, _eval=True)
 70 |     pred = torch.max(output, dim=1)[1]
 71 |     adv_list.append(adv_data.cpu().numpy().squeeze() * 255.0)  # (N, 3, 32, 32)
 72 |     pred_list.append(pred.cpu().numpy())
 73 | 
 74 | data = data.cpu().numpy().squeeze()  # (N, 3, 32, 32)
 75 | data *= 255.0
 76 | label = label.cpu().numpy()
 77 | 
 78 | adv_list.insert(0, data)
 79 | 
 80 | pred_list.insert(0, label)
 81 | 
 82 | 
 83 | types = ['Original', 'Your Model']
 84 | 
 85 | fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num)
 86 | 
 87 | axs = _axs
 88 | 
 89 | for j, _type in enumerate(types):
 90 |     axs[j, 0].set_ylabel(_type)
 91 | 
 92 |     for i in range(out_num):
 93 |         axs[j, i].set_xlabel('%s' % label_dict.label2class(pred_list[j][i]))
 94 |         img = adv_list[j][i]
 95 |         # print(img)
 96 |         img = np.transpose(img, (1, 2, 0))
 97 | 
 98 |         img = img.astype(np.uint8)
 99 |         axs[j, i].imshow(img)
100 | 
101 |         axs[j, i].get_xaxis().set_ticks([])
102 |         axs[j, i].get_yaxis().set_ticks([])
103 | 
104 | plt.tight_layout()
105 | plt.savefig(os.path.join(img_folder, 'cifar_large_%s_%s.jpg' % (perturbation_type, args.affix)))
106 | # plt.savefig(os.path.join(img_folder, 'test_%s.jpg' % (args.affix)))
107 | 
108 | 
109 | # types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
110 | 
111 | 
112 | # model_checkpoints = ['checkpoint/cifar-10_std/checkpoint_76000.pth',
113 | #                      'checkpoint/cifar-10_linf/checkpoint_76000.pth', 
114 | #                      'checkpoint/cifar-10_l2/checkpoint_76000.pth']
115 | 
116 | # adv_list = []
117 | # pred_list = []
118 | 
119 | # max_epsilon = 4
120 | 
121 | # perturbation_type = 'l2'
122 | 
123 | # with torch.no_grad():
124 | #     for checkpoint  in model_checkpoints:
125 | 
126 | #         model = WideResNet(depth=34, num_classes=10, widen_factor=10, dropRate=0.0)
127 | 
128 | #         load_model(model, checkpoint)
129 | 
130 | #         if torch.cuda.is_available():
131 | #             model.cuda()
132 | 
133 | #         attack = FastGradientSignUntargeted(model, 
134 | #                                             max_epsilon, 
135 | #                                             args.alpha, 
136 | #                                             min_val=0, 
137 | #                                             max_val=1, 
138 | #                                             max_iters=args.k, 
139 | #                                             _type=perturbation_type)
140 | 
141 |        
142 | #         adv_data = attack.perturb(data, label, 'mean', False)
143 | 
144 | #         output = model(adv_data, _eval=True)
145 | #         pred = torch.max(output, dim=1)[1]
146 | #         adv_list.append(adv_data.cpu().numpy().squeeze() * 255.0)  # (N, 28, 28)
147 | #         pred_list.append(pred.cpu().numpy())
148 | 
149 | # data = data.cpu().numpy().squeeze()  # (N, 28, 28)
150 | # data *= 255.0
151 | # label = label.cpu().numpy()
152 | 
153 | # adv_list.insert(0, data)
154 | 
155 | # pred_list.insert(0, label)
156 | 
157 | # out_num = 5
158 | 
159 | # fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num)
160 | 
161 | # axs = _axs
162 | 
163 | # for j, _type in enumerate(types):
164 | #     axs[j, 0].set_ylabel(_type)
165 | 
166 | #     for i in range(out_num):
167 | #         axs[j, i].set_xlabel('%s' % label_dict.label2class(pred_list[j][i]))
168 | #         img = adv_list[j][i]
169 | #         # print(img)
170 | #         img = np.transpose(img, (1, 2, 0))
171 | 
172 | #         img = img.astype(np.uint8)
173 | #         axs[j, i].imshow(img)
174 | 
175 | #         axs[j, i].get_xaxis().set_ticks([])
176 | #         axs[j, i].get_yaxis().set_ticks([])
177 | 
178 | # plt.tight_layout()
179 | # plt.savefig(os.path.join(img_folder, 'cifar_large_%s_%s.jpg' % (perturbation_type, args.affix)))


--------------------------------------------------------------------------------
/mnist/README.md:
--------------------------------------------------------------------------------
  1 | # Adversarial Training and Visualization on MNIST
  2 | 
  3 | 
  4 | ## Results
  5 | 
  6 | ### Learning Curves
  7 | 
  8 | Epsilon in linf (l2) training is 0.3 (1.5). 
  9 | 
 10 | <table border=0 width="50px" >
 11 | 	<tbody> 
 12 |     <tr>	
 13 |     	<th colspan="2" align="center"> <strong>Standard Training</strong> </th>
 14 | 		<th colspan="2" align="center"> <strong>l_inf Training</strong> </th>
 15 | 		<th colspan="2" align="center"> <strong>l_2 Training</strong></th>
 16 | 	</tr>
 17 | 	<tr>
 18 | 		<th colspan="2" align="center"> <img src="https://github.com/louis2889184/adversarial_training/blob/master/mnist/img/mnist_learning_curve_std.jpg"> </th>
 19 | 		<th colspan="2" align="center"> <img src="https://github.com/louis2889184/adversarial_training/blob/master/mnist/img/mnist_learning_curve_linf.jpg"> </th>
 20 | 		<th colspan="2" align="center"> <img src="https://github.com/louis2889184/adversarial_training/blob/master/mnist/img/mnist_learning_curve_l2.jpg"> </th>
 21 | 	</tr>
 22 | 	<tr>
 23 | 		<th colspan="1" align="center"> <strong>Standard Accuracy</strong> <br/> (train/test) </th>
 24 | 		<th colspan="1" align="center"> <strong>Robustness Accuracy</strong> <br/> (train/test) </th>
 25 | 		<th colspan="1" align="center"> <strong>Standard Accuracy</strong> <br/> (train/test) </th>
 26 | 		<th colspan="1" align="center"> <strong>Robustness Accuracy</strong> <br/> (train/test) </th>
 27 | 		<th colspan="1" align="center"> <strong>Standard Accuracy</strong> <br/> (train/test) </th>
 28 | 		<th colspan="1" align="center"> <strong>Robustness Accuracy</strong> <br/> (train/test) </th>
 29 | 	</tr>
 30 | 	<tr>
 31 | 		<th colspan="1" align="center"> 100.00/99.32 </th>
 32 | 		<th colspan="1" align="center"> 0.00/0.61 </th>
 33 | 		<th colspan="1" align="center"> 100.00/98.96 </th>
 34 | 		<th colspan="1" align="center"> 96.88/95.16 </th>
 35 | 		<th colspan="1" align="center"> 100.00/99.41 </th>
 36 | 		<th colspan="1" align="center"> 100.00/97.48 </th>
 37 | 	</tr>
 38 | 	</tbody>
 39 | </table>
 40 | 
 41 | Note that in testing mode, the adversarial examples are generated against the model's most confident prediction rather than the ground-truth label. When that initial prediction is wrong, the attack moves the input away from the wrong class, so the testing robustness is sometimes higher than the training robustness.
 42 | 
 43 | ### Visualization of Gradient with Respect to Input
 44 | 
 45 | ![visualization](https://github.com/louis2889184/adversarial_training/blob/master/mnist/img/mnist_grad_default.jpg)
 46 | 
 47 | ### Adversarial Examples with Large Epsilon
 48 | 
 49 | The maximum epsilon is set to 4 (l2 norm) in this part.
 50 | 
 51 | ![large](https://github.com/louis2889184/adversarial_training/blob/master/mnist/img/mnist_large_l2_.jpg)
 52 | 
 53 | 
 54 | ## Requirements:
 55 | ```
 56 | python >= 3.5
 57 | torch == 1.0
 58 | torchvision == 0.2.1
 59 | numpy >= 1.16.1
 60 | matplotlib >= 3.0.2
 61 | ```
 62 | 
 63 | ## Execution
 64 | 
 65 | ### Training
 66 | 
 67 | Standard training: <br/>
 68 | 
 69 | ```
 70 | python main.py --data_root [data directory]
 71 | ```
 72 | 
 73 | linf training: <br/>
 74 | 
 75 | ```
 76 | python main.py --data_root [data directory] -e 0.3 -p 'linf' --adv_train
 77 | ```
 78 | 
 79 | l2 training: <br/>
 80 | 
 81 | ```
 82 | python main.py --data_root [data directory] -e 1.5 -p 'l2' --adv_train
 83 | ```
 84 | 
 85 | ### Testing
 86 | 
 87 | Change the settings if you want to do linf testing.
 88 | ```
 89 | python main.py --todo test --data_root [data directory] -e 0.314 -p 'l2' --load_checkpoint [your_model.pth]
 90 | ```
 91 | 
 92 | ### Visualization
 93 | 
 94 | Change the settings in `visualize.py` and `visualize_attack.py` if you want to do linf visualization.
 95 | 
 96 | Visualize the gradient with respect to the input: <br/>
 97 | 
 98 | ```
 99 | python visualize.py --load_checkpoint [your_model.pth]
100 | ```
101 | 
102 | Visualize adversarial examples with a larger epsilon: <br/>
103 | 
104 | ```
105 | python visualize_attack.py --load_checkpoint [your_model.pth]
106 | ```
107 | 
108 | 
109 | ## Training Time
110 | 
111 | Standard training: 0.64 s / 100 iterations <br/>
112 | Adversarial training: 16 s / 100 iterations <br/> <br/>
113 | 
114 | These timings were measured with a batch size of 64 on an NVIDIA GeForce GTX 1080.
115 | 
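
To put the per-iteration timings above in per-epoch terms, here is a rough back-of-the-envelope sketch. It assumes the standard MNIST training split of 60,000 images and the batch size of 64 quoted above; the helper name `seconds_per_epoch` is made up for this illustration.

```python
import math

TRAIN_IMAGES = 60000   # size of the standard MNIST training split
BATCH_SIZE = 64        # batch size used for the timings above


def seconds_per_epoch(seconds_per_100_iters):
    """Convert a 's / 100 iterations' timing into seconds per epoch."""
    iters_per_epoch = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
    return iters_per_epoch * seconds_per_100_iters / 100.0


std_epoch = seconds_per_epoch(0.64)  # standard training
adv_epoch = seconds_per_epoch(16.0)  # adversarial training
slowdown = 16.0 / 0.64               # relative cost per iteration

print('iterations per epoch: %d' % math.ceil(TRAIN_IMAGES / BATCH_SIZE))
print('standard: %.1f s / epoch, adversarial: %.1f s / epoch' % (std_epoch, adv_epoch))
print('adversarial training is about %.0fx slower' % slowdown)
```

The gap is plausibly explained by the attack itself: each adversarial batch runs several gradient steps before the actual training step (`--k` defaults to 40 in `argument.py`).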


--------------------------------------------------------------------------------
/mnist/img/mnist_grad_default.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_grad_default.jpg


--------------------------------------------------------------------------------
/mnist/img/mnist_large_l2_.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_large_l2_.jpg


--------------------------------------------------------------------------------
/mnist/img/mnist_learning_curve_l2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_l2.jpg


--------------------------------------------------------------------------------
/mnist/img/mnist_learning_curve_linf.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_linf.jpg


--------------------------------------------------------------------------------
/mnist/img/mnist_learning_curve_std.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ylsung/pytorch-adversarial-training/1103fe300dc08f740b6870aebdd40a87d5690a45/mnist/img/mnist_learning_curve_std.jpg


--------------------------------------------------------------------------------
/mnist/main.py:
--------------------------------------------------------------------------------
  1 | import os
  2 | import torch
  3 | import torch.nn as nn
  4 | import torch.nn.functional as F
  5 | from torch.utils.data import DataLoader
  6 | 
  7 | import torchvision as tv
  8 | 
  9 | from time import time
 10 | from model import Model
 11 | from attack import FastGradientSignUntargeted
 12 | from utils import makedirs, create_logger, tensor2cuda, numpy2cuda, evaluate, save_model
 13 | 
 14 | from argument import parser, print_args
 15 | 
 16 | class Trainer():
 17 |     def __init__(self, args, logger, attack):
 18 |         self.args = args
 19 |         self.logger = logger
 20 |         self.attack = attack
 21 | 
 22 |     def standard_train(self, model, tr_loader, va_loader=None):
 23 |         self.train(model, tr_loader, va_loader, False)
 24 | 
 25 |     def adversarial_train(self, model, tr_loader, va_loader=None):
 26 |         self.train(model, tr_loader, va_loader, True)
 27 | 
 28 |     def train(self, model, tr_loader, va_loader=None, adv_train=False):
 29 |         args = self.args
 30 |         logger = self.logger
 31 | 
 32 |         opt = torch.optim.Adam(model.parameters(), args.learning_rate)
 33 | 
 34 |         _iter = 0
 35 | 
 36 |         begin_time = time()
 37 | 
 38 |         for epoch in range(1, args.max_epoch+1):
 39 |             for data, label in tr_loader:
 40 |                 data, label = tensor2cuda(data), tensor2cuda(label)
 41 | 
 42 |                 if adv_train:
 43 |                     # When training, the adversarial example is created from a random 
 44 |                     # close point to the original data point. If in evaluation mode, 
 45 |                     # just start from the original data point.
 46 |                     adv_data = self.attack.perturb(data, label, 'mean', True)
 47 |                     output = model(adv_data, _eval=False)
 48 |                 else:
 49 |                     output = model(data, _eval=False)
 50 | 
 51 |                 loss = F.cross_entropy(output, label)
 52 | 
 53 |                 opt.zero_grad()
 54 |                 loss.backward()
 55 |                 opt.step()
 56 | 
 57 |                 if _iter % args.n_eval_step == 0:
 58 | 
 59 |                     if adv_train:
 60 |                         with torch.no_grad():
 61 |                             stand_output = model(data, _eval=True)
 62 |                         pred = torch.max(stand_output, dim=1)[1]
 63 | 
 64 |                         # print(pred)
 65 |                         std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 66 | 
 67 |                         pred = torch.max(output, dim=1)[1]
 68 |                         # print(pred)
 69 |                         adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 70 | 
 71 |                     else:
 72 |                         adv_data = self.attack.perturb(data, label, 'mean', False)
 73 | 
 74 |                         with torch.no_grad():
 75 |                             adv_output = model(adv_data, _eval=True)
 76 |                         pred = torch.max(adv_output, dim=1)[1]
 77 |                         # print(label)
 78 |                         # print(pred)
 79 |                         adv_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 80 | 
 81 |                         pred = torch.max(output, dim=1)[1]
 82 |                         # print(pred)
 83 |                         std_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy()) * 100
 84 | 
 85 |                     # only calculating the training time
 86 |                     logger.info('epoch: %d, iter: %d, spent %.2f s, tr_loss: %.3f' % (
 87 |                         epoch, _iter, time() - begin_time, loss.item()))
 88 | 
 89 |                     logger.info('standard acc: %.3f %%, robustness acc: %.3f %%' % (
 90 |                         std_acc, adv_acc))
 91 | 
 92 |                     if va_loader is not None:
 93 |                         va_acc, va_adv_acc = self.test(model, va_loader, True)
 94 |                         va_acc, va_adv_acc = va_acc * 100.0, va_adv_acc * 100.0
 95 | 
 96 |                         logger.info('\n' + '='*30 + ' evaluation ' + '='*30)
 97 |                         logger.info('test acc: %.3f %%, test adv acc: %.3f %%' % (
 98 |                             va_acc, va_adv_acc))
 99 |                         logger.info('='*28 + ' end of evaluation ' + '='*28 + '\n')
100 | 
101 |                     begin_time = time()
102 | 
103 |                 if _iter % args.n_store_image_step == 0:
104 |                     tv.utils.save_image(torch.cat([data.cpu(), adv_data.cpu()], dim=0), 
105 |                                         os.path.join(args.log_folder, 'images_%d.jpg' % _iter), 
106 |                                         nrow=16)
107 |                     
108 | 
109 |                 if _iter % args.n_checkpoint_step == 0:
110 |                     file_name = os.path.join(args.model_folder, 'checkpoint_%d.pth' % _iter)
111 |                     save_model(model, file_name)
112 | 
113 |                 _iter += 1
114 | 
115 |     def test(self, model, loader, adv_test=False):
116 |         # if adv_test is False, adv_acc is returned as -1 
117 | 
118 |         total_acc = 0.0
119 |         num = 0
120 |         total_adv_acc = 0.0
121 | 
122 |         with torch.no_grad():
123 |             for data, label in loader:
124 |                 data, label = tensor2cuda(data), tensor2cuda(label)
125 | 
126 |                 output = model(data, _eval=True)
127 | 
128 |                 pred = torch.max(output, dim=1)[1]
129 |                 te_acc = evaluate(pred.cpu().numpy(), label.cpu().numpy(), 'sum')
130 |                 
131 |                 total_acc += te_acc
132 |                 num += output.shape[0]
133 | 
134 |                 if adv_test:
135 |                     # use predicted label as target label
136 |                     # with torch.enable_grad():
137 |                     adv_data = self.attack.perturb(data, pred, 'mean', False)
138 | 
139 |                     adv_output = model(adv_data, _eval=True)
140 | 
141 |                     adv_pred = torch.max(adv_output, dim=1)[1]
142 |                     adv_acc = evaluate(adv_pred.cpu().numpy(), label.cpu().numpy(), 'sum')
143 |                     total_adv_acc += adv_acc
144 |                 else:
145 |                     total_adv_acc = -num
146 | 
147 |         return total_acc / num , total_adv_acc / num
148 | 
149 | def main(args):
150 | 
151 |     save_folder = '%s_%s' % (args.dataset, args.affix)
152 | 
153 |     log_folder = os.path.join(args.log_root, save_folder)
154 |     model_folder = os.path.join(args.model_root, save_folder)
155 | 
156 |     makedirs(log_folder)
157 |     makedirs(model_folder)
158 | 
159 |     setattr(args, 'log_folder', log_folder)
160 |     setattr(args, 'model_folder', model_folder)
161 | 
162 |     logger = create_logger(log_folder, args.todo, 'info')
163 | 
164 |     print_args(args, logger)
165 | 
166 |     model = Model(i_c=1, n_c=10)
167 | 
168 |     attack = FastGradientSignUntargeted(model, 
169 |                                         args.epsilon, 
170 |                                         args.alpha, 
171 |                                         min_val=0, 
172 |                                         max_val=1, 
173 |                                         max_iters=args.k, 
174 |                                         _type=args.perturbation_type)
175 | 
176 |     if torch.cuda.is_available():
177 |         model.cuda()
178 | 
179 |     trainer = Trainer(args, logger, attack)
180 | 
181 |     if args.todo == 'train':
182 |         tr_dataset = tv.datasets.MNIST(args.data_root, 
183 |                                        train=True, 
184 |                                        transform=tv.transforms.ToTensor(), 
185 |                                        download=True)
186 | 
187 |         tr_loader = DataLoader(tr_dataset, batch_size=args.batch_size, shuffle=True, num_workers=4)
188 | 
189 |         # evaluation during training
190 |         te_dataset = tv.datasets.MNIST(args.data_root, 
191 |                                        train=False, 
192 |                                        transform=tv.transforms.ToTensor(), 
193 |                                        download=True)
194 | 
195 |         te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
196 | 
197 |         trainer.train(model, tr_loader, te_loader, args.adv_train)
198 |     elif args.todo == 'test':
199 |         pass
200 |     else:
201 |         raise NotImplementedError
202 |     
203 | 
204 | 
205 | 
206 | if __name__ == '__main__':
207 |     args = parser()
208 | 
209 |     os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
210 | 
211 |     main(args)


--------------------------------------------------------------------------------
/mnist/src/argument.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | 
 3 | def parser():
 4 |     parser = argparse.ArgumentParser(description='Adversarial Training on MNIST')
 5 |     parser.add_argument('--todo', choices=['train', 'valid', 'test', 'visualize'], default='train',
 6 |         help='which task to run: train | valid | test | visualize')
 7 |     parser.add_argument('--dataset', default='mnist', help='use what dataset')
 8 |     parser.add_argument('--data_root', default='/home/yilin/Data', 
 9 |         help='the directory to save the dataset')
10 |     parser.add_argument('--log_root', default='log', 
11 |         help='the directory to save the logs or other information (e.g. images)')
12 |     parser.add_argument('--model_root', default='checkpoint', help='the directory to save the models')
13 |     parser.add_argument('--load_checkpoint', default='./model/default/model.pth')
14 |     parser.add_argument('--affix', default='', help='the affix for the save folder')
15 | 
16 |     # parameters for generating adversarial examples
17 |     parser.add_argument('--epsilon', '-e', type=float, default=0.3, 
18 |         help='maximum perturbation of adversaries')
19 |     parser.add_argument('--alpha', '-a', type=float, default=0.01, 
20 |         help='movement multiplier per iteration when generating adversarial examples')
21 |     parser.add_argument('--k', '-k', type=int, default=40, 
22 |         help='maximum iteration when generating adversarial examples')
23 | 
24 | 
25 | 
26 |     parser.add_argument('--batch_size', '-b', type=int, default=64, help='batch size')
27 |     parser.add_argument('--max_epoch', '-m_e', type=int, default=60, 
28 |         help='the maximum number of training epochs')
29 |     parser.add_argument('--learning_rate', '-lr', type=float, default=1e-4, help='learning rate')
30 | 
31 |     parser.add_argument('--gpu', '-g', default='0', help='which gpu to use')
32 |     parser.add_argument('--n_eval_step', type=int, default=100, 
33 |         help='number of iterations between evaluations')
34 |     parser.add_argument('--n_checkpoint_step', type=int, default=2000, 
35 |         help='number of iterations between checkpoint saves')
36 |     parser.add_argument('--n_store_image_step', type=int, default=2000, 
37 |         help='number of iterations between saving adversarial example images')
38 |     parser.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf', 
39 |         help='the type of the perturbation (linf or l2)')
40 |     
41 |     parser.add_argument('--adv_train', action='store_true')
42 | 
43 |     return parser.parse_args()
44 | 
45 | def print_args(args, logger=None):
46 |     for k, v in vars(args).items():
47 |         if logger is not None:
48 |             logger.info('{:<16} : {}'.format(k, v))
49 |         else:
50 |             print('{:<16} : {}'.format(k, v))
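
As a quick illustration of how the README commands map onto these options, the sketch below rebuilds only three of the flags in a stand-alone parser and parses the l2 training command line. The function name `build_demo_parser` is made up here, and the real `parser()` above defines many more options.

```python
import argparse


def build_demo_parser():
    # Reduced, illustrative copy of three flags defined in parser() above.
    p = argparse.ArgumentParser(description='demo of the attack-related flags')
    p.add_argument('--epsilon', '-e', type=float, default=0.3)
    p.add_argument('--perturbation_type', '-p', choices=['linf', 'l2'], default='linf')
    p.add_argument('--adv_train', action='store_true')
    return p


# Equivalent to: python main.py -e 1.5 -p 'l2' --adv_train
demo_args = build_demo_parser().parse_args(['-e', '1.5', '-p', 'l2', '--adv_train'])
print(demo_args.epsilon, demo_args.perturbation_type, demo_args.adv_train)

# With no flags, the defaults describe standard (non-adversarial) training.
default_args = build_demo_parser().parse_args([])
print(default_args.epsilon, default_args.perturbation_type, default_args.adv_train)
```

Because `--adv_train` is a `store_true` flag, omitting it leaves `args.adv_train` as `False`, which is how `main.py` selects standard training.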


--------------------------------------------------------------------------------
/mnist/src/attack/__init__.py:
--------------------------------------------------------------------------------
1 | from .fast_gradient_sign_untargeted import FastGradientSignUntargeted


--------------------------------------------------------------------------------
/mnist/src/attack/fast_gradient_sign_untargeted.py:
--------------------------------------------------------------------------------
  1 | """
  2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks
  3 | 
  4 | original author: Utku Ozbulak - github.com/utkuozbulak
  5 | """
  6 | import sys
  7 | sys.path.append("..")
  8 | 
  9 | import os
 10 | import numpy as np
 11 | 
 12 | import torch
 13 | from torch import nn
 14 | import torch.nn.functional as F
 15 | 
 16 | from utils import tensor2cuda
 17 | 
 18 | def project(x, original_x, epsilon, _type='linf'):
 19 | 
 20 |     if _type == 'linf':
 21 |         max_x = original_x + epsilon
 22 |         min_x = original_x - epsilon
 23 | 
 24 |         x = torch.max(torch.min(x, max_x), min_x)
 25 | 
 26 |     elif _type == 'l2':
 27 |         dist = (x - original_x)
 28 | 
 29 |         dist = dist.view(x.shape[0], -1)
 30 | 
 31 |         dist_norm = torch.norm(dist, dim=1, keepdim=True)
 32 | 
 33 |         mask = (dist_norm > epsilon).unsqueeze(2).unsqueeze(3)
 34 | 
 35 |         # dist = F.normalize(dist, p=2, dim=1)
 36 | 
 37 |         dist = dist / dist_norm
 38 | 
 39 |         dist *= epsilon
 40 | 
 41 |         dist = dist.view(x.shape)
 42 | 
 43 |         x = (original_x + dist) * mask.float() + x * (1 - mask.float())
 44 | 
 45 |     else:
 46 |         raise NotImplementedError
 47 | 
 48 |     return x
 49 | 
 50 | class FastGradientSignUntargeted():
 51 |     """
 52 |         Fast gradient sign untargeted adversarial attack, minimizes the initial class activation
 53 |         with iterative grad sign updates
 54 |     """
 55 |     def __init__(self, model, epsilon, alpha, min_val, max_val, max_iters, _type='linf'):
 56 |         self.model = model
 57 |         # self.model.eval()
 58 | 
 59 |         # Maximum perturbation
 60 |         self.epsilon = epsilon
 61 |         # Movement multiplier per iteration
 62 |         self.alpha = alpha
 63 |         # Minimum value of the pixels
 64 |         self.min_val = min_val
 65 |         # Maximum value of the pixels
 66 |         self.max_val = max_val
 67 |         # Maximum number of iterations used to generate adversaries
 68 |         self.max_iters = max_iters
 69 |         # The norm type of the perturbation ('linf' or 'l2')
 70 |         self._type = _type
 71 |         
 72 |     def perturb(self, original_images, labels, reduction4loss='mean', random_start=False):
 73 |         # original_images: values are within self.min_val and self.max_val
 74 | 
 75 |         # The adversaries created from random close points to the original data
 76 |         if random_start:
 77 |             rand_perturb = torch.FloatTensor(original_images.shape).uniform_(
 78 |                 -self.epsilon, self.epsilon)
 79 |             rand_perturb = tensor2cuda(rand_perturb)
 80 |             x = original_images + rand_perturb
 81 |             x.clamp_(self.min_val, self.max_val)
 82 |         else:
 83 |             x = original_images.clone()
 84 | 
 85 |         x.requires_grad = True 
 86 | 
 87 |         # max_x = original_images + self.epsilon
 88 |         # min_x = original_images - self.epsilon
 89 | 
 90 |         with torch.enable_grad():
 91 |             for _iter in range(self.max_iters):
 92 |                 outputs = self.model(x, _eval=True)
 93 | 
 94 |                 loss = F.cross_entropy(outputs, labels, reduction=reduction4loss)
 95 | 
 96 |                 if reduction4loss == 'none':
 97 |                     grad_outputs = tensor2cuda(torch.ones(loss.shape))
 98 |                     
 99 |                 else:
100 |                     grad_outputs = None
101 | 
102 |                 grads = torch.autograd.grad(loss, x, grad_outputs=grad_outputs, 
103 |                         only_inputs=True)[0]
104 | 
105 |                 x.data += self.alpha * torch.sign(grads.data) 
106 | 
107 |                 # the adversaries must stay within the epsilon-ball around the 
108 |                 # original images, as required by the l_infinity / l2 constraint
109 |                 x = project(x, original_images, self.epsilon, self._type)
110 |                 # the adversaries' values should be valid pixel values
111 |                 x.clamp_(self.min_val, self.max_val)
112 | 
113 |         return x
114 | 
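
The projection step can be easier to follow on a single flattened image. The following is a minimal pure-Python sketch of the same two cases, not the batched tensor code above; the names `project_linf` and `project_l2` are hypothetical and operate on plain lists of floats.

```python
import math


def project_linf(x, original_x, epsilon):
    # Clip every coordinate into [original - epsilon, original + epsilon].
    return [max(min(xi, oi + epsilon), oi - epsilon)
            for xi, oi in zip(x, original_x)]


def project_l2(x, original_x, epsilon):
    # Rescale the perturbation onto the epsilon-ball only if it lies outside.
    delta = [xi - oi for xi, oi in zip(x, original_x)]
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= epsilon:
        return list(x)
    scale = epsilon / norm
    return [oi + d * scale for oi, d in zip(original_x, delta)]


original = [0.0, 0.0]
moved = [3.0, 4.0]  # perturbation with l2 norm 5

print(project_linf(moved, original, 1.0))          # each coordinate clipped
print(project_l2(moved, original, 1.0))            # rescaled to norm 1
print(project_l2([0.3, 0.4], original, 1.0))       # already inside, unchanged
```

In this sketch the early return for points already inside the ball also avoids dividing by a zero norm, a case the tensor version handles through its mask instead.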


--------------------------------------------------------------------------------
/mnist/src/model/__init__.py:
--------------------------------------------------------------------------------
1 | from .model import *


--------------------------------------------------------------------------------
/mnist/src/model/model.py:
--------------------------------------------------------------------------------
 1 | import torch
 2 | import torch.nn as nn
 3 | 
 4 | class Expression(nn.Module):
 5 |     def __init__(self, func):
 6 |         super(Expression, self).__init__()
 7 |         self.func = func
 8 |     
 9 |     def forward(self, input):
10 |         return self.func(input)
11 | 
12 | class Model(nn.Module):
13 |     def __init__(self, i_c=1, n_c=10):
14 |         super(Model, self).__init__()
15 | 
16 |         self.conv1 = nn.Conv2d(i_c, 32, 5, stride=1, padding=2, bias=True)
17 |         self.pool1 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
18 | 
19 |         self.conv2 = nn.Conv2d(32, 64, 5, stride=1, padding=2, bias=True)
20 |         self.pool2 = nn.MaxPool2d((2, 2), stride=(2, 2), padding=0)
21 | 
22 | 
23 |         self.flatten = Expression(lambda tensor: tensor.view(tensor.shape[0], -1))
24 |         self.fc1 = nn.Linear(7 * 7 * 64, 1024, bias=True)
25 |         self.fc2 = nn.Linear(1024, n_c)
26 | 
27 | 
28 |     def forward(self, x_i, _eval=False):
29 | 
30 |         if _eval:
31 |             # switch to eval mode
32 |             self.eval()
33 |         else:
34 |             self.train()
35 |             
36 |         x_o = self.conv1(x_i)
37 |         x_o = torch.relu(x_o)
38 |         x_o = self.pool1(x_o)
39 | 
40 |         x_o = self.conv2(x_o)
41 |         x_o = torch.relu(x_o)
42 |         x_o = self.pool2(x_o)
43 | 
44 |         x_o = self.flatten(x_o)
45 | 
46 |         x_o = torch.relu(self.fc1(x_o))
47 | 
48 |         self.train()
49 | 
50 |         return self.fc2(x_o)
51 | 
52 | 
53 | if __name__ == '__main__':
54 |     i = torch.FloatTensor(4, 1, 28, 28)
55 | 
56 |     n = Model()
57 | 
58 |     print(n(i).size())
59 | 
60 | 


--------------------------------------------------------------------------------
/mnist/src/read_log.py:
--------------------------------------------------------------------------------
 1 | import re
 2 | import os
 3 | from utils import makedirs
 4 | import matplotlib.pyplot as plt
 5 | 
 6 | file_name = '../log/mnist_l2_adv/train_log.txt'
 7 | affix = 'l2'
 8 | title = r'$l_2$ Training'
 9 | 
10 | img_folder = '../img'
11 | makedirs(img_folder)
12 | 
13 | train_iter_list = []
14 | train_acc_list = []
15 | train_rob_list = []
16 | 
17 | test_iter_list = []
18 | test_acc_list = []
19 | test_rob_list = []
20 | 
21 | with open(file_name, 'r') as f:
22 |     lines = f.readlines()
23 | 
24 |     for line in lines:
25 |         splits = re.split('[, =%:\n]+', line)
26 | 
27 |         if splits[0] == 'epoch':
28 |             _iter = int(splits[3])
29 |             train_iter_list.append(_iter)
30 | 
31 |         if splits[0] == 'standard':
32 |             train_acc_list.append(float(splits[2]))
33 |             train_rob_list.append(float(splits[5]))
34 | 
35 |         if splits[0] == 'test':
36 |             test_iter_list.append(_iter)
37 |             test_acc_list.append(float(splits[2]))
38 |             test_rob_list.append(float(splits[6]))
39 | 
40 | 
41 | a_1 = plt.plot(train_iter_list, train_acc_list , color='r', label='train standard accuracy')[0]
42 | a_2 = plt.plot(test_iter_list, test_acc_list , color='r', linestyle='--', label='test standard accuracy')[0]
43 | 
44 | b_1 = plt.plot(train_iter_list, train_rob_list , color='b', label='train robust accuracy')[0]
45 | b_2 = plt.plot(test_iter_list, test_rob_list , color='b', linestyle='--', label='test robust accuracy')[0]
46 | 
47 | plt.title(title)
48 | 
49 | plt.legend(handles=[a_1, a_2, b_1, b_2])
50 | 
51 | plt.savefig(os.path.join(img_folder, 'mnist_learning_curve_%s.jpg' % affix))
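
The index choices above (`splits[3]`, `splits[5]`, `splits[6]`) follow from how `re.split` tokenizes each log line. The sketch below applies the same pattern to three sample lines written in the format that `Trainer.train` in `main.py` logs; the numeric values in the samples are invented for illustration.

```python
import re

PATTERN = '[, =%:\n]+'  # same separator set as in the script above

samples = [
    'epoch: 1, iter: 100, spent 0.64 s, tr_loss: 0.123\n',
    'standard acc: 98.438 %, robustness acc: 90.625 %\n',
    'test acc: 99.000 %, test adv acc: 95.000 %\n',
]

tokens = [re.split(PATTERN, line) for line in samples]

iteration = int(tokens[0][3])  # the value that follows 'iter'
train_acc, train_rob = float(tokens[1][2]), float(tokens[1][5])
test_acc, test_rob = float(tokens[2][2]), float(tokens[2][6])

for t in tokens:
    print(t)
print(iteration, train_acc, train_rob, test_acc, test_rob)
```

The test line needs index 6 rather than 5 because `test adv acc` contributes one more word than `robustness acc` before the number.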


--------------------------------------------------------------------------------
/mnist/src/utils/__init__.py:
--------------------------------------------------------------------------------
1 | from .utils import *


--------------------------------------------------------------------------------
/mnist/src/utils/utils.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | import json
 3 | import logging
 4 | 
 5 | import numpy as np
 6 | 
 7 | import torch
 8 | 
 9 | 
10 | def list2cuda(_list):
11 |     array = np.array(_list)
12 |     return numpy2cuda(array)
13 | 
14 | def numpy2cuda(array):
15 |     tensor = torch.from_numpy(array)
16 | 
17 |     return tensor2cuda(tensor)
18 | 
19 | def tensor2cuda(tensor):
20 |     if torch.cuda.is_available():
21 |         tensor = tensor.cuda()
22 | 
23 |     return tensor
24 | 
25 | def one_hot(ids, n_class):
26 |     # --------------------- 
27 |     # author: ke1th 
28 |     # source: CSDN 
29 |     # article: https://blog.csdn.net/u012436149/article/details/77017832 
30 |     """
31 |     ids: (list, ndarray) shape:[batch_size]
32 |     out_tensor:FloatTensor shape:[batch_size, depth]
33 |     """
34 | 
35 |     assert len(ids.shape) == 1, 'the ids should be 1-D'
36 |     # ids = torch.LongTensor(ids).view(-1,1) 
37 | 
38 |     out_tensor = torch.zeros(len(ids), n_class)
39 | 
40 |     out_tensor.scatter_(1, ids.cpu().unsqueeze(1), 1.)
41 | 
42 |     return out_tensor
43 |     
44 | def evaluate(_input, _target, method='mean'):
45 |     correct = (_input == _target).astype(np.float32)
46 |     if method == 'mean':
47 |         return correct.mean()
48 |     else:
49 |         return correct.sum()
50 | 
51 | 
52 | def create_logger(save_path='', file_type='', level='debug'):
53 | 
54 |     # Map the level name to a logging constant; an unknown name raises KeyError.
55 |     levels = {'debug': logging.DEBUG, 'info': logging.INFO}
56 |     _level = levels[level]
57 | 
58 | 
59 |     logger = logging.getLogger()
60 |     logger.setLevel(_level)
61 | 
62 |     cs = logging.StreamHandler()
63 |     cs.setLevel(_level)
64 |     logger.addHandler(cs)
65 | 
66 |     if save_path != '':
67 |         file_name = os.path.join(save_path, file_type + '_log.txt')
68 |         fh = logging.FileHandler(file_name, mode='w')
69 |         fh.setLevel(_level)
70 | 
71 |         logger.addHandler(fh)
72 | 
73 |     return logger
74 | 
75 | def makedirs(path):
76 |     # exist_ok=True makes this a no-op when the directory already exists
77 |     os.makedirs(path, exist_ok=True)
78 | 
79 | def load_model(model, file_name):
80 |     model.load_state_dict(
81 |             torch.load(file_name, map_location=lambda storage, loc: storage))
82 | 
83 | def save_model(model, file_name):
84 |     torch.save(model.state_dict(), file_name)


--------------------------------------------------------------------------------
/mnist/src/visualization/__init__.py:
--------------------------------------------------------------------------------
1 | from .vanilla_backprop import VanillaBackprop


--------------------------------------------------------------------------------
/mnist/src/visualization/vanilla_backprop.py:
--------------------------------------------------------------------------------
 1 | """
 2 | this code is modified from https://github.com/utkuozbulak/pytorch-cnn-adversarial-attacks
 3 | 
 4 | original author: Utku Ozbulak - github.com/utkuozbulak
 5 | """
 6 | 
 7 | import sys
 8 | sys.path.append("..")
 9 | 
10 | import torch
11 | 
12 | from utils import tensor2cuda, one_hot
13 | 
14 | class VanillaBackprop():
15 |     """
16 |         Produces gradients generated with vanilla back propagation from the image
17 |     """
18 |     def __init__(self, model):
19 |         self.model = model
20 | 
21 |     def generate_gradients(self, input_image, target_class):
22 |         # Put model in evaluation mode
23 |         self.model.eval()
24 | 
25 |         x = input_image.clone()
26 | 
27 |         x.requires_grad = True
28 | 
29 |         # Forward
30 |         model_output = self.model(x)
31 |         # Zero grads
32 |         self.model.zero_grad()
33 |         
34 |         grad_outputs = one_hot(target_class, model_output.shape[1])
35 |         grad_outputs = tensor2cuda(grad_outputs)
36 | 
37 |         grad = torch.autograd.grad(model_output, x, grad_outputs=grad_outputs, 
38 |                     only_inputs=True)[0]
39 | 
40 |         self.model.train()
41 | 
42 |         return grad
43 | 


--------------------------------------------------------------------------------
/mnist/visualize.py:
--------------------------------------------------------------------------------
  1 | 
  2 | import os
  3 | import torch
  4 | import torchvision as tv
  5 | import numpy as np
  6 | 
  7 | from torch.utils.data import DataLoader
  8 | 
  9 | from utils import makedirs, tensor2cuda, load_model
 10 | from argument import parser
 11 | from visualization import VanillaBackprop
 12 | from model import Model
 13 | 
 14 | import matplotlib.pyplot as plt 
 15 | 
 16 | img_folder = '../img'
 17 | makedirs(img_folder)
 18 | 
 19 | args = parser()
 20 | 
 21 | 
 22 | te_dataset = tv.datasets.MNIST(args.data_root, 
 23 |                                train=False, 
 24 |                                transform=tv.transforms.ToTensor(), 
 25 |                                download=True)
 26 | 
 27 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
 28 | 
 29 | 
 30 | for data, label in te_loader:
 31 | 
 32 |     data, label = tensor2cuda(data), tensor2cuda(label)
 33 | 
 34 | 
 35 |     break
 36 | 
 37 | types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
 38 | 
 39 | 
 40 | model_checkpoints = ['../checkpoint/mnist_std_train/checkpoint_56000.pth',
 41 |                      '../checkpoint/mnist_adv_train/checkpoint_56000.pth', 
 42 |                      '../checkpoint/mnist_l2_adv/checkpoint_56000.pth']
 43 | 
 44 | 
 45 | out_list = []
 46 | 
 47 | for checkpoint in model_checkpoints:
 48 | 
 49 |     model = Model(i_c=1, n_c=10)
 50 | 
 51 |     load_model(model, checkpoint)
 52 | 
 53 |     if torch.cuda.is_available():
 54 |         model.cuda()
 55 | 
 56 |     VBP = VanillaBackprop(model)
 57 | 
 58 |     grad = VBP.generate_gradients(data, label)
 59 | 
 60 |     grad_flat = grad.view(grad.shape[0], -1)
 61 |     mean = grad_flat.mean(1, keepdim=True).unsqueeze(2).unsqueeze(3)
 62 |     std = grad_flat.std(1, keepdim=True).unsqueeze(2).unsqueeze(3)
 63 | 
 64 |     mean = mean.repeat(1, 1, data.shape[2], data.shape[3])
 65 |     std = std.repeat(1, 1, data.shape[2], data.shape[3])
 66 | 
 67 |     grad = torch.max(torch.min(grad, mean+3*std), mean-3*std)
 68 | 
 69 |     print(grad.min(), grad.max())
 70 | 
 71 |     grad -= grad.min()
 72 | 
 73 |     grad /= grad.max()
 74 | 
 75 |     grad = grad.cpu().numpy().squeeze()  # (N, 28, 28)
 76 | 
 77 |     grad *= 255.0
 78 | 
 79 |     out_list.append(grad)
 80 | 
 81 | data = data.cpu().numpy().squeeze()  # (N, 28, 28)
 82 | data *= 255.0
 83 | label = label.cpu().numpy()
 84 | 
 85 | out_list.insert(0, data)
 86 | 
 87 | # normalize the grad
 88 | # length = torch.norm(grad, dim=3)
 89 | # length = torch.norm(length, dim=2)
 90 | # length = length.unsqueeze(2).unsqueeze(2)
 91 | # grad /= (length + 1e-5)
 92 | 
 93 | out_num = 5
 94 | 
 95 | fig, _axs = plt.subplots(nrows=len(out_list), ncols=out_num)
 96 | 
 97 | axs = _axs
 98 | 
 99 | 
100 | for j, _type in enumerate(types):
101 |     axs[j, 0].set_ylabel(_type)
102 | 
103 |     if j == 0:
104 |         cmap = 'gray'
105 |     else:
106 |         cmap = 'seismic'
107 | 
108 |     for i in range(out_num):
109 |         axs[j, i].set_xlabel('%d' % label[i])
110 |         axs[j, i].imshow(out_list[j][i], cmap=cmap)
111 | 
112 |         axs[j, i].get_xaxis().set_ticks([])
113 |         axs[j, i].get_yaxis().set_ticks([])
114 | 
115 | plt.tight_layout()
116 | plt.savefig(os.path.join(img_folder, 'mnist_grad_%s.jpg' % args.affix))


--------------------------------------------------------------------------------
/mnist/visualize_attack.py:
--------------------------------------------------------------------------------
  1 | 
  2 | import os
  3 | import torch
  4 | import torchvision as tv
  5 | import numpy as np
  6 | 
  7 | from torch.utils.data import DataLoader
  8 | 
  9 | from utils import makedirs, tensor2cuda, load_model
 10 | from argument import parser
 11 | from visualization import VanillaBackprop
 12 | from attack import FastGradientSignUntargeted
 13 | from model import Model
 14 | 
 15 | import matplotlib.pyplot as plt 
 16 | 
 17 | img_folder = '../img'
 18 | makedirs(img_folder)
 19 | 
 20 | args = parser()
 21 | 
 22 | 
 23 | te_dataset = tv.datasets.MNIST(args.data_root, 
 24 |                                train=False, 
 25 |                                transform=tv.transforms.ToTensor(), 
 26 |                                download=True)
 27 | 
 28 | te_loader = DataLoader(te_dataset, batch_size=args.batch_size, shuffle=False, num_workers=4)
 29 | 
 30 | 
 31 | for data, label in te_loader:
 32 | 
 33 |     data, label = tensor2cuda(data), tensor2cuda(label)
 34 | 
 35 | 
 36 |     break
 37 | 
 38 | types = ['Original', 'Standard', r'$l_{\infty}$-trained', r'$l_2$-trained']
 39 | 
 40 | 
 41 | model_checkpoints = ['../checkpoint/mnist_std_train/checkpoint_56000.pth',
 42 |                      '../checkpoint/mnist_adv_train/checkpoint_56000.pth', 
 43 |                      '../checkpoint/mnist_l2_adv/checkpoint_56000.pth']
 44 | 
 45 | adv_list = []
 46 | pred_list = []
 47 | 
 48 | max_epsilon = 0.8
 49 | 
 50 | perturbation_type = 'linf'
 51 | 
 52 | with torch.no_grad():
 53 |     for checkpoint in model_checkpoints:
 54 | 
 55 |         model = Model(i_c=1, n_c=10)
 56 | 
 57 |         load_model(model, checkpoint)
 58 | 
 59 |         if torch.cuda.is_available():
 60 |             model.cuda()
 61 | 
 62 |         attack = FastGradientSignUntargeted(model, 
 63 |                                             max_epsilon, 
 64 |                                             args.alpha, 
 65 |                                             min_val=0, 
 66 |                                             max_val=1, 
 67 |                                             max_iters=args.k, 
 68 |                                             _type=perturbation_type)
 69 | 
 70 |        
 71 |         adv_data = attack.perturb(data, label, 'mean', False)
 72 | 
 73 |         output = model(adv_data, _eval=True)
 74 |         pred = torch.max(output, dim=1)[1]
 75 |         adv_list.append(adv_data.cpu().numpy().squeeze())  # (N, 28, 28)
 76 |         pred_list.append(pred.cpu().numpy())
 77 | 
 78 | data = data.cpu().numpy().squeeze()  # (N, 28, 28)
 79 | data *= 255.0
 80 | label = label.cpu().numpy()
 81 | 
 82 | adv_list.insert(0, data)
 83 | 
 84 | pred_list.insert(0, label)
 85 | 
 86 | out_num = 5
 87 | 
 88 | fig, _axs = plt.subplots(nrows=len(adv_list), ncols=out_num)
 89 | 
 90 | axs = _axs
 91 | 
 92 | cmap = 'gray'
 93 | for j, _type in enumerate(types):
 94 |     axs[j, 0].set_ylabel(_type)
 95 | 
 96 |     for i in range(out_num):
 97 |         axs[j, i].set_xlabel('%d' % pred_list[j][i])
 98 |         axs[j, i].imshow(adv_list[j][i], cmap=cmap)
 99 | 
100 |         axs[j, i].get_xaxis().set_ticks([])
101 |         axs[j, i].get_yaxis().set_ticks([])
102 | 
103 | plt.tight_layout()
104 | plt.savefig(os.path.join(img_folder, 'mnist_large_%s_%s.jpg' % (perturbation_type, args.affix)))


--------------------------------------------------------------------------------
| | Standard Training | Adversarial Training |
| --- | --- | --- |
| Objective Function | $\min_{\theta} \textrm{E}_{(x, y) \in Dataset}[L(x, y; \theta)]$ | $\min_{\theta} \textrm{E}_{(x, y) \in Dataset}[\max_{\lVert \delta \rVert_p < \epsilon} L(x+\delta, y; \theta)]$ |
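
The inner maximization in the adversarial training objective is usually approximated by iterated signed-gradient steps followed by a projection back into the $\epsilon$-ball. The sketch below illustrates this for the $l_\infty$ case on a toy linear model with a logistic loss, in pure Python. It is not the repo's `FastGradientSignUntargeted` implementation: the parameter names `epsilon`, `alpha`, `max_iters`, `min_val` and `max_val` only mirror the arguments passed to that class in `visualize_attack.py`, and the functions `logistic_loss`, `loss_grad_wrt_input` and `linf_inner_max` are hypothetical helpers introduced for illustration.

```python
import math


def logistic_loss(w, b, x, y):
    """Binary logistic loss of a linear model; the label y is +1 or -1."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    return math.log1p(math.exp(-margin))


def loss_grad_wrt_input(w, b, x, y):
    """Gradient of the logistic loss with respect to the input x."""
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    coeff = -y / (1.0 + math.exp(margin))
    return [coeff * wi for wi in w]


def linf_inner_max(w, b, x, y, epsilon, alpha, max_iters,
                   min_val=0.0, max_val=1.0):
    """Approximate the max of L(x + delta, y) over the closed l_inf ball
    of radius epsilon by signed gradient ascent with projection."""
    x_adv = list(x)
    for _ in range(max_iters):
        grad = loss_grad_wrt_input(w, b, x_adv, y)
        for i, g in enumerate(grad):
            sign = (g > 0) - (g < 0)
            v = x_adv[i] + alpha * sign
            # Project back into the epsilon-ball around the clean input.
            v = max(x[i] - epsilon, min(x[i] + epsilon, v))
            # Keep the value inside the valid input range.
            x_adv[i] = max(min_val, min(max_val, v))
    return x_adv


# Toy usage: the perturbed input stays within epsilon of x but has a higher loss.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.5], 1
x_adv = linf_inner_max(w, b, x, y, epsilon=0.3, alpha=0.1, max_iters=10)
print(logistic_loss(w, b, x, y), logistic_loss(w, b, x_adv, y))
```

Adversarial training then minimizes the loss at `x_adv` instead of at `x`. For an $l_2$ constraint, the sign step and the per-coordinate clamp would be replaced by a normalized gradient step and a rescaling of the perturbation onto the $l_2$ ball.
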
| Training | Standard Accuracy (train/test) | Robustness Accuracy (train/test) | Madry's Model Standard Accuracy (train/test) | Madry's Model Robustness Accuracy (train/test) |
| --- | --- | --- | --- | --- |
| Standard Training | 92.19/87.14 | 0.00/7.85 | - | - |
| l_inf Training | 79.69/78.09 | 61.72/63.8 | -/79.22 | -/55.97 |
| l_2 Training | 89.84/85.39 | 76.56/77.76 | -/85.81 | -/71.87 |