├── LICENSE
├── README.md
├── cifar10_data
├── batches.meta
├── data_batch_1
├── data_batch_2
├── data_batch_3
├── data_batch_4
├── data_batch_5
├── readme.html
└── test_batch
├── cifar10_input.py
├── cifar10_model.py
├── cleverhans_models.py
├── config.json
├── config_cifar10.json
├── eval.py
├── eval_ch.py
├── eval_fb.py
├── model.py
├── pgd_attack.py
├── requirements.txt
├── scripts
├── __init__.py
├── eval_cifar_lps.py
├── eval_cifar_spatial.py
├── eval_mnist_lps.py
├── eval_mnist_spatial.py
└── utils.py
└── train.py
/LICENSE:
--------------------------------------------------------------------------------
1 | MIT License
2 |
3 | Copyright (c) 2018 ftramer
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Adversarial Training and Robustness for Multiple Perturbations
2 |
3 | Code for the paper:
4 |
5 | **Adversarial Training and Robustness for Multiple Perturbations**
6 | *Florian Tramèr and Dan Boneh*
7 | Conference on Neural Information Processing Systems (NeurIPS), 2019
8 | https://arxiv.org/abs/1904.13000
9 |
10 | Our work studies the scalability and effectiveness of adversarial training for achieving robustness against a combination of multiple types of adversarial examples.
11 | We currently implement multiple Lp-bounded attacks (L1, L2, Linf) as well as rotation-translation attacks, for both MNIST and CIFAR10.
12 |
13 | Before training a model, edit the `config.json` file to specify the training, attack, and evaluation parameters. The given `config.json` file can be used as a basis for MNIST experiments, while the `config_cifar10.json` file has the appropriate hyperparameters for CIFAR10.
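
For reference, the bundled MNIST configuration has the following overall shape (abridged from `config.json`):

```json
{
    "data": "mnist",
    "model_type": "cnn",
    "max_num_training_steps": 6000,
    "training_batch_size": 100,
    "multi_attack_mode": "MAX",
    "attacks": [
        {"type": "linf", "epsilon": 0.3, "k": 10, "a": 0.01, "random_start": true},
        {"type": "l1", "epsilon": 10, "k": 10, "a": 1.0, "perc": 99, "random_start": true}
    ],
    "train_attacks": [0, 1],
    "eval_attacks": [0, 1]
}
```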
14 |
15 | ## Training
16 |
17 | To train, simply run:
18 |
19 | ```bash
20 | python train.py output/dir/
21 | ```
22 | This will read the `config.json` file from the current directory, and save the trained model, the training logs, and a copy of the config file into `output/dir/`.
23 |
24 | ## Evaluation
25 |
26 | We performed a fairly thorough evaluation of the models we trained, using a wide range of attacks. Unfortunately, there is currently no single library implementing all of these attacks, so we combined several. Some attacks we implemented ourselves (different forms of PGD and rotation-translation); others are taken from [Cleverhans](https://github.com/tensorflow/cleverhans) and from [Foolbox](https://github.com/bethgelab/foolbox).
27 | Our [evaluation scripts](scripts/) can give you an idea of how we evaluate a model against all attacks.
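
For instance, the in-repo attacks can be run directly from the evaluation entry points (flags as defined in `eval.py` and `eval_ch.py`; the bound below is illustrative and matches the L1 budget in `config.json`):

```bash
# PGD and rotation-translation attacks, as specified by the model's config
python eval.py output/dir/

# Cleverhans EAD attack, reported at an L1 bound of 10
python eval_ch.py output/dir/ --norm l1 --bound 10
```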
28 |
29 | ## Config options
30 |
31 | Many hyperparameters in the `config.json` file are standard and self-explanatory.
32 | Specific to our work are the following parameters you may consider tuning:
33 |
34 | * `"multi_attack_mode"`: When training against multiple attacks, this flag indicates whether to train against examples from all attacks (default), or only on the worst example for each input (`"MAX"`). For the wide ResNet model on CIFAR10, the default option causes memory overflows because the resulting batches become too large. The `"HALF_BATCH_HALF_LR"` flag halves the batch size (and the learning rate accordingly) to avoid overflows.
35 |
36 | * `"attacks"`: This list specifies the attacks used for either training or evaluation (or both). The parameters are standard, except for our new L1 attack. It comes with a `"perc"` parameter that specifies the sparsity of the gradient updates (see the paper for details), and a step-size multiplier (`"a"`). The value of the `"perc"` parameter can be a range (e.g., `[80, 99]`), in which case the sparsity of each gradient update in an attack is sampled uniformly from that range. Each attack can take a `"reps"` parameter (default: 1) that specifies the number of times the attack should be repeated.
37 |
38 | * `"train_attacks"` and `"eval_attacks"`: Specify which of the attacks defined under `"attacks"` should be used for training or evaluation. These are lists of indices into `"attacks"`. For example, `"train_attacks": [0, 1, 2]` means that the first three defined attacks are used for training.
39 | Our paper also defines a new type of *affine attack* that interpolates between two attack types. You can specify an affine attack via a tuple of attacks: e.g., `"eval_attacks": [0, [1, 2]]` will evaluate against the first attack, and against an affine attack that interpolates between the second and third attacks. The weighting used by the affine attack can be specified by adding a `"weight"` parameter to the attack parameters; a combined example is sketched below.
40 |
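Putting these options together, an attack configuration might look as follows (values adapted from the bundled config files; the `"perc"` range, `"reps"`, and `"weight"` settings are purely illustrative):

```json
"multi_attack_mode": "MAX",
"attacks": [
    {"type": "linf", "epsilon": 0.3, "k": 10, "a": 0.01, "random_start": true},
    {"type": "l2", "epsilon": 2, "k": 10, "a": 0.1, "random_start": true, "weight": 0.5},
    {"type": "l1", "epsilon": 10, "k": 10, "a": 1.0, "perc": [80, 99], "reps": 2, "random_start": true}
],
"train_attacks": [0, 1, 2],
"eval_attacks": [0, [1, 2]]
```

With this fragment, training uses the worst of the three attacks for each input (`"MAX"`), while evaluation runs the Linf attack as well as an affine attack that interpolates between the L2 and L1 attacks, with the mixture controlled by the `"weight"` parameter.
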
41 | ## Acknowledgments
42 | Parts of the codebase are inspired or directly borrowed from:
43 | * https://github.com/MadryLab/cifar10_challenge
44 | * https://github.com/MadryLab/adversarial_spatial
45 |
46 |
47 | ## Citation
48 |
49 | If our code or our results are useful in your research, please consider citing:
50 |
51 | ```bibtex
52 | @inproceedings{TB19,
53 | author={Tram{\`e}r, Florian and Boneh, Dan},
54 | title={Adversarial Training and Robustness for Multiple Perturbations},
55 | booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
56 | year={2019},
57 | howpublished={arXiv preprint arXiv:1904.13000},
58 | url={https://arxiv.org/abs/1904.13000}
59 | }
60 | ```
61 |
62 |
--------------------------------------------------------------------------------
/cifar10_data/batches.meta:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/cifar10_data/batches.meta
--------------------------------------------------------------------------------
/cifar10_data/data_batch_1:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/cifar10_data/data_batch_1
--------------------------------------------------------------------------------
/cifar10_data/data_batch_2:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/cifar10_data/data_batch_2
--------------------------------------------------------------------------------
/cifar10_data/data_batch_3:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/cifar10_data/data_batch_3
--------------------------------------------------------------------------------
/cifar10_data/data_batch_4:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/cifar10_data/data_batch_4
--------------------------------------------------------------------------------
/cifar10_data/data_batch_5:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/cifar10_data/data_batch_5
--------------------------------------------------------------------------------
/cifar10_data/readme.html:
--------------------------------------------------------------------------------
1 |
2 |
--------------------------------------------------------------------------------
/cifar10_data/test_batch:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/cifar10_data/test_batch
--------------------------------------------------------------------------------
/cifar10_input.py:
--------------------------------------------------------------------------------
1 | """
2 | Utilities for importing the CIFAR10 dataset.
3 |
4 | Each image in the dataset is a numpy array of shape (32, 32, 3), with the values
5 | being unsigned integers (i.e., in the range 0,1,...,255).
6 | """
7 |
8 | from __future__ import absolute_import
9 | from __future__ import division
10 | from __future__ import print_function
11 |
12 | import os
13 | import pickle
14 | import random
15 | import sys
16 | import tensorflow as tf
17 | version = sys.version_info
18 |
19 | import numpy as np
20 |
21 | class CIFAR10Data(object):
22 | """
23 | Unpickles the CIFAR10 dataset from a specified folder containing a pickled
24 | version following the format of Krizhevsky which can be found
25 | [here](https://www.cs.toronto.edu/~kriz/cifar.html).
26 |
27 | Inputs to constructor
28 | =====================
29 |
30 |     - path: path to the pickled dataset. The training data must be
31 |     pickled into five files named data_batch_i for i = 1, ..., 5,
32 |     containing 10,000 examples each; the test data must be pickled into
33 |     a single file called test_batch containing 10,000 examples; and the
34 |     10 class names must be pickled into a file called batches.meta. The
35 |     pickled examples should be stored as a tuple of two objects: an
36 |     array of 10,000 32x32x3-shaped arrays, and an array of their 10,000
37 |     true labels.
38 |
39 | """
40 | def __init__(self, path):
41 | train_filenames = ['data_batch_{}'.format(ii + 1) for ii in range(5)]
42 | eval_filename = 'test_batch'
43 | metadata_filename = 'batches.meta'
44 |
45 | train_images = np.zeros((50000, 32, 32, 3), dtype='uint8')
46 | train_labels = np.zeros(50000, dtype='int32')
47 | for ii, fname in enumerate(train_filenames):
48 | cur_images, cur_labels = self._load_datafile(
49 | os.path.join(path, fname))
50 | train_images[ii * 10000 : (ii+1) * 10000, ...] = cur_images
51 | train_labels[ii * 10000 : (ii+1) * 10000, ...] = cur_labels
52 | eval_images, eval_labels = self._load_datafile(
53 | os.path.join(path, eval_filename))
54 |
55 | with open(os.path.join(path, metadata_filename), 'rb') as fo:
56 | if version.major == 3:
57 | data_dict = pickle.load(fo, encoding='bytes')
58 | else:
59 | data_dict = pickle.load(fo)
60 |
61 | self.label_names = data_dict[b'label_names']
62 | for ii in range(len(self.label_names)):
63 | self.label_names[ii] = self.label_names[ii].decode('utf-8')
64 |
65 | self.train_data = Dataset(train_images, train_labels)
66 | self.eval_data = Dataset(eval_images, eval_labels)
67 |
68 | @staticmethod
69 | def _load_datafile(filename):
70 | with open(filename, 'rb') as fo:
71 | if version.major == 3:
72 | data_dict = pickle.load(fo, encoding='bytes')
73 | else:
74 | data_dict = pickle.load(fo)
75 |
76 | assert data_dict[b'data'].dtype == np.uint8
77 | image_data = data_dict[b'data']
78 | image_data = image_data.reshape((10000, 3, 32, 32)).transpose(0,2,3,1)
79 | return image_data, np.array(data_dict[b'labels'])
80 |
81 | class AugmentedCIFAR10Data(object):
82 | """
83 | Data augmentation wrapper over a loaded dataset.
84 |
85 | Inputs to constructor
86 | =====================
87 | - raw_cifar10data: the loaded CIFAR10 dataset, via the CIFAR10Data class
88 | - sess: current tensorflow session
89 | """
90 | def __init__(self, raw_cifar10data, sess):
91 | assert isinstance(raw_cifar10data, CIFAR10Data)
92 | self.image_size = 32
93 |
94 | # create augmentation computational graph
95 | self.x_input_placeholder = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
96 | padded = tf.map_fn(lambda img: tf.image.resize_image_with_crop_or_pad(
97 | img, self.image_size + 4, self.image_size + 4),
98 | self.x_input_placeholder)
99 | cropped = tf.map_fn(lambda img: tf.random_crop(img, [self.image_size,
100 | self.image_size,
101 | 3]), padded)
102 | flipped = tf.map_fn(lambda img: tf.image.random_flip_left_right(img), cropped)
103 | self.augmented = flipped
104 |
105 | self.train_data = AugmentedDataset(raw_cifar10data.train_data, sess,
106 | self.x_input_placeholder,
107 | self.augmented)
108 | self.eval_data = AugmentedDataset(raw_cifar10data.eval_data, sess,
109 | self.x_input_placeholder,
110 | self.augmented)
111 | self.label_names = raw_cifar10data.label_names
112 |
113 |
114 | class Dataset(object):
115 | """
116 | Dataset object implementing a simple batching procedure.
117 | """
118 | def __init__(self, xs, ys):
119 | self.xs = xs
120 | self.n = xs.shape[0]
121 | self.ys = ys
122 | self.batch_start = 0
123 | self.cur_order = np.random.permutation(self.n)
124 |
125 | def get_next_batch(self, batch_size, multiple_passes=False,
126 | reshuffle_after_pass=True):
127 | if self.n < batch_size:
128 | raise ValueError('Batch size can be at most the dataset size')
129 | if not multiple_passes:
130 | actual_batch_size = min(batch_size, self.n - self.batch_start)
131 | if actual_batch_size <= 0:
132 | raise ValueError('Pass through the dataset is complete.')
133 | batch_end = self.batch_start + actual_batch_size
134 | batch_xs = self.xs[self.cur_order[self.batch_start : batch_end],...]
135 | batch_ys = self.ys[self.cur_order[self.batch_start : batch_end],...]
136 | self.batch_start += actual_batch_size
137 | return batch_xs, batch_ys
138 | actual_batch_size = min(batch_size, self.n - self.batch_start)
139 | if actual_batch_size < batch_size:
140 | if reshuffle_after_pass:
141 | self.cur_order = np.random.permutation(self.n)
142 | self.batch_start = 0
143 | batch_end = self.batch_start + batch_size
144 | batch_xs = self.xs[self.cur_order[self.batch_start : batch_end], ...]
145 | batch_ys = self.ys[self.cur_order[self.batch_start : batch_end], ...]
146 |         self.batch_start += batch_size  # advance by the full batch size (actual_batch_size is stale after a reshuffle)
147 | return batch_xs, batch_ys
148 |
149 |
150 | class AugmentedDataset(object):
151 | """
152 | Dataset object with built-in data augmentation. When performing
153 | adversarial attacks, we cannot include data augmentation as part of the
154 | model. If we do the adversary will try to backprop through it.
155 | """
156 | def __init__(self, raw_datasubset, sess, x_input_placeholder,
157 | augmented):
158 | self.sess = sess
159 | self.raw_datasubset = raw_datasubset
160 | self.x_input_placeholder = x_input_placeholder
161 | self.augmented = augmented
162 |
163 | def get_next_batch(self, batch_size, multiple_passes=False,
164 | reshuffle_after_pass=True):
165 | raw_batch = self.raw_datasubset.get_next_batch(batch_size,
166 | multiple_passes,
167 | reshuffle_after_pass)
168 | images = raw_batch[0].astype(np.float32)
169 |
170 | # return both the raw and augmented input
171 | # for adversarial training with rotation/translations, we start
172 | # from the raw input to avoid compounding augmentations
173 | return (raw_batch[0],
174 | self.sess.run(
175 | self.augmented,
176 | feed_dict={self.x_input_placeholder: raw_batch[0]}),
177 | raw_batch[1])
178 |
179 |
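# Usage sketch (assumes the CIFAR10 pickle files live in ./cifar10_data, as
# set via "data_path" in config_cifar10.json):
#
#   cifar = CIFAR10Data('cifar10_data')
#   xs, ys = cifar.train_data.get_next_batch(128, multiple_passes=True)
#   # xs: uint8 images of shape (128, 32, 32, 3); ys: int32 labels
#
#   sess = tf.Session()
#   augmented = AugmentedCIFAR10Data(cifar, sess)
#   raw, aug, ys = augmented.train_data.get_next_batch(128, multiple_passes=True)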
--------------------------------------------------------------------------------
/cifar10_model.py:
--------------------------------------------------------------------------------
1 | # based on https://github.com/tensorflow/models/tree/master/resnet
2 | from __future__ import absolute_import
3 | from __future__ import division
4 | from __future__ import print_function
5 |
6 | import numpy as np
7 | import tensorflow as tf
8 |
9 | class Model(object):
10 | """ResNet model."""
11 |
12 | def __init__(self, config):
13 | """ResNet constructor.
14 | """
15 | self._build_model(config)
16 |
17 | def add_internal_summaries(self):
18 | pass
19 |
20 | def _stride_arr(self, stride):
21 | """Map a stride scalar to the stride array for tf.nn.conv2d."""
22 | return [1, stride, stride, 1]
23 |
24 | def _build_model(self, config, pad_mode='CONSTANT', pad_size=32):
25 | """Build the core model within the graph."""
26 | with tf.variable_scope('input'):
27 | filters = config['filters']
28 |
29 | self.x_input = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
30 | self.y_input = tf.placeholder(tf.int64, shape=None)
31 |
32 | self.transform = tf.placeholder_with_default(tf.zeros((tf.shape(self.x_input)[0], 3)), shape=[None, 3])
33 | trans_x, trans_y, rot = tf.unstack(self.transform, axis=1)
34 | rot *= np.pi / 180 # convert degrees to radians
35 |
36 | self.is_training = tf.placeholder_with_default(False, [])
37 |
38 | x = self.x_input
39 | x = tf.pad(x, [[0,0], [16,16], [16,16], [0,0]], pad_mode)
40 | #rotate and translate image
41 | ones = tf.ones(shape=tf.shape(trans_x))
42 | zeros = tf.zeros(shape=tf.shape(trans_x))
43 | trans = tf.stack([ones, zeros, -trans_x,
44 | zeros, ones, -trans_y,
45 | zeros, zeros], axis=1)
46 | x = tf.contrib.image.rotate(x, rot, interpolation='BILINEAR')
47 | x = tf.contrib.image.transform(x, trans, interpolation='BILINEAR')
48 | x = tf.image.resize_image_with_crop_or_pad(x, pad_size, pad_size)
49 |
50 | # everything below this point is generic (independent of spatial attacks)
51 | self.x_image = x
52 | x = tf.map_fn(lambda img: tf.image.per_image_standardization(img), x)
53 |
54 | x = self._conv('init_conv', x, 3, 3, 16, self._stride_arr(1))
55 |
56 | strides = [1, 2, 2]
57 | activate_before_residual = [True, False, False]
58 | res_func = self._residual
59 |
60 | with tf.variable_scope('unit_1_0'):
61 | x = res_func(x, filters[0], filters[1], self._stride_arr(strides[0]),
62 | activate_before_residual[0])
63 | for i in range(1, 5):
64 | with tf.variable_scope('unit_1_%d' % i):
65 | x = res_func(x, filters[1], filters[1], self._stride_arr(1), False)
66 |
67 | with tf.variable_scope('unit_2_0'):
68 | x = res_func(x, filters[1], filters[2], self._stride_arr(strides[1]),
69 | activate_before_residual[1])
70 | for i in range(1, 5):
71 | with tf.variable_scope('unit_2_%d' % i):
72 | x = res_func(x, filters[2], filters[2], self._stride_arr(1), False)
73 |
74 | with tf.variable_scope('unit_3_0'):
75 | x = res_func(x, filters[2], filters[3], self._stride_arr(strides[2]),
76 | activate_before_residual[2])
77 | for i in range(1, 5):
78 | with tf.variable_scope('unit_3_%d' % i):
79 | x = res_func(x, filters[3], filters[3], self._stride_arr(1), False)
80 |
81 | with tf.variable_scope('unit_last'):
82 | x = self._batch_norm('final_bn', x)
83 | x = self._relu(x, 0.1)
84 | x = self._global_avg_pool(x)
85 |
86 |     # uncomment to add an extra fc layer
87 | #with tf.variable_scope('unit_fc'):
88 | # self.pre_softmax = self._fully_connected(x, 1024)
89 | # x = self._relu(x, 0.1)
90 |
91 | with tf.variable_scope('logit'):
92 | self.pre_softmax = self._fully_connected(x, 10)
93 |
94 | self.predictions = tf.argmax(self.pre_softmax, 1)
95 | self.correct_prediction = tf.equal(self.predictions, self.y_input)
96 | self.num_correct = tf.reduce_sum(
97 | tf.cast(self.correct_prediction, tf.int64))
98 | self.accuracy = tf.reduce_mean(
99 | tf.cast(self.correct_prediction, tf.float32))
100 |
101 | with tf.variable_scope('costs'):
102 | self.y_xent = tf.nn.sparse_softmax_cross_entropy_with_logits(
103 | logits=self.pre_softmax, labels=self.y_input)
104 | self.xent = tf.reduce_sum(self.y_xent, name='y_xent')
105 | self.mean_xent = tf.reduce_mean(self.y_xent)
106 | self.weight_decay_loss = self._decay()
107 |
108 | def _batch_norm(self, name, x):
109 | """Batch normalization."""
110 | with tf.name_scope(name):
111 | return tf.contrib.layers.batch_norm(
112 | inputs=x,
113 | decay=.9,
114 | center=True,
115 | scale=True,
116 | activation_fn=None,
117 | updates_collections=None,
118 | is_training=self.is_training)
119 |
120 | def _residual(self, x, in_filter, out_filter, stride,
121 | activate_before_residual=False):
122 | """Residual unit with 2 sub layers."""
123 | if activate_before_residual:
124 | with tf.variable_scope('shared_activation'):
125 | x = self._batch_norm('init_bn', x)
126 | x = self._relu(x, 0.1)
127 | orig_x = x
128 | else:
129 | with tf.variable_scope('residual_only_activation'):
130 | orig_x = x
131 | x = self._batch_norm('init_bn', x)
132 | x = self._relu(x, 0.1)
133 |
134 | with tf.variable_scope('sub1'):
135 | x = self._conv('conv1', x, 3, in_filter, out_filter, stride)
136 |
137 | with tf.variable_scope('sub2'):
138 | x = self._batch_norm('bn2', x)
139 | x = self._relu(x, 0.1)
140 | x = self._conv('conv2', x, 3, out_filter, out_filter, [1, 1, 1, 1])
141 |
142 | with tf.variable_scope('sub_add'):
143 | if in_filter != out_filter:
144 | orig_x = tf.nn.avg_pool(orig_x, stride, stride, 'VALID')
145 | orig_x = tf.pad(
146 | orig_x, [[0, 0], [0, 0], [0, 0],
147 | [(out_filter-in_filter)//2, (out_filter-in_filter)//2]])
148 | x += orig_x
149 |
150 | tf.logging.debug('image after unit %s', x.get_shape())
151 | return x
152 |
153 | def _decay(self):
154 | """L2 weight decay loss."""
155 | costs = []
156 | for var in tf.trainable_variables():
157 | if var.op.name.find('DW') >= 0:
158 | costs.append(tf.nn.l2_loss(var))
159 | return tf.add_n(costs)
160 |
161 | def _conv(self, name, x, filter_size, in_filters, out_filters, strides):
162 | """Convolution."""
163 | with tf.variable_scope(name):
164 | n = filter_size * filter_size * out_filters
165 | kernel = tf.get_variable(
166 | 'DW', [filter_size, filter_size, in_filters, out_filters],
167 | tf.float32, initializer=tf.random_normal_initializer(
168 | stddev=np.sqrt(2.0/n)))
169 | return tf.nn.conv2d(x, kernel, strides, padding='SAME')
170 |
171 | def _relu(self, x, leakiness=0.0):
172 | """Relu, with optional leaky support."""
173 | return tf.where(tf.less(x, 0.0), leakiness * x, x, name='leaky_relu')
174 |
175 | def _fully_connected(self, x, out_dim):
176 | """FullyConnected layer for final output."""
177 | num_non_batch_dimensions = len(x.shape)
178 | prod_non_batch_dimensions = 1
179 | for ii in range(num_non_batch_dimensions - 1):
180 | prod_non_batch_dimensions *= int(x.shape[ii + 1])
181 | x = tf.reshape(x, [tf.shape(x)[0], -1])
182 | w = tf.get_variable(
183 | 'DW', [prod_non_batch_dimensions, out_dim],
184 | initializer=tf.uniform_unit_scaling_initializer(factor=1.0))
185 | b = tf.get_variable('biases', [out_dim],
186 | initializer=tf.constant_initializer())
187 | return tf.nn.xw_plus_b(x, w, b)
188 |
189 | def _global_avg_pool(self, x):
190 | assert x.get_shape().ndims == 4
191 | return tf.reduce_mean(x, [1, 2])
192 |
193 |
194 |
--------------------------------------------------------------------------------
/cleverhans_models.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from __future__ import division
3 | from __future__ import print_function
4 |
5 | from collections import OrderedDict
6 | from cleverhans.model import Model
7 | from cleverhans.utils import deterministic_dict
8 | from cleverhans.dataset import Factory, MNIST
9 | import numpy as np
10 | import tensorflow as tf
11 | from cleverhans.serial import NoRefModel
12 |
13 |
14 | class Layer(object):
15 |
16 | def get_output_shape(self):
17 | return self.output_shape
18 |
19 |
20 | class ResNet(NoRefModel):
21 | """ResNet model."""
22 |
23 | def __init__(self, layers, input_shape, scope=None):
24 | """ResNet constructor.
25 | :param layers: a list of layers in CleverHans format
26 | each with set_input_shape() and fprop() methods.
27 |         :param input_shape: 4-tuple describing input shape (e.g., (None, 32, 32, 3))
28 | :param scope: string name of scope for Variables
29 | This works in two ways.
30 | If scope is None, the variables are not put in a scope, and the
31 | model is compatible with Saver.restore from the public downloads
32 | for the CIFAR10 Challenge.
33 | If the scope is a string, then Saver.restore won't work, but the
34 |         model functions as a picklable NoRefModel that finds its variables
35 | based on the scope.
36 | """
37 | super(ResNet, self).__init__(scope, 10, {}, scope is not None)
38 | if scope is None:
39 | before = list(tf.trainable_variables())
40 | before_vars = list(tf.global_variables())
41 | self.build(layers, input_shape)
42 | after = list(tf.trainable_variables())
43 | after_vars = list(tf.global_variables())
44 | self.params = [param for param in after if param not in before]
45 | self.vars = [var for var in after_vars if var not in before_vars]
46 | else:
47 | with tf.variable_scope(self.scope, reuse=tf.AUTO_REUSE):
48 | self.build(layers, input_shape)
49 |
50 | def get_vars(self):
51 | if hasattr(self, "vars"):
52 | return self.vars
53 | return super(ResNet, self).get_vars()
54 |
55 | def build(self, layers, input_shape):
56 | self.layer_names = []
57 | self.layers = layers
58 | self.input_shape = input_shape
59 | if isinstance(layers[-1], Softmax):
60 | layers[-1].name = 'probs'
61 | layers[-2].name = 'logits'
62 | else:
63 | layers[-1].name = 'logits'
64 | for i, layer in enumerate(self.layers):
65 | if hasattr(layer, 'name'):
66 | name = layer.name
67 | else:
68 | name = layer.__class__.__name__ + str(i)
69 | layer.name = name
70 | self.layer_names.append(name)
71 |
72 | layer.set_input_shape(input_shape)
73 | input_shape = layer.get_output_shape()
74 |
75 | def make_input_placeholder(self):
76 | return tf.placeholder(tf.float32, (None, 32, 32, 3))
77 |
78 | def make_label_placeholder(self):
79 | return tf.placeholder(tf.float32, (None, 10))
80 |
81 | def fprop(self, x, set_ref=False):
82 | x = x * 255.0
83 | if self.scope is not None:
84 | with tf.variable_scope(self.scope, reuse=tf.AUTO_REUSE):
85 | return self._fprop(x, set_ref)
86 | return self._fprop(x, set_ref)
87 |
88 | def _fprop(self, x, set_ref=False):
89 | states = []
90 | for layer in self.layers:
91 | if set_ref:
92 | layer.ref = x
93 | x = layer.fprop(x)
94 | assert x is not None
95 | states.append(x)
96 | states = dict(zip(self.layer_names, states))
97 | return states
98 |
99 | def add_internal_summaries(self):
100 | pass
101 |
102 |
103 | def _stride_arr(stride):
104 | """Map a stride scalar to the stride array for tf.nn.conv2d."""
105 | return [1, stride, stride, 1]
106 |
107 |
108 | class Input(Layer):
109 |
110 | def __init__(self):
111 | pass
112 |
113 | def set_input_shape(self, input_shape):
114 | batch_size, rows, cols, input_channels = input_shape
115 | # assert self.mode == 'train' or self.mode == 'eval'
116 | """Build the core model within the graph."""
117 | input_shape = list(input_shape)
118 | input_shape[0] = 1
119 | dummy_batch = tf.zeros(input_shape)
120 | dummy_output = self.fprop(dummy_batch)
121 | output_shape = [int(e) for e in dummy_output.get_shape()]
122 | output_shape[0] = batch_size
123 | self.output_shape = tuple(output_shape)
124 |
125 | def fprop(self, x):
126 | with tf.variable_scope('input', reuse=tf.AUTO_REUSE):
127 | input_standardized = tf.map_fn(
128 | lambda img: tf.image.per_image_standardization(img), x)
129 | return _conv('init_conv', input_standardized,
130 | 3, 3, 16, _stride_arr(1))
131 |
132 | class Conv2D(Layer):
133 |
134 | def __init__(self, filters):
135 | self.filters = filters
136 |
137 | assert filters == [16, 16, 32, 64] or filters == [16, 160, 320, 640]
138 |
139 | pass
140 |
141 | def set_input_shape(self, input_shape):
142 | batch_size, rows, cols, input_channels = input_shape
143 |
144 |         # With filters=[16, 160, 320, 640] this builds the w28-10 wide
145 |         # residual network, which is more memory efficient than a very
146 |         # deep residual network and has comparably good performance.
147 |         # https://arxiv.org/pdf/1605.07146v1.pdf
148 | input_shape = list(input_shape)
149 | input_shape[0] = 1
150 | dummy_batch = tf.zeros(input_shape)
151 | dummy_output = self.fprop(dummy_batch)
152 | output_shape = [int(e) for e in dummy_output.get_shape()]
153 | output_shape[0] = batch_size
154 | self.output_shape = tuple(output_shape)
155 |
156 | def fprop(self, x):
157 |
158 |         # 5 residual units per stage (unit_X_0 plus unit_X_1..unit_X_4 below)
159 | strides = [1, 2, 2]
160 | activate_before_residual = [True, False, False]
161 | filters = self.filters
162 |
163 | res_func = _residual
164 | with tf.variable_scope('unit_1_0', reuse=tf.AUTO_REUSE):
165 | x = res_func(x, filters[0], filters[1], _stride_arr(strides[0]),
166 | activate_before_residual[0])
167 | for i in range(1, 5):
168 | with tf.variable_scope(('unit_1_%d' % i), reuse=tf.AUTO_REUSE):
169 | x = res_func(x, filters[1], filters[1],
170 | _stride_arr(1), False)
171 |
172 | with tf.variable_scope(('unit_2_0'), reuse=tf.AUTO_REUSE):
173 | x = res_func(x, filters[1], filters[2], _stride_arr(strides[1]),
174 | activate_before_residual[1])
175 | for i in range(1, 5):
176 | with tf.variable_scope(('unit_2_%d' % i), reuse=tf.AUTO_REUSE):
177 | x = res_func(x, filters[2], filters[2],
178 | _stride_arr(1), False)
179 |
180 | with tf.variable_scope(('unit_3_0'), reuse=tf.AUTO_REUSE):
181 | x = res_func(x, filters[2], filters[3], _stride_arr(strides[2]),
182 | activate_before_residual[2])
183 | for i in range(1, 5):
184 | with tf.variable_scope(('unit_3_%d' % i), reuse=tf.AUTO_REUSE):
185 | x = res_func(x, filters[3], filters[3],
186 | _stride_arr(1), False)
187 |
188 | with tf.variable_scope(('unit_last'), reuse=tf.AUTO_REUSE):
189 | x = _batch_norm('final_bn', x)
190 | x = _relu(x, 0.1)
191 | x = _global_avg_pool(x)
192 |
193 | return x
194 |
195 |
196 | class Linear(Layer):
197 |
198 | def __init__(self, num_hid):
199 | self.num_hid = num_hid
200 |
201 | def set_input_shape(self, input_shape):
202 | batch_size, dim = input_shape
203 | self.input_shape = [batch_size, dim]
204 | self.dim = dim
205 | self.output_shape = [batch_size, self.num_hid]
206 | self.make_vars()
207 |
208 | def make_vars(self):
209 | with tf.variable_scope('logit', reuse=tf.AUTO_REUSE):
210 | w = tf.get_variable(
211 | 'DW', [self.dim, self.num_hid],
212 | initializer=tf.initializers.variance_scaling(
213 | distribution='uniform'))
214 | b = tf.get_variable('biases', [self.num_hid],
215 | initializer=tf.initializers.constant())
216 | return w, b
217 |
218 | def fprop(self, x):
219 | w, b = self.make_vars()
220 | return tf.nn.xw_plus_b(x, w, b)
221 |
222 |
223 | def _batch_norm(name, x):
224 | """Batch normalization."""
225 | with tf.name_scope(name):
226 | return tf.contrib.layers.batch_norm(
227 | inputs=x,
228 | decay=.9,
229 | center=True,
230 | scale=True,
231 | activation_fn=None,
232 | updates_collections=None,
233 | is_training=False)
234 |
235 |
236 | def _residual(x, in_filter, out_filter, stride,
237 | activate_before_residual=False):
238 | """Residual unit with 2 sub layers."""
239 | if activate_before_residual:
240 | with tf.variable_scope('shared_activation', reuse=tf.AUTO_REUSE):
241 | x = _batch_norm('init_bn', x)
242 | x = _relu(x, 0.1)
243 | orig_x = x
244 | else:
245 | with tf.variable_scope('residual_only_activation', reuse=tf.AUTO_REUSE):
246 | orig_x = x
247 | x = _batch_norm('init_bn', x)
248 | x = _relu(x, 0.1)
249 |
250 | with tf.variable_scope('sub1', reuse=tf.AUTO_REUSE):
251 | x = _conv('conv1', x, 3, in_filter, out_filter, stride)
252 |
253 | with tf.variable_scope('sub2', reuse=tf.AUTO_REUSE):
254 | x = _batch_norm('bn2', x)
255 | x = _relu(x, 0.1)
256 | x = _conv('conv2', x, 3, out_filter, out_filter, [1, 1, 1, 1])
257 |
258 | with tf.variable_scope('sub_add', reuse=tf.AUTO_REUSE):
259 | if in_filter != out_filter:
260 | orig_x = tf.nn.avg_pool(orig_x, stride, stride, 'VALID')
261 | orig_x = tf.pad(
262 | orig_x, [[0, 0], [0, 0],
263 | [0, 0], [(out_filter - in_filter) // 2,
264 | (out_filter - in_filter) // 2]])
265 | x += orig_x
266 |
267 | tf.logging.debug('image after unit %s', x.get_shape())
268 | return x
269 |
270 |
271 | def _decay():
272 | """L2 weight decay loss."""
273 | costs = []
274 | for var in tf.trainable_variables():
275 |         if var.op.name.find('DW') >= 0:
276 | costs.append(tf.nn.l2_loss(var))
277 | return tf.add_n(costs)
278 |
279 |
280 | def _conv(name, x, filter_size, in_filters, out_filters, strides):
281 | """Convolution."""
282 | with tf.variable_scope(name, reuse=tf.AUTO_REUSE):
283 | n = filter_size * filter_size * out_filters
284 | kernel = tf.get_variable(
285 | 'DW', [filter_size, filter_size, in_filters, out_filters],
286 | tf.float32, initializer=tf.random_normal_initializer(
287 | stddev=np.sqrt(2.0 / n)))
288 | return tf.nn.conv2d(x, kernel, strides, padding='SAME')
289 |
290 |
291 | def _relu(x, leakiness=0.0):
292 | """Relu, with optional leaky support."""
293 | return tf.where(tf.less(x, 0.0), leakiness * x, x, name='leaky_relu')
294 |
295 |
296 | def _global_avg_pool(x):
297 | assert x.get_shape().ndims == 4
298 | return tf.reduce_mean(x, [1, 2])
299 |
300 |
301 | class Softmax(Layer):
302 |
303 | def __init__(self):
304 | pass
305 |
306 | def set_input_shape(self, shape):
307 | self.input_shape = shape
308 | self.output_shape = shape
309 |
310 | def fprop(self, x):
311 | return tf.nn.softmax(x)
312 |
313 |
314 | class Flatten(Layer):
315 |
316 | def __init__(self):
317 | pass
318 |
319 | def set_input_shape(self, shape):
320 | self.input_shape = shape
321 | output_width = 1
322 | for factor in shape[1:]:
323 | output_width *= factor
324 | self.output_width = output_width
325 | self.output_shape = [None, output_width]
326 |
327 | def fprop(self, x):
328 | return tf.reshape(x, [-1, self.output_width])
329 |
330 |
331 | def make_wresnet(nb_classes=10, input_shape=(None, 32, 32, 3), scope=None, filters=None):
332 | layers = [Input(),
333 | Conv2D(filters=filters), # the whole ResNet is basically created in this layer
334 | Flatten(),
335 | Linear(nb_classes),
336 | Softmax()]
337 |
338 | model = ResNet(layers, input_shape, scope)
339 | return model
340 |
341 |
342 | class MadryMNIST(Model):
343 |
344 | def __init__(self, nb_classes=10):
345 | # NOTE: for compatibility with Madry Lab downloadable checkpoints,
346 | # we cannot use scopes, give these variables names, etc.
347 |
348 | """
349 | self.conv1 = tf.layers.Conv2D(32, (5, 5), activation='relu', padding='same', name='conv1')
350 | self.pool1 = tf.layers.MaxPooling2D((2, 2), (2, 2), padding='same')
351 | self.conv2 = tf.layers.Conv2D(64, (5, 5), activation='relu', padding='same', name='conv2')
352 | self.pool2 = tf.layers.MaxPooling2D((2, 2), (2, 2), padding='same')
353 | self.fc1 = tf.layers.Dense(1024, activation='relu', name='fc1')
354 | self.fc2 = tf.layers.Dense(10, name='fc2')
355 | """
356 |
357 | keras_model = tf.keras.Sequential()
358 | keras_model.add(tf.keras.layers.Conv2D(32, (5, 5), activation='relu', padding='same', name='conv1',
359 | input_shape=(28, 28, 1)))
360 | keras_model.add(tf.keras.layers.MaxPooling2D((2, 2), (2, 2), padding='same'))
361 | keras_model.add(tf.keras.layers.Conv2D(64, (5, 5), activation='relu', padding='same', name='conv2'))
362 | keras_model.add(tf.keras.layers.MaxPooling2D((2, 2), (2, 2), padding='same'))
363 | keras_model.add(tf.keras.layers.Flatten())
364 | keras_model.add(tf.keras.layers.Dense(1024, activation='relu', name='fc1'))
365 | keras_model.add(tf.keras.layers.Dense(10, name='fc2'))
366 |
367 | self.keras_model = keras_model
368 | Model.__init__(self, '', nb_classes, {})
369 | self.dataset_factory = Factory(MNIST, {"center": False})
370 |
371 | def fprop(self, x):
372 |
373 | output = OrderedDict()
374 | logits = self.keras_model(x)
375 |
376 | output = deterministic_dict(locals())
377 | del output["self"]
378 | output[self.O_PROBS] = tf.nn.softmax(logits=logits)
379 |
380 | return output
381 |
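# Usage sketch (mirrors get_model() in eval_ch.py; the scope "a" matches the
# checkpoint-restoring hack in get_saver() there):
#
#   cifar_model = make_wresnet(scope="a", filters=[16, 160, 320, 640])
#   mnist_model = MadryMNIST()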
--------------------------------------------------------------------------------
/config.json:
--------------------------------------------------------------------------------
1 | {
2 | "_comment": "===== MODEL CONFIGURATION =====",
3 | "data": "mnist",
4 | "model_type": "cnn",
5 |
6 | "_comment": "===== TRAINING CONFIGURATION =====",
7 | "random_seed": 0,
8 | "max_num_training_steps": 6000,
9 | "num_output_steps": 600,
10 | "num_summary_steps": 600,
11 | "num_checkpoint_steps": 600,
12 | "training_batch_size": 100,
13 | "step_size_schedule": [[0, 0.001], [3000, 0.0001]],
14 |
15 | "_comment": "===== EVAL CONFIGURATION =====",
16 | "num_eval_examples": 100,
17 | "eval_batch_size": 100,
18 | "eval_on_cpu": false,
19 |
20 | "_comment": "=====ADVERSARIAL EXAMPLES CONFIGURATION=====",
21 |
22 | "_comment": "One of: '', 'MAX'",
23 | "multi_attack_mode": "MAX",
24 | "start_small": true,
25 | "attacks": [
26 | {"type": "linf", "epsilon": 0.3, "k": 10, "a": 0.01, "random_start": true},
27 | {"type": "l2", "epsilon": 2, "k": 10, "a": 0.1, "random_start": true},
28 | {"type": "l1", "epsilon": 10, "k": 10, "random_start": true, "perc": 99, "a": 1.0}
29 | ],
30 | "train_attacks": [0, 1, 2],
31 | "eval_attacks": [0, 1, 2]
32 | }
33 |
--------------------------------------------------------------------------------
/config_cifar10.json:
--------------------------------------------------------------------------------
1 | {
2 | "_comment": "===== MODEL CONFIGURATION =====",
3 | "data": "cifar10",
4 | "data_path": "cifar10_data",
5 | "model_type": "cnn",
6 | "filters": [16, 160, 320, 640],
7 |
8 | "_comment": "===== TRAINING CONFIGURATION =====",
9 | "random_seed": 0,
10 | "max_num_training_steps": 80000,
11 | "num_output_steps": 200,
12 | "num_summary_steps": 5000,
13 | "num_checkpoint_steps": 5000,
14 | "training_batch_size": 128,
15 | "step_size_schedule": [[0, 0.1], [40000, 0.01], [60000, 0.001]],
16 | "weight_decay": 0.0002,
17 | "momentum": 0.9,
18 |
19 | "_comment": "===== EVAL CONFIGURATION =====",
20 | "num_eval_examples": 1000,
21 | "eval_batch_size": 100,
22 | "eval_on_cpu": false,
23 |
24 | "_comment": "=====ADVERSARIAL EXAMPLES CONFIGURATION=====",
25 | "_comment": "One of: '', 'ALTERNATE', 'MAX', 'HALF_BATCH_HALF_LR'",
26 | "multi_attack_mode": "MAX",
27 | "attacks": [
28 | {"type": "linf", "epsilon": 8.0, "k": 10, "random_start": true},
29 | {"type": "l2", "epsilon": 80, "k": 40, "random_start": true},
30 | {"type": "l1", "epsilon": 2000, "k": 100, "random_start": true, "perc": 99, "a": 2.0},
31 | {"type": "RT", "spatial_limits": [3, 3, 30], "grid_granularity": [5, 5, 31], "random_tries": 10},
32 | {"type": "RT", "spatial_limits": [3, 3, 30], "grid_granularity": [5, 5, 31], "random_tries": -1}
33 | ],
34 | "train_attacks": [0],
35 | "eval_attacks": [0]
36 | }
37 |
--------------------------------------------------------------------------------
/eval.py:
--------------------------------------------------------------------------------
1 | """
2 | Infinite evaluation loop going through the checkpoints in the model directory
3 | as they appear and evaluating them. Accuracy and average loss are printed and
4 | added as tensorboard summaries.
5 | """
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import matplotlib
11 |
12 | matplotlib.use('Agg')
13 | import matplotlib.pyplot as plt
14 | import json
15 | import math
16 | import os
17 |
18 | import numpy as np
19 | import tensorflow as tf
20 | import argparse
21 |
22 | from pgd_attack import PGDAttack, compute_grad
23 |
24 | rows = cols = 8
25 |
26 |
27 | def show_images(images, cols=1, figpath="figure.png"):
28 |     """Display a list of images in a single figure with matplotlib.
29 | 
30 |     Parameters
31 |     ----------
32 |     images: List of np.arrays compatible with plt.imshow.
33 | 
34 |     cols (Default = 1): Number of columns in the figure (the number of
35 |     rows is set to np.ceil(n_images/float(cols))).
36 | 
37 |     figpath (Default = "figure.png"): Path where the resulting figure
38 |     is saved.
39 |     """
40 | n_images = len(images)
41 | fig = plt.figure()
42 | for n, image in enumerate(images):
43 | a = fig.add_subplot(cols, np.ceil(n_images / float(cols)), n + 1)
44 | if image.ndim == 2:
45 | plt.gray()
46 | if np.max(image) > 1.0:
47 | image = image.astype(np.uint8)
48 |
49 | plt.imshow(image)
50 | plt.savefig(figpath)
51 | plt.close()
52 |
53 |
54 | # A function for evaluating a single checkpoint
55 | def evaluate(model, eval_attacks, sess, config, plot=False, summary_writer=None, eval_train=False, eval_validation=False, verbose=True):
56 | num_eval_examples = config['num_eval_examples']
57 | eval_batch_size = config['eval_batch_size']
58 |
59 | dataset = config["data"]
60 | assert dataset in ["mnist", "cifar10"]
61 |
62 | if dataset == "mnist":
63 | from tensorflow.examples.tutorials.mnist import input_data
64 | mnist = input_data.read_data_sets('MNIST_data', one_hot=False)
65 | if "model_type" in config and config["model_type"] == "linear":
66 | x_train = mnist.train.images
67 | y_train = mnist.train.labels
68 | x_test = mnist.test.images
69 | y_test = mnist.test.labels
70 |
71 | pos_train = (y_train == 5) | (y_train == 7)
72 | x_train = x_train[pos_train]
73 | y_train = y_train[pos_train]
74 | y_train = (y_train == 5).astype(np.int64)
75 | pos_test = (y_test == 5) | (y_test == 7)
76 | x_test = x_test[pos_test]
77 | y_test = y_test[pos_test]
78 | y_test = (y_test == 5).astype(np.int64)
79 |
80 | from tensorflow.contrib.learn.python.learn.datasets.mnist import DataSet
81 | from tensorflow.contrib.learn.python.learn.datasets import base
82 |
83 | options = dict(dtype=tf.uint8, reshape=False, seed=None)
84 | train = DataSet(x_train, y_train, **options)
85 | test = DataSet(x_test, y_test, **options)
86 |
87 | mnist = base.Datasets(train=train, validation=None, test=test)
88 | else:
89 | import cifar10_input
90 | data_path = config["data_path"]
91 | cifar = cifar10_input.CIFAR10Data(data_path)
92 |
93 | np.random.seed(0)
94 | tf.random.set_random_seed(0)
95 | global_step = tf.contrib.framework.get_or_create_global_step()
96 |
97 | # Iterate over the samples batch-by-batch
98 | num_batches = int(math.ceil(num_eval_examples / eval_batch_size))
99 | total_xent_nat = 0.
100 | total_xent_advs = np.zeros(len(eval_attacks), dtype=np.float32)
101 | total_corr_nat = 0.
102 | total_corr_advs = [[] for _ in range(len(eval_attacks))]
103 |
104 | l1_norms = [[] for _ in range(len(eval_attacks))]
105 | l2_norms = [[] for _ in range(len(eval_attacks))]
106 | linf_norms = [[] for _ in range(len(eval_attacks))]
107 |
108 | for ibatch in range(num_batches):
109 | bstart = ibatch * eval_batch_size
110 | bend = min(bstart + eval_batch_size, num_eval_examples)
111 |
112 | if eval_train:
113 | if dataset == "mnist":
114 | x_batch = mnist.train.images[bstart:bend, :].reshape(-1, 28, 28, 1)
115 | y_batch = mnist.train.labels[bstart:bend]
116 | else:
117 | x_batch = cifar.train_data.xs[bstart:bend, :].astype(np.float32)
118 | y_batch = cifar.train_data.ys[bstart:bend]
119 | elif eval_validation:
120 | assert dataset == "cifar10"
121 | offset = len(cifar.eval_data.ys) - num_eval_examples
122 | x_batch = cifar.eval_data.xs[offset+bstart:offset+bend, :].astype(np.float32)
123 | y_batch = cifar.eval_data.ys[offset+bstart:offset+bend]
124 |
125 | else:
126 | if dataset == "mnist":
127 | x_batch = mnist.test.images[bstart:bend, :].reshape(-1, 28, 28, 1)
128 | y_batch = mnist.test.labels[bstart:bend]
129 | else:
130 | x_batch = cifar.eval_data.xs[bstart:bend, :].astype(np.float32)
131 | y_batch = cifar.eval_data.ys[bstart:bend]
132 |
133 | noop_trans = np.zeros([len(x_batch), 3])
134 | dict_nat = {model.x_input: x_batch,
135 | model.y_input: y_batch,
136 | model.is_training: False,
137 | model.transform: noop_trans}
138 |
139 | cur_corr_nat, cur_xent_nat = sess.run(
140 | [model.num_correct, model.xent],
141 | feed_dict=dict_nat)
142 |
143 | total_xent_nat += cur_xent_nat
144 | total_corr_nat += cur_corr_nat
145 |
146 | for i, attack in enumerate(eval_attacks):
147 | x_batch_adv, adv_trans = attack.perturb(x_batch, y_batch, sess)
148 |
149 | dict_adv = {model.x_input: x_batch_adv,
150 | model.y_input: y_batch,
151 | model.is_training: False,
152 | model.transform: adv_trans if adv_trans is not None else np.zeros([len(x_batch), 3])}
153 |
154 | cur_corr_adv, cur_xent_adv, cur_corr_pred, cur_adv_images = \
155 | sess.run([model.num_correct, model.xent, model.correct_prediction, model.x_image],
156 | feed_dict=dict_adv)
157 |
158 | total_xent_advs[i] += cur_xent_adv
159 | total_corr_advs[i].extend(cur_corr_pred)
160 |
161 | l1_norms[i].extend(np.sum(np.abs(x_batch_adv - x_batch).reshape(len(x_batch), -1), axis=-1))
162 | l2_norms[i].extend(np.linalg.norm((x_batch_adv - x_batch).reshape(len(x_batch), -1), axis=-1))
163 | linf_norms[i].extend(np.max(np.abs(x_batch_adv - x_batch).reshape(len(x_batch), -1), axis=-1))
164 |
165 | avg_xent_nat = total_xent_nat / num_eval_examples
166 | acc_nat = total_corr_nat / num_eval_examples
167 |
168 | avg_xent_advs = total_xent_advs / num_eval_examples
169 | acc_advs = np.sum(total_corr_advs, axis=-1) / num_eval_examples
170 |
171 | if len(eval_attacks) > 0:
172 | tot_correct = np.bitwise_and.reduce(np.asarray(total_corr_advs), 0)
173 | assert len(tot_correct) == num_eval_examples
174 | any_acc = np.sum(tot_correct) / num_eval_examples
175 |
176 | if verbose:
177 | print('natural: {:.2f}%'.format(100 * acc_nat))
178 | for i, attack in enumerate(eval_attacks):
179 | t = attack.name
180 | print('adversarial ({}):'.format(t))
181 | print('\tacc: {:.2f}% '.format(100 * acc_advs[i]))
182 | print("\tmean(l1)={:.1f}, min(l1)={:.1f}, max(l1)={:.1f}".format(
183 | np.mean(l1_norms[i]), np.min(l1_norms[i]), np.max(l1_norms[i])))
184 | print("\tmean(l2)={:.1f}, min(l2)={:.1f}, max(l2)={:.1f}".format(
185 | np.mean(l2_norms[i]), np.min(l2_norms[i]), np.max(l2_norms[i])))
186 | print("\tmean(linf)={:.1f}, min(linf)={:.1f}, max(linf)={:.1f}".format(
187 | np.mean(linf_norms[i]), np.min(linf_norms[i]), np.max(linf_norms[i])))
188 |
189 | print('avg nat loss: {:.2f}'.format(avg_xent_nat))
190 | for i, attack in enumerate(eval_attacks):
191 | t = attack.name
192 | print('avg adv loss ({}): {:.2f}'.format(t, avg_xent_advs[i]))
193 |
194 | if len(eval_attacks) > 0:
195 | print("any attack: {:.2f}%".format(100 * any_acc))
196 |
197 | if summary_writer:
198 |
199 | values = [
200 | tf.Summary.Value(tag='xent nat', simple_value=avg_xent_nat),
201 | tf.Summary.Value(tag='accuracy nat', simple_value=acc_nat)
202 | ]
203 | if len(eval_attacks) > 0:
204 | values.append(tf.Summary.Value(tag='accuracy adv any', simple_value=any_acc))
205 |
206 | for i, attack in enumerate(eval_attacks):
207 | t = attack.name
208 | adv_values = [
209 | tf.Summary.Value(tag='xent adv eval ({})'.format(t), simple_value=avg_xent_advs[i]),
210 | tf.Summary.Value(tag='xent adv ({})'.format(t), simple_value=avg_xent_advs[i]),
211 | tf.Summary.Value(tag='accuracy adv eval ({})'.format(t), simple_value=acc_advs[i]),
212 | tf.Summary.Value(tag='accuracy adv ({})'.format(t), simple_value=acc_advs[i])
213 | ]
214 | values.extend(adv_values)
215 |
216 | summary = tf.Summary(value=values)
217 | summary_writer.add_summary(summary, global_step.eval(sess))
218 |
219 | return acc_nat, total_corr_advs
220 |
221 |
222 | if __name__ == "__main__":
223 | parser = argparse.ArgumentParser(
224 | description='Eval script options',
225 | formatter_class=argparse.ArgumentDefaultsHelpFormatter)
226 | parser.add_argument('model_dir', type=str,
227 | help='path to model directory')
228 | parser.add_argument('--epoch', type=int, default=None,
229 | help='specific epoch to load (default=latest)')
230 | parser.add_argument('--eval_train', help='evaluate on training set',
231 | action="store_true")
232 | parser.add_argument('--eval_cpu', help='evaluate on CPU',
233 | action="store_true")
234 | args = parser.parse_args()
235 |
236 | if args.eval_cpu:
237 | os.environ['CUDA_VISIBLE_DEVICES'] = ''
238 |
239 | model_dir = args.model_dir
240 |
241 | with open(model_dir + "/config.json") as config_file:
242 | config = json.load(config_file)
243 |
244 | eval_attack_configs = [np.asarray(config["attacks"])[i] for i in config["eval_attacks"]]
245 | print(eval_attack_configs)
246 |
247 | dataset = config["data"]
248 | if dataset == "mnist":
249 | from model import Model
250 | model = Model(config)
251 |
252 | x_min, x_max = 0.0, 1.0
253 | else:
254 | from cifar10_model import Model
255 | model = Model(config)
256 | x_min, x_max = 0.0, 255.0
257 |
258 | grad = compute_grad(model)
259 | eval_attacks = [PGDAttack(model, a_config, x_min, x_max, grad) for a_config in eval_attack_configs]
260 |
261 | global_step = tf.contrib.framework.get_or_create_global_step()
262 |
263 | if not os.path.exists(model_dir):
264 | os.makedirs(model_dir)
265 | eval_dir = os.path.join(model_dir, 'eval')
266 | if not os.path.exists(eval_dir):
267 | os.makedirs(eval_dir)
268 |
269 | saver = tf.train.Saver()
270 |
271 | if args.epoch is not None:
272 | ckpts = tf.train.get_checkpoint_state(model_dir).all_model_checkpoint_paths
273 | ckpt = [c for c in ckpts if c.endswith('checkpoint-{}'.format(args.epoch))]
274 | assert len(ckpt) == 1
275 | cur_checkpoint = ckpt[0]
276 | else:
277 | cur_checkpoint = tf.train.latest_checkpoint(model_dir)
278 | assert cur_checkpoint is not None
279 |
280 | config_tf = tf.ConfigProto()
281 | config_tf.gpu_options.allow_growth = True
282 | config_tf.gpu_options.per_process_gpu_memory_fraction = 1.0
283 |
284 | with tf.Session(config=config_tf) as sess:
285 | # Restore the checkpoint
286 | print('Evaluating checkpoint {}'.format(cur_checkpoint))
287 |
288 | saver.restore(sess, cur_checkpoint)
289 |
290 | evaluate(model, eval_attacks, sess, config, plot=True, eval_train=args.eval_train)
291 |
292 |
--------------------------------------------------------------------------------
/eval_ch.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from __future__ import division
3 | from __future__ import print_function
4 |
5 | import json
6 | import math
7 | import os
8 |
9 | import numpy as np
10 | import tensorflow as tf
11 | import argparse
12 |
13 | from cleverhans.utils import set_log_level
14 | from cleverhans.attacks import ElasticNetMethod, CarliniWagnerL2
15 | from cleverhans.evaluation import batch_eval
16 | import logging
17 |
18 |
19 | def one_hot(a, n_classes):
20 | res = np.zeros((len(a), n_classes), dtype=np.int64)
21 | res[np.arange(len(a)), a] = 1
22 | return res
23 |
24 |
25 | def evaluate_ch(model, config, sess, norm='l1', bound=None, verbose=True):
26 | dataset = config['data']
27 | num_eval_examples = config['num_eval_examples']
28 | eval_batch_size = config['eval_batch_size']
29 |
30 | if dataset == "mnist":
31 | from tensorflow.examples.tutorials.mnist import input_data
32 | mnist = input_data.read_data_sets('MNIST_data', one_hot=False)
33 | X = mnist.test.images[0:num_eval_examples, :].reshape(-1, 28, 28, 1)
34 | Y = mnist.test.labels[0:num_eval_examples]
35 | x_image = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
36 | else:
37 | import cifar10_input
38 | data_path = config["data_path"]
39 | cifar = cifar10_input.CIFAR10Data(data_path)
40 | X = cifar.eval_data.xs[0:num_eval_examples, :].astype(np.float32) / 255.0
41 | Y = cifar.eval_data.ys[0:num_eval_examples]
42 | x_image = tf.placeholder(tf.float32, shape=[None, 32, 32, 3])
43 | assert norm == 'l1'
44 |
45 | if norm=='l2':
46 | attack = CarliniWagnerL2(model, sess)
47 | params = {'batch_size': eval_batch_size, 'binary_search_steps': 9}
48 | else:
49 | attack = ElasticNetMethod(model, sess, clip_min=0.0, clip_max=1.0)
50 | params = {'beta': 1e-2,
51 | 'decision_rule': 'L1',
52 | 'batch_size': eval_batch_size,
53 | 'learning_rate': 1e-2,
54 | 'max_iterations': 1000}
55 |
56 | if verbose:
57 | set_log_level(logging.DEBUG, name="cleverhans")
58 |
59 | y = tf.placeholder(tf.int64, shape=[None, 10])
60 | params['y'] = y
61 | adv_x = attack.generate(x_image, **params)
62 | preds_adv = model.get_predicted_class(adv_x)
63 | preds_nat = model.get_predicted_class(x_image)
64 |
65 | all_preds, all_preds_adv, all_adv_x = batch_eval(
66 | sess, [x_image, y], [preds_nat, preds_adv, adv_x], [X, one_hot(Y, 10)], batch_size=eval_batch_size)
67 |
68 | print('acc nat', np.mean(all_preds == Y))
69 | print('acc adv', np.mean(all_preds_adv == Y))
70 |
71 | if dataset == "cifar10":
72 | X *= 255.0
73 | all_adv_x *= 255.0
74 |
75 | if norm == 'l2':
76 | lps = np.sqrt(np.sum(np.square(all_adv_x - X), axis=(1,2,3)))
77 | else:
78 | lps = np.sum(np.abs(all_adv_x - X), axis=(1,2,3))
79 | print('mean lp: ', np.mean(lps))
80 |     for b in ([bound, bound/2.0, bound/4.0, bound/8.0] if bound is not None else []):  # skip the sweep when --bound is not given
81 | print('lp={}, acc={}'.format(b, np.mean((all_preds_adv == Y) | (lps > b))))
82 |
83 | all_corr_adv = (all_preds_adv == Y)
84 | all_corr_nat = (all_preds == Y)
85 | return all_corr_nat, all_corr_adv, lps
86 |
87 |
88 | def get_model(config):
89 | dataset = config["data"]
90 | if dataset == "mnist":
91 | from cleverhans_models import MadryMNIST
92 | model = MadryMNIST()
93 | else:
94 | from cleverhans_models import make_wresnet
95 | model = make_wresnet(scope="a", filters=config["filters"])
96 |
97 | return model
98 |
99 |
100 | def get_saver(config):
101 | dataset = config["data"]
102 | if dataset == "cifar10":
103 | # nasty hack
104 | gvars = tf.global_variables()
105 | saver = tf.train.Saver({v.name[2:-2]: v for v in gvars if v.name[:2] == "a/"})
106 | else:
107 | saver = tf.train.Saver()
108 | return saver
109 |
110 |
111 | if __name__ == "__main__":
112 | parser = argparse.ArgumentParser(
113 | description='Eval script options',
114 | formatter_class=argparse.ArgumentDefaultsHelpFormatter)
115 | parser.add_argument('model_dir', type=str,
116 | help='path to model directory')
117 | parser.add_argument('--epoch', type=int, default=None,
118 | help='specific epoch to load (default=latest)')
119 | parser.add_argument('--eval_cpu', help='evaluate on CPU',
120 | action="store_true")
121 | parser.add_argument('--norm', help='norm to use', choices=['l1', 'l2'], default='l1')
122 | parser.add_argument('--bound', type=float, help='attack noise bound', default=None)
123 |
124 | args = parser.parse_args()
125 |
126 | if args.eval_cpu:
127 | os.environ['CUDA_VISIBLE_DEVICES'] = ''
128 |
129 | model_dir = args.model_dir
130 |
131 | with open(model_dir + "/config.json") as config_file:
132 | config = json.load(config_file)
133 |
134 | model = get_model(config)
135 | saver = get_saver(config)
136 |
137 | if args.epoch is not None:
138 | ckpts = tf.train.get_checkpoint_state(model_dir).all_model_checkpoint_paths
139 | ckpt = [c for c in ckpts if c.endswith('checkpoint-{}'.format(args.epoch))]
140 | assert len(ckpt) == 1
141 | cur_checkpoint = ckpt[0]
142 | else:
143 | cur_checkpoint = tf.train.latest_checkpoint(model_dir)
144 |
145 | assert cur_checkpoint is not None
146 |
147 | config_tf = tf.ConfigProto()
148 | config_tf.gpu_options.allow_growth = True
149 | config_tf.gpu_options.per_process_gpu_memory_fraction = 0.1
150 |
151 | with tf.Session(config=config_tf) as sess:
152 | # Restore the checkpoint
153 | print('Evaluating checkpoint {}'.format(cur_checkpoint))
154 | saver.restore(sess, cur_checkpoint)
155 |
156 | evaluate_ch(model, config, sess, args.norm, args.bound)
157 |
--------------------------------------------------------------------------------
/eval_fb.py:
--------------------------------------------------------------------------------
1 | from __future__ import absolute_import
2 | from __future__ import division
3 | from __future__ import print_function
4 |
5 | import os
6 | import math
7 | import json
8 | from tqdm import tqdm
9 |
10 | import numpy as np
11 | import tensorflow as tf
12 |
13 | import argparse
14 | import foolbox
15 |
16 |
17 |
18 | def evaluate_fb(model, config, x_min, x_max, norm='l1', bound=None, verbose=True):
19 | fmodel = foolbox.models.TensorFlowModel(model.x_input, model.pre_softmax, (x_min, x_max))
20 |
21 | if norm == 'l2':
22 | attack = foolbox.attacks.BoundaryAttack(fmodel)
23 | else:
24 | attack = foolbox.attacks.PointwiseAttack(fmodel)
25 |
26 | dataset = config["data"]
27 | num_eval_examples = config['num_eval_examples']
28 | eval_batch_size = config['eval_batch_size']
29 |
30 | if dataset == "mnist":
31 | from tensorflow.examples.tutorials.mnist import input_data
32 | mnist = input_data.read_data_sets('MNIST_data', one_hot=False)
33 |
34 | if "model_type" in config and config["model_type"] == "linear":
35 | x_train = mnist.train.images
36 | y_train = mnist.train.labels
37 | x_test = mnist.test.images
38 | y_test = mnist.test.labels
39 |
40 | pos_train = (y_train == 5) | (y_train == 7)
41 | x_train = x_train[pos_train]
42 | y_train = y_train[pos_train]
43 | y_train = (y_train == 5).astype(np.int64)
44 | pos_test = (y_test == 5) | (y_test == 7)
45 | x_test = x_test[pos_test]
46 | y_test = y_test[pos_test]
47 | y_test = (y_test == 5).astype(np.int64)
48 |
49 | from tensorflow.contrib.learn.python.learn.datasets.mnist import DataSet
50 | from tensorflow.contrib.learn.python.learn.datasets import base
51 |
52 | options = dict(dtype=tf.uint8, reshape=False, seed=None)
53 | train = DataSet(x_train, y_train, **options)
54 | test = DataSet(x_test, y_test, **options)
55 |
56 | mnist = base.Datasets(train=train, validation=None, test=test)
57 |
58 | else:
59 | import cifar10_input
60 | data_path = config["data_path"]
61 | cifar = cifar10_input.CIFAR10Data(data_path)
62 |
63 | # Iterate over the samples batch-by-batch
64 | num_batches = int(math.ceil(num_eval_examples / eval_batch_size))
65 | all_corr_nat = []
66 | all_corr_adv = []
67 | lps = []
68 |
69 | num_inconsistencies = 0
70 | num_solved_inconsistencies = 0
71 |
72 | pbar = tqdm(total=num_eval_examples)
73 |
74 | for ibatch in range(num_batches):
75 | bstart = ibatch * eval_batch_size
76 | bend = min(bstart + eval_batch_size, num_eval_examples)
77 |
78 | if dataset == "mnist":
79 | x_batch = mnist.test.images[bstart:bend, :].reshape(-1, 28, 28, 1)
80 | y_batch = mnist.test.labels[bstart:bend]
81 | else:
82 | x_batch = cifar.eval_data.xs[bstart:bend, :].astype(np.float32)
83 | y_batch = cifar.eval_data.ys[bstart:bend]
84 |
85 | adversarials = []
86 | preds_adv = []
87 | for x, y in zip(x_batch, y_batch):
88 |
89 |             for trial in range(1):  # one trial by default; a larger range retries the attack until lp < bound
90 | if norm == "l2":
91 | adversarial = attack(x, y, iterations=5000, max_directions=25)
92 | else:
93 | adversarial = attack(x, y)
94 | failed = False
95 | if adversarial is None:
96 | failed = True
97 | adversarial = x
98 |
99 | pred_adv = y
100 | if not failed:
101 | pred_adv = np.argmax(fmodel.predictions(adversarial))
102 | if pred_adv == y:
103 | num_inconsistencies += 1
104 | if verbose:
105 | print("Inconsistency with l2 {:.3f}!".format(np.sqrt(np.sum(np.square(adversarial - x)))))
106 | new_adversarials = np.asarray([x + a * (adversarial - x) for a in [1.001, 1.005, 1.01, 1.05, 1.1]])
107 | new_preds_adv = np.argmax(fmodel.batch_predictions(new_adversarials), axis=-1)
108 |
109 |                         if (new_preds_adv == y).all():
110 | failed = True
111 | adversarial = x
112 | if verbose:
113 | print("Failed to resolve inconsistency!")
114 | else:
115 |                             adversarial = new_adversarials[np.argmax(new_preds_adv != y)]  # first scaling that flips the label
116 |                             pred_adv = new_preds_adv[np.argmax(new_preds_adv != y)]
117 | num_solved_inconsistencies += 1
118 | if verbose:
119 | print("Solved inconsistency")
120 |
121 | if norm == 'l1':
122 | lp = np.sum(np.abs(adversarial - x))
123 | else:
124 | lp = np.sqrt(np.sum(np.square(adversarial - x)))
125 |
126 | if verbose:
127 | print("trial {}".format(trial), lp, failed)
128 |
129 | if lp < bound:
130 | break
131 | lps.append(lp)
132 | adversarials.append(adversarial)
133 | preds_adv.append(pred_adv)
134 | if not verbose:
135 | pbar.update(n=1)
136 |
137 | preds = np.argmax(fmodel.batch_predictions(x_batch), axis=-1)
138 | all_corr_nat.extend(preds == y_batch)
139 | all_corr_adv.extend(preds_adv == y_batch)
140 |
141 | if verbose:
142 |             corr_adv_np = np.asarray(all_corr_adv)  # copies, so the running lists still support extend()
143 |             corr_nat_np = np.asarray(all_corr_nat)
144 |             lps_np = np.asarray(lps)
145 |             print('acc adv w. bound', np.mean(corr_adv_np | ((lps_np > bound) & corr_nat_np)))
146 |
147 | pbar.close()
148 |
149 | all_corr_adv = np.asarray(all_corr_adv)
150 | all_corr_nat = np.asarray(all_corr_nat)
151 | lps = np.asarray(lps)
152 |
153 | acc_nat = np.mean(all_corr_nat)
154 | acc_adv = np.mean(all_corr_adv)
155 | print('acc_nat', acc_nat)
156 | print('acc_adv', acc_adv)
157 | print('min(lp)={:.2f}, max(lp)={:.2f}, mean(lp)={:.2f}, median(lp)={:.2f}'.format(
158 | np.min(lps), np.max(lps), np.mean(lps), np.median(lps)))
159 | print('acc adv w. bound', np.mean(all_corr_adv | ((lps > bound) & all_corr_nat)))
160 |
161 | print("num_inconsistencies", num_inconsistencies)
162 | print("num_solved_inconsistencies", num_solved_inconsistencies)
163 |
164 | return all_corr_nat, all_corr_adv, lps
165 |
166 |
167 | if __name__ == "__main__":
168 | parser = argparse.ArgumentParser(
169 | description='Eval script options',
170 | formatter_class=argparse.ArgumentDefaultsHelpFormatter)
171 | parser.add_argument('model_dir', type=str,
172 | help='path to model directory')
173 | parser.add_argument('--epoch', type=int, default=None,
174 | help='specific epoch to load (default=latest)')
175 | parser.add_argument('--eval_cpu', help='evaluate on CPU',
176 | action="store_true")
177 | parser.add_argument('--norm', help='norm to use', choices=['l1', 'l2'], default='l1')
178 |     parser.add_argument('--bound', type=float, help='attack noise bound', default=None)
179 |
180 | args = parser.parse_args()
181 |
182 | if args.eval_cpu:
183 | os.environ['CUDA_VISIBLE_DEVICES'] = ''
184 |
185 | model_dir = args.model_dir
186 |
187 | with open(model_dir + "/config.json") as config_file:
188 | config = json.load(config_file)
189 |
190 | dataset = config["data"]
191 | if dataset == "mnist":
192 | from model import Model
193 | model = Model(config)
194 |
195 | x_min, x_max = 0.0, 1.0
196 | else:
197 | from cifar10_model import Model
198 | model = Model(config)
199 | x_min, x_max = 0.0, 255.0
200 |
201 | saver = tf.train.Saver()
202 | if args.epoch is not None:
203 | ckpts = tf.train.get_checkpoint_state(model_dir).all_model_checkpoint_paths
204 | ckpt = [c for c in ckpts if c.endswith('checkpoint-{}'.format(args.epoch))]
205 | assert len(ckpt) == 1
206 | cur_checkpoint = ckpt[0]
207 | else:
208 | cur_checkpoint = tf.train.latest_checkpoint(model_dir)
209 |
210 | assert cur_checkpoint is not None
211 |
212 | config_tf = tf.ConfigProto()
213 | config_tf.gpu_options.allow_growth = True
214 |     # a small GPU memory fraction suffices for both MNIST and CIFAR10
215 |     config_tf.gpu_options.per_process_gpu_memory_fraction = 0.1
216 |
217 |
218 |
219 | with tf.Session(config=config_tf) as sess:
220 | # Restore the checkpoint
221 | print('Evaluating checkpoint {}'.format(cur_checkpoint))
222 | saver.restore(sess, cur_checkpoint)
223 |
224 | evaluate_fb(model, config, x_min, x_max, args.norm, args.bound)
225 |
226 |
--------------------------------------------------------------------------------
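
The inconsistency handling in `evaluate_fb` deals with a boundary artifact: decision-based attacks can return a point so close to the decision boundary that the model still predicts the original label. The loop above then scales the perturbation slightly past the boundary and keeps the first scaling factor that flips the label. A toy, NumPy-only illustration of that logic (the threshold classifier is a stand-in, not the repo's model):

```python
import numpy as np

def predict(x):
    # stand-in classifier: label 1 iff the mean pixel exceeds 0.5
    return int(np.mean(x) > 0.5)

x = np.full(16, 0.4)             # clean input, label 0
adversarial = np.full(16, 0.5)   # attack output sitting exactly on the boundary
y = predict(x)

if predict(adversarial) == y:    # inconsistency: the "adversarial" point is not adversarial
    for a in [1.001, 1.005, 1.01, 1.05, 1.1]:
        candidate = x + a * (adversarial - x)
        if predict(candidate) != y:
            adversarial = candidate  # smallest scaling that flips the label
            break

print(predict(adversarial))  # 1: the inconsistency is resolved
```
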
/model.py:
--------------------------------------------------------------------------------
1 | """
2 | The model is adapted from the tensorflow tutorial:
3 | https://www.tensorflow.org/get_started/mnist/pros
4 | """
5 | from __future__ import absolute_import
6 | from __future__ import division
7 | from __future__ import print_function
8 |
9 | import tensorflow as tf
10 | import numpy as np
11 |
12 |
13 | class Model(object):
14 | def __init__(self, config):
15 | assert config["model_type"] in ["cnn", "linear"]
16 | self.is_training = tf.placeholder(tf.bool)
17 |         self.x_input = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
18 |         self.y_input = tf.placeholder(tf.int64, shape=[None])
19 |
20 | self.transform = tf.placeholder_with_default(tf.zeros((tf.shape(self.x_input)[0], 3)), shape=[None, 3])
21 | trans_x, trans_y, rot = tf.unstack(self.transform, axis=1)
22 | rot *= np.pi / 180 # convert degrees to radians
23 |
24 | x = self.x_input
25 |
26 |         # rotate and translate the image
27 | ones = tf.ones(shape=tf.shape(trans_x))
28 | zeros = tf.zeros(shape=tf.shape(trans_x))
29 | trans = tf.stack([ones, zeros, -trans_x,
30 | zeros, ones, -trans_y,
31 | zeros, zeros], axis=1)
32 | x = tf.contrib.image.rotate(x, rot, interpolation='BILINEAR')
33 | x = tf.contrib.image.transform(x, trans, interpolation='BILINEAR')
34 | self.x_image = x
35 |
36 | ch = 1
37 |
38 | if config["model_type"] == "cnn":
39 | x.set_shape((None, 28, 28, 1))
40 | x = tf.layers.conv2d(x, 32, (5, 5), activation='relu', padding='same', name='conv1')
41 | x = tf.layers.max_pooling2d(x, (2, 2), (2, 2), padding='same')
42 | x = tf.layers.conv2d(x, 64, (5, 5), activation='relu', padding='same', name='conv2')
43 | x = tf.layers.max_pooling2d(x, (2, 2), (2, 2), padding='same')
44 |
45 | x = tf.layers.flatten(x)
46 | #x = tf.layers.flatten(tf.transpose(x, (0, 3, 1, 2)))
47 | x = tf.layers.dense(x, 1024, activation='relu', name='fc1')
48 | self.pre_softmax = tf.layers.dense(x, 10, name='fc2')
49 | else:
50 | W_fc = self._weight_variable([784*ch, 2])
51 | b_fc = self._bias_variable([2])
52 | self.W = W_fc
53 | self.b = b_fc
54 | x_flat = tf.reshape(x, [-1, 784*ch])
55 | self.pre_softmax = tf.matmul(x_flat, W_fc) + b_fc
56 |
57 | self.y_xent = tf.nn.sparse_softmax_cross_entropy_with_logits(
58 | labels=self.y_input, logits=self.pre_softmax)
59 |
60 | self.xent = tf.reduce_sum(self.y_xent)
61 | self.mean_xent = tf.reduce_mean(self.y_xent)
62 |
63 | self.y_pred = tf.argmax(self.pre_softmax, 1)
64 |
65 | self.correct_prediction = tf.equal(self.y_pred, self.y_input)
66 |
67 | self.num_correct = tf.reduce_sum(tf.cast(self.correct_prediction, tf.int64))
68 | self.accuracy = tf.reduce_mean(tf.cast(self.correct_prediction, tf.float32))
69 |
70 | @staticmethod
71 | def _weight_variable(shape):
72 | initial = tf.truncated_normal(shape, stddev=0.1)
73 | return tf.Variable(initial)
74 |
75 | @staticmethod
76 | def _bias_variable(shape):
77 |         initial = tf.constant(0.1, shape=shape)
78 | return tf.Variable(initial)
79 |
80 | @staticmethod
81 | def _conv2d(x, W):
82 | return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding='SAME')
83 |
84 | @staticmethod
85 |     def _max_pool_2x2(x):
86 |         return tf.nn.max_pool(x,
87 |                               ksize=[1, 2, 2, 1],
88 |                               strides=[1, 2, 2, 1],
89 |                               padding='SAME')
90 |
--------------------------------------------------------------------------------
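
The translation in `Model.__init__` is expressed as the 8-parameter projective transform that `tf.contrib.image.transform` expects: a vector `[a0, a1, a2, b0, b1, b2, c0, c1]` that maps an output pixel `(x, y)` to the input pixel `((a0*x + a1*y + a2)/k, (b0*x + b1*y + b2)/k)` with `k = c0*x + c1*y + 1` (the TF 1.x convention, as I read the docs). A small NumPy sketch of how the stacked vector `[1, 0, -trans_x, 0, 1, -trans_y, 0, 0]` acts on coordinates:

```python
import numpy as np

def apply_transform(params, x, y):
    a0, a1, a2, b0, b1, b2, c0, c1 = params
    k = c0 * x + c1 * y + 1.0
    return (a0 * x + a1 * y + a2) / k, (b0 * x + b1 * y + b2) / k

trans_x, trans_y = 3.0, -2.0
# identity rotation plus a translation, as built in Model.__init__
params = [1.0, 0.0, -trans_x, 0.0, 1.0, -trans_y, 0.0, 0.0]

# the output pixel (10, 10) is sampled from input pixel (7, 12),
# i.e. the image content moves by (+trans_x, +trans_y)
print(apply_transform(params, 10.0, 10.0))  # (7.0, 12.0)
```
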
/pgd_attack.py:
--------------------------------------------------------------------------------
1 | """
2 | Implementation of attack methods. Running this file as a program will
3 | apply the attack to the model specified by the config file and store
4 | the examples in an .npy file.
5 | """
6 | from __future__ import absolute_import
7 | from __future__ import division
8 | from __future__ import print_function
9 |
10 | import tensorflow as tf
11 | import numpy as np
12 | from itertools import product
13 | from collections import Counter
14 | import json
15 |
16 |
17 | def uniform_weights(n_attacks, n_samples):
18 | x = np.random.uniform(size=(n_attacks, n_samples))
19 |     y = np.maximum(-np.log(x), 1e-8)  # exponential samples via inverse CDF, bounded away from zero
20 |     return y / np.sum(y, axis=0, keepdims=True)  # normalized: uniform weights on the simplex (Dirichlet(1))
21 |
22 |
23 | def init_delta(x, attack, weight):
24 | if not attack["random_start"]:
25 | return np.zeros_like(x)
26 |
27 | assert len(weight) == len(x)
28 | eps = (attack["epsilon"] * weight).reshape(len(x), 1, 1, 1)
29 |
30 | if attack["type"] == "linf":
31 | return np.random.uniform(-eps, eps, x.shape)
32 | elif attack["type"] == "l2":
33 | r = np.random.randn(*x.shape)
34 | norm = np.linalg.norm(r.reshape(r.shape[0], -1), axis=-1).reshape(-1, 1, 1, 1)
35 | return (r / norm) * eps
36 | elif attack["type"] == "l1":
37 | r = np.random.laplace(size=x.shape)
38 | norm = np.linalg.norm(r.reshape(r.shape[0], -1), axis=-1, ord=1).reshape(-1, 1, 1, 1)
39 | return (r / norm) * eps
40 | else:
41 | raise ValueError("Unknown norm {}".format(attack["type"]))
42 |
43 |
44 | def delta_update(old_delta, g, x_adv, attack, x_min, x_max, weight, seed=None, t=None):
45 | assert len(weight) == len(x_adv)
46 |
47 | eps_w = attack["epsilon"] * weight
48 | eps = eps_w.reshape(len(x_adv), 1, 1, 1)
49 |
50 | if attack["type"] == "linf":
51 | a = attack.get('a', (2.5 * eps) / attack["k"])
52 | new_delta = old_delta + a * np.sign(g)
53 | new_delta = np.clip(new_delta, -eps, eps)
54 |
55 | new_delta = np.clip(new_delta, x_min - (x_adv - old_delta), x_max - (x_adv - old_delta))
56 | return new_delta
57 |
58 | elif attack["type"] == "l2":
59 | a = attack.get('a', (2.5 * eps) / attack["k"])
60 | bad_pos = ((x_adv == x_max) & (g > 0)) | ((x_adv == x_min) & (g < 0))
61 | g[bad_pos] = 0
62 |
63 | g = g.reshape(len(g), -1)
64 | g /= np.maximum(np.linalg.norm(g, axis=-1, keepdims=True), 1e-8)
65 | g = g.reshape(old_delta.shape)
66 |
67 | new_delta = old_delta + a * g
68 | new_delta_norm = np.linalg.norm(new_delta.reshape(len(new_delta), -1), axis=-1).reshape(-1, 1, 1, 1)
69 | new_delta = new_delta / np.maximum(new_delta_norm, 1e-8) * np.minimum(new_delta_norm, eps)
70 | new_delta = np.clip(new_delta, x_min - (x_adv - old_delta), x_max - (x_adv - old_delta))
71 | return new_delta
72 |
73 | elif attack["type"] == "l1":
74 | _, h, w, ch = g.shape
75 |
76 | a = attack.get('a', 1.0) * x_max
77 | perc = attack.get('perc', 99)
78 |
79 | if perc == 'max':
80 | bad_pos = ((x_adv > (x_max - a)) & (g > 0)) | ((x_adv < a) & (g < 0)) | (x_adv < x_min) | (x_adv > x_max)
81 | g[bad_pos] = 0
82 | else:
83 | bad_pos = ((x_adv == x_max) & (g > 0)) | ((x_adv == x_min) & (g < 0))
84 | g[bad_pos] = 0
85 |
86 | abs_grad = np.abs(g)
87 | sign = np.sign(g)
88 |
89 | if perc == 'max':
90 | grad_flat = abs_grad.reshape(len(abs_grad), -1)
91 | max_abs_grad = np.argmax(grad_flat, axis=-1)
92 | optimal_perturbation = np.zeros_like(grad_flat)
93 | optimal_perturbation[np.arange(len(grad_flat)), max_abs_grad] = 1.0
94 | optimal_perturbation = sign * optimal_perturbation.reshape(abs_grad.shape)
95 | else:
96 | if isinstance(perc, list):
97 | perc_low, perc_high = perc
98 | perc = np.random.RandomState(seed).uniform(low=perc_low, high=perc_high)
99 |
100 | max_abs_grad = np.percentile(abs_grad, perc, axis=(1, 2, 3), keepdims=True)
101 | tied_for_max = (abs_grad >= max_abs_grad).astype(np.float32)
102 | num_ties = np.sum(tied_for_max, (1, 2, 3), keepdims=True)
103 | optimal_perturbation = sign * tied_for_max / num_ties
104 |
105 | new_delta = old_delta + a * optimal_perturbation
106 |
107 | l1 = np.sum(np.abs(new_delta), axis=(1, 2, 3))
108 | to_project = l1 > eps_w
109 |         if np.any(to_project):  # project the offending rows back onto the l1 ball (Duchi et al., 2008)
110 | n = np.sum(to_project)
111 | d = new_delta[to_project].reshape(n, -1) # n * N (N=h*w*ch)
112 | abs_d = np.abs(d) # n * N
113 | mu = -np.sort(-abs_d, axis=-1) # n * N
114 | cumsums = mu.cumsum(axis=-1) # n * N
115 | eps_d = eps_w[to_project]
116 | js = 1.0 / np.arange(1, h * w * ch + 1)
117 | temp = mu - js * (cumsums - np.expand_dims(eps_d, -1))
118 |             rho = np.sum(temp > 0, axis=-1) - 1  # index of the last coordinate satisfying the threshold condition
119 | theta = 1.0 / (1 + rho) * (cumsums[range(n), rho] - eps_d)
120 | sgn = np.sign(d)
121 | d = sgn * np.maximum(abs_d - np.expand_dims(theta, -1), 0)
122 | new_delta[to_project] = d.reshape(-1, h, w, ch)
123 |
124 | new_delta = np.clip(new_delta, x_min - (x_adv - old_delta), x_max - (x_adv - old_delta))
125 | return new_delta
126 |
127 |
128 | def compute_grad(model):
129 | label_mask = tf.one_hot(model.y_input,
130 | model.pre_softmax.get_shape().as_list()[-1],
131 | on_value=1.0,
132 | off_value=0.0,
133 | dtype=tf.float32)
134 | correct_logit = tf.reduce_sum(label_mask * model.pre_softmax, axis=1)
135 | wrong_logit = tf.reduce_max((1 - label_mask) * model.pre_softmax - 1e4 * label_mask, axis=1)
136 | loss = -(correct_logit - wrong_logit)
137 | return tf.gradients(loss, model.x_input)[0]
138 |
139 |
140 | def name(attack):
141 | return json.dumps(attack)
142 |
143 |
144 | class PGDAttack:
145 | def __init__(self, model, attack_config, x_min, x_max, grad, reps=1):
146 | """Attack parameter initialization. The attack performs k steps of
147 | size a, while always staying within epsilon from the initial
148 | point."""
149 | print("new attack: ", attack_config)
150 | if isinstance(attack_config, dict):
151 | attack_config = [attack_config]
152 |
153 | self.model = model
154 | self.x_min = x_min
155 | self.x_max = x_max
156 | self.attack_config = attack_config
157 | self.names = [name(a) for a in attack_config]
158 | self.name = " - ".join(self.names)
159 | self.grad = grad
160 | self.reps = int(attack_config[0].get("reps", 1))
161 | assert self.reps >= 1
162 |
163 | def perturb(self, x_nat, y, sess, x_nat_no_aug=None):
164 |
165 | if len(self.attack_config) == 0:
166 | return x_nat, None
167 |
168 | if x_nat_no_aug is None:
169 | x_nat_no_aug = x_nat
170 |
171 | n = len(x_nat)
172 | worst_x = np.copy(x_nat)
173 | worst_t = np.zeros([n, 3])
174 | max_xent = np.zeros(n)
175 | all_correct = np.ones(n).astype(bool)
176 |
177 | for i in range(self.reps):
178 | if "weight" in self.attack_config[0]:
179 | weights = np.asarray([a["weight"] for a in self.attack_config])
180 | weights = np.repeat(weights[:, np.newaxis], len(x_nat), axis=-1)
181 | else:
182 | weights = uniform_weights(len(self.attack_config), len(x_nat))
183 |
184 | if self.attack_config[0]["type"] == "RT":
185 | assert np.all([a["type"] != "RT" for a in self.attack_config[1:]])
186 | norm_attacks = self.attack_config[1:]
187 | norm_weights = weights[1:]
188 | x_adv, trans = self.grid_perturb(x_nat_no_aug, y, sess, self.attack_config[0],
189 | weights[0], norm_attacks, norm_weights)
190 | else:
191 | # rotation and translation attack should always come first
192 | assert np.all([a["type"] != "RT" for a in self.attack_config])
193 | norm_attacks = self.attack_config
194 | x_adv = self.norm_perturb(x_nat, y, sess, norm_attacks, weights)
195 | trans = worst_t
196 |
197 | cur_xent, cur_correct = sess.run([self.model.y_xent, self.model.correct_prediction],
198 | feed_dict={self.model.x_input: x_adv,
199 | self.model.y_input: y,
200 | self.model.is_training: False,
201 | self.model.transform: trans})
202 | cur_xent = np.asarray(cur_xent)
203 | cur_correct = np.asarray(cur_correct)
204 |
205 | idx = (cur_xent > max_xent) & (cur_correct == all_correct)
206 | idx = idx | (cur_correct < all_correct)
207 | max_xent = np.maximum(cur_xent, max_xent)
208 | all_correct = cur_correct & all_correct
209 |
210 | idx = np.expand_dims(idx, axis=-1) # shape (bsize, 1)
211 | worst_t = np.where(idx, trans, worst_t) # shape (bsize, 3)
212 |
213 | idx = np.expand_dims(idx, axis=-1)
214 | idx = np.expand_dims(idx, axis=-1) # shape (bsize, 1, 1, 1)
215 |             worst_x = np.where(idx, x_adv, worst_x)  # shape (bsize, h, w, ch)
216 |
217 | return worst_x, worst_t
218 |
219 | def grid_perturb(self, x_nat, y, sess, attack_config, weight, norm_attacks, norm_weights):
220 | random_tries = attack_config["random_tries"]
221 | n = len(x_nat)
222 |
223 | assert len(weight) == len(x_nat)
224 | # (3, 1) * n => (3, n)
225 | spatial_limits = np.asarray(attack_config["spatial_limits"])[:, np.newaxis] * weight
226 |
227 | if random_tries > 0:
228 |             grids = np.zeros((n, random_tries))  # placeholder; only its width (random_tries) drives the loop below
229 | else:
230 | # exhaustive grid
231 | # n * (num_x * num_y * num_rot)
232 | grids = [list(product(*list(np.linspace(-l, l, num=g)
233 | for l, g in zip(spatial_limits[:, i], attack_config["grid_granularity"]))))
234 | for i in range(len(x_nat))]
235 | grids = np.asarray(grids)
236 |
237 | worst_x = np.copy(x_nat)
238 | worst_t = np.zeros([n, 3])
239 | max_xent = np.zeros(n)
240 | all_correct = np.ones(n).astype(bool)
241 |
242 | for idx in range(len(grids[0])):
243 | if random_tries > 0:
244 | t = [[np.random.uniform(-l, l) for l in spatial_limits[:, i]] for i in range(len(x_nat))]
245 | else:
246 | t = grids[:, idx]
247 |
248 | x = self.norm_perturb(x_nat, y, sess, norm_attacks, norm_weights, trans=t)
249 |
250 | curr_dict = {self.model.x_input: x,
251 | self.model.y_input: y,
252 | self.model.is_training: False,
253 | self.model.transform: t}
254 |
255 | cur_xent, cur_correct = sess.run([self.model.y_xent,
256 | self.model.correct_prediction],
257 | feed_dict=curr_dict) # shape (bsize,)
258 | cur_xent = np.asarray(cur_xent)
259 | cur_correct = np.asarray(cur_correct)
260 |
261 | # Select indices to update: we choose the misclassified transformation
262 |             # of maximum xent (or simply the highest xent if all transformations remain correctly classified).
263 | idx = (cur_xent > max_xent) & (cur_correct == all_correct)
264 | idx = idx | (cur_correct < all_correct)
265 | max_xent = np.maximum(cur_xent, max_xent)
266 | all_correct = cur_correct & all_correct
267 |
268 | idx = np.expand_dims(idx, axis=-1) # shape (bsize, 1)
269 | worst_t = np.where(idx, t, worst_t) # shape (bsize, 3)
270 |
271 | idx = np.expand_dims(idx, axis=-1)
272 | idx = np.expand_dims(idx, axis=-1) # shape (bsize, 1, 1, 1)
273 |             worst_x = np.where(idx, x, worst_x)  # shape (bsize, h, w, ch)
274 |
275 | return worst_x, worst_t
276 |
277 | def norm_perturb(self, x_nat, y, sess, norm_attacks, norm_weights, trans=None):
278 | if len(norm_attacks) == 0:
279 | return x_nat
280 |
281 | x_min = self.x_min
282 | x_max = self.x_max
283 |
284 | if trans is None:
285 | trans = np.zeros([len(x_nat), 3])
286 |
287 | iters = [a["k"] for a in norm_attacks]
288 |         assert np.all(np.asarray(iters) == iters[0])
289 |
290 | deltas = np.asarray([init_delta(x_nat, attack, weight)
291 | for attack, weight in zip(norm_attacks, norm_weights)])
292 |         x_adv = np.clip(x_nat + np.sum(deltas, axis=0), x_min, x_max)
293 |
294 |
295 | # a seed that remains constant across attack iterations
296 | seed = np.random.randint(low=0, high=2**32-1)
297 |
298 | for i in range(np.sum(iters)):
299 | grad = sess.run(self.grad, feed_dict={self.model.x_input: x_adv,
300 | self.model.y_input: y,
301 | self.model.is_training: False,
302 | self.model.transform: trans})
303 |
304 | deltas[i % len(norm_attacks)] = delta_update(deltas[i % len(norm_attacks)],
305 | grad,
306 | x_adv,
307 | norm_attacks[i % len(norm_attacks)],
308 | x_min, x_max,
309 | norm_weights[i % len(norm_attacks)],
310 | seed=seed, t=i+1)
311 |
312 | x_adv = np.clip(x_nat + np.sum(deltas, axis=0), x_min, x_max)
313 |
314 |         return x_adv
315 |
--------------------------------------------------------------------------------
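
The densest part of `delta_update` is the `l1` branch: after each step, any perturbation whose l1 norm exceeds its budget is projected back onto the l1 ball with the sort-based simplex-projection algorithm of Duchi et al. (2008). Here is a self-contained single-vector version of that projection (my own sketch, written to match the per-row computation above):

```python
import numpy as np

def project_l1_ball(v, eps):
    """Project a flat vector v onto the l1 ball of radius eps."""
    if np.sum(np.abs(v)) <= eps:
        return v
    mu = np.sort(np.abs(v))[::-1]                      # magnitudes, descending
    cumsums = np.cumsum(mu)
    js = np.arange(1, len(v) + 1)
    rho = np.sum(mu - (cumsums - eps) / js > 0) - 1    # last index where the condition holds
    theta = (cumsums[rho] - eps) / (rho + 1.0)         # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0)

p = project_l1_ball(np.array([3.0, 1.0, 0.5]), 2.0)
print(p, np.sum(np.abs(p)))  # [2. 0. 0.] 2.0 -- lands exactly on the ball
```
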
/requirements.txt:
--------------------------------------------------------------------------------
1 | absl-py==0.6.1
2 | astor==0.7.1
3 | backports-abc==0.5
4 | backports.functools-lru-cache==1.5
5 | backports.shutil-get-terminal-size==1.0.0
6 | backports.weakref==1.0.post1
7 | bleach==3.1.0
8 | certifi==2018.11.29
9 | chardet==3.0.4
10 | cleverhans==3.0.1
11 | configparser==3.7.1
12 | cycler==0.10.0
13 | Cython==0.29.5
14 | decorator==4.3.2
15 | defusedxml==0.5.0
16 | entrypoints==0.3
17 | enum34==1.1.6
18 | foolbox==1.8.0
19 | funcsigs==1.0.2
20 | functools32==3.2.3.post2
21 | future==0.17.1
22 | futures==3.2.0
23 | gast==0.2.0
24 | gitdb2==2.0.5
25 | GitPython==2.1.11
26 | grpcio==1.17.1
27 | h5py==2.8.0
28 | idna==2.8
29 | ipaddress==1.0.22
30 | ipykernel==4.10.0
31 | ipython==5.8.0
32 | ipython-genutils==0.2.0
33 | ipywidgets==7.4.2
34 | Jinja2==2.10
35 | joblib==0.13.2
36 | jsonschema==2.6.0
37 | jupyter==1.0.0
38 | jupyter-client==5.2.4
39 | jupyter-console==5.2.0
40 | jupyter-core==4.4.0
41 | Keras==2.2.4
42 | Keras-Applications==1.0.6
43 | Keras-Preprocessing==1.0.5
44 | kiwisolver==1.0.1
45 | Markdown==3.0.1
46 | MarkupSafe==1.1.0
47 | matplotlib==2.2.3
48 | mistune==0.8.4
49 | mmdnn==0.2.5
50 | mnist==0.2.2
51 | mock==2.0.0
52 | nbconvert==5.4.1
53 | nbformat==4.4.0
54 | nose==1.3.7
55 | notebook==5.7.8
56 | numpy==1.16.1
57 | pandas==0.24.2
58 | pandocfilters==1.4.2
59 | pathlib2==2.3.3
60 | pbr==5.1.1
61 | pexpect==4.6.0
62 | pickleshare==0.7.5
63 | Pillow==5.4.1
64 | prometheus-client==0.5.0
65 | prompt-toolkit==1.0.15
66 | protobuf==3.6.1
67 | ptyprocess==0.6.0
68 | pycodestyle==2.5.0
69 | Pygments==2.3.1
70 | pyparsing==2.3.0
71 | python-dateutil==2.8.0
72 | pytz==2018.7
73 | PyYAML==5.1
74 | pyzmq==17.1.2
75 | qtconsole==4.4.3
76 | randomgen==1.16.0
77 | requests==2.21.0
78 | scandir==1.9.0
79 | scipy==1.2.0
80 | Send2Trash==1.5.0
81 | simplegeneric==0.8.1
82 | singledispatch==3.4.0.3
83 | six==1.12.0
84 | smmap2==2.0.5
85 | subprocess32==3.5.3
86 | tensorboard==1.12.1
87 | tensorflow-gpu==1.12.0
88 | tensorflow-probability==0.5.0
89 | termcolor==1.1.0
90 | terminado==0.8.1
91 | testpath==0.4.2
92 | torch==1.0.1.post2
93 | torchvision==0.2.1
94 | tornado==5.1.1
95 | tqdm==4.31.1
96 | traitlets==4.3.2
97 | urllib3==1.24.2
98 | wcwidth==0.1.7
99 | webencodings==0.5.1
100 | Werkzeug==0.15.3
101 | widgetsnbextension==3.4.2
102 |
--------------------------------------------------------------------------------
/scripts/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ftramer/MultiRobustness/f51a75e07f06b010f34ee760d80fea05ba8ba785/scripts/__init__.py
--------------------------------------------------------------------------------
/scripts/eval_cifar_lps.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | from pgd_attack import PGDAttack, compute_grad
4 | from cifar10_model import Model
5 | from scripts.utils import get_ckpt
6 | from eval_ch import evaluate_ch, get_model, get_saver
7 | from eval_fb import evaluate_fb
8 | from eval import evaluate
9 | from multiprocessing import Pool
10 | import sys
11 |
12 |
13 | models_slim = [
14 | ]
15 |
16 | models_wide = [
17 | ('path_to_model', -1),
18 | ]
19 |
20 | attack_configs = [
21 | {"type": "linf", "epsilon": 4.0, "k": 100, "random_start": True, "reps": 20},
22 | {"type": "linf", "epsilon": 4.0, "k": 1000, "random_start": True},
23 | {"type": "l1", "epsilon": 2000, "k": 100, "random_start": True, "perc": 99, "a": 2.0, "reps": 20},
24 | {"type": "l1", "epsilon": 2000, "k": 1000, "random_start": True, "perc": 99, "a": 2.0}
25 | ]
26 |
27 | outdir = "cifar_" + str(int(attack_configs[0]["epsilon"]))
28 |
29 | eval_config = {"data": "cifar10",
30 | "data_path": "cifar10_data",
31 | "num_eval_examples": 1000,
32 | "eval_batch_size": 100}
33 |
34 | eval_wide = sys.argv[1] == "wide"
35 |
36 | if eval_wide:
37 | models = models_wide
38 | eval_config["filters"] = [16, 160, 320, 640]
39 | else:
40 | models = models_slim
41 | eval_config["filters"] = [16, 16, 32, 64]
42 |
43 | model = Model(eval_config)
44 | grad = compute_grad(model)
45 | attacks = [PGDAttack(model, a_config, 0.0, 255.0, grad) for a_config in attack_configs]
46 |
47 | saver = tf.train.Saver()
48 | config_tf = tf.ConfigProto()
49 | config_tf.gpu_options.allow_growth = True
50 | config_tf.gpu_options.per_process_gpu_memory_fraction = 1.0
51 |
52 | nat_accs = np.zeros(len(models))
53 | adv_accs = np.zeros((len(models), len(attacks) + 2))
54 |
55 | any_attack = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
56 | any_l1 = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
57 | any_linf = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
58 |
59 | def worker(model_dir_epoch):
60 |     model_dir, epoch = model_dir_epoch  # unpack explicitly (Python 2 tuple parameters are invalid in Python 3)
61 | model_name = model_dir.split('/')[-1] + "_" + str(epoch)
62 | output_file = "results/{}/lps/{}_l1_fb.npy".format(outdir, model_name)
63 |
64 | try:
65 | all_corr_adv1 = np.load(output_file)
66 | return all_corr_adv1
67 |     except IOError:  # no cached result; recompute it
68 |
69 | g = tf.Graph()
70 | with g.as_default():
71 | config_tf = tf.ConfigProto()
72 | config_tf.gpu_options.allow_growth = True
73 | config_tf.gpu_options.per_process_gpu_memory_fraction = 0.2
74 | with tf.Session(graph=g, config=config_tf) as sess:
75 | model = Model(eval_config)
76 |
77 | saver = tf.train.Saver()
78 | ckpt = get_ckpt(model_dir, epoch)
79 | print("loading ", ckpt)
80 | saver.restore(sess, ckpt)
81 |
82 | # FB attacks
83 | print("Foolbox l1 attack")
84 | all_corr_nat1, all_corr_adv1, l1s = evaluate_fb(model, eval_config, 0.0, 255.0, norm='l1', bound=2000, verbose=False)
85 | all_corr_adv1 = all_corr_adv1 | ((l1s > 2000) & all_corr_nat1)
86 |
87 | np.save(output_file, all_corr_adv1)
88 |
89 | return all_corr_adv1
90 |
91 | pool = Pool(max(len(models), 4))
92 | all_models_corr_adv1 = pool.map(worker, models)
93 | pool.close()
94 | pool.join()
95 |
96 | adv_accs[:, len(attacks)] = np.mean(all_models_corr_adv1, axis=-1)
97 | any_attack &= all_models_corr_adv1
98 | any_l1 &= all_models_corr_adv1
99 |
100 | print("DONE with FB!")
101 |
102 | print(nat_accs)
103 | print(adv_accs)
104 | print("any: ", np.mean(any_attack, axis=-1))
105 | print("l1: ", np.mean(any_l1, axis=-1))
106 | print("linf: ", np.mean(any_linf, axis=-1))
107 |
108 | with tf.Session(config=config_tf) as sess:
109 | for m_idx, (model_dir, epoch) in enumerate(models):
110 | ckpt = get_ckpt(model_dir, epoch)
111 | saver.restore(sess, ckpt)
112 |
113 | print("starting...", model_dir, epoch)
114 |
115 | # lp attacks
116 | nat_acc, total_corr_advs = evaluate(model, attacks, sess, eval_config)
117 | nat_accs[m_idx] = nat_acc
118 | adv_acc = np.mean(total_corr_advs, axis=-1)
119 | adv_accs[m_idx, :len(attacks)] = adv_acc
120 | any_attack[m_idx] &= np.bitwise_and.reduce(np.asarray(total_corr_advs), 0)
121 |
122 | print(model_dir, adv_accs[m_idx])
123 | model_name = models[m_idx][0].split('/')[-1] + "_" + str(models[m_idx][1])
124 | for i, attack in enumerate(attacks):
125 | np.save("results/{}/lps/{}_{}.npy".format(outdir, model_name, attack.name), total_corr_advs[i])
126 |
127 | if attack_configs[i]["type"] == "l1":
128 | any_l1[m_idx] &= total_corr_advs[i]
129 | else:
130 | any_linf[m_idx] &= total_corr_advs[i]
131 |
132 | print("DONE with PGD!")
133 | print(nat_accs)
134 | print(adv_accs)
135 | print("any: ", np.mean(any_attack, axis=-1))
136 | print("l1: ", np.mean(any_l1, axis=-1))
137 | print("linf: ", np.mean(any_linf, axis=-1))
138 |
139 | tf.reset_default_graph()
140 |
141 | # Cleverhans attacks
142 | g2 = tf.Graph()
143 | with g2.as_default():
144 | with tf.Session(graph=g2, config=config_tf) as sess2:
145 | model2 = get_model(eval_config)
146 | saver2 = get_saver(eval_config)
147 |
148 | for m_idx, (model_dir, epoch) in enumerate(models):
149 | ckpt = get_ckpt(model_dir, epoch)
150 |             saver2.restore(sess2, ckpt)
151 |
152 | print("starting...", model_dir, epoch)
153 |
154 | print("EAD")
155 | all_corr_nat1, all_corr_adv1, l1s = evaluate_ch(model2, eval_config, sess2, "l1", 2000, verbose=True)
156 | all_corr_adv1 = all_corr_adv1 | ((l1s > 2000) & all_corr_nat1)
157 | adv_accs[m_idx, len(attacks) + 1] = np.mean(all_corr_adv1)
158 | any_attack[m_idx] &= all_corr_adv1
159 |
160 | print(model_dir, adv_accs[m_idx])
161 |
162 | model_name = models[m_idx][0].split('/')[-1] + "_" + str(models[m_idx][1])
163 | any_l1[m_idx] &= all_corr_adv1
164 | np.save("results/{}/lps/{}_l1_ead.npy".format(outdir, model_name), all_corr_adv1)
165 |
166 | print(nat_accs)
167 | print(adv_accs)
168 | print("any: ", np.mean(any_attack, axis=-1))
169 | print("l1: ", np.mean(any_l1, axis=-1))
170 | print("linf: ", np.mean(any_linf, axis=-1))
171 |
172 |
--------------------------------------------------------------------------------
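
The bookkeeping above relies on per-example boolean masks: each attack produces a vector of "still correct after this attack" flags, and robustness against the union of attacks is their elementwise AND (an example only counts as robust if every attack fails on it). A tiny illustration with made-up mask values:

```python
import numpy as np

corr_pgd_linf = np.array([True, True, False, True])  # per-example correctness under PGD-linf
corr_pgd_l1 = np.array([True, False, True, True])    # ... under PGD-l1
corr_ead_l1 = np.array([True, True, True, False])    # ... under EAD

any_attack = corr_pgd_linf & corr_pgd_l1 & corr_ead_l1
print(any_attack)         # [ True False False False]
print(any_attack.mean())  # 0.25: accuracy against the union of all attacks
```
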
/scripts/eval_cifar_spatial.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | from pgd_attack import PGDAttack, compute_grad
4 | from cifar10_model import Model
5 | from scripts.utils import get_ckpt
6 | from eval import evaluate
7 | import sys
8 |
9 |
10 | models_slim = [
11 | ]
12 |
13 | models_wide = [
14 | ('path_to_model', -1),
15 | ]
16 |
17 | attack_configs = [
18 | {"type": "linf", "epsilon": 4.0, "k": 100, "random_start": True, "reps": 20},
19 | {"type": "linf", "epsilon": 4.0, "k": 1000, "random_start": True},
20 | {"type": "RT", "spatial_limits": [3, 3, 30], "grid_granularity": [5, 5, 31], "random_tries": 10},
21 | {"type": "RT", "spatial_limits": [3, 3, 30], "grid_granularity": [5, 5, 31], "random_tries": -1}
22 | ]
23 |
24 | outdir = "cifar_" + str(int(attack_configs[0]["epsilon"]))
25 |
26 | conf_slim = {"filters": [16, 16, 32, 64]}
27 | conf_wide = {"filters": [16, 160, 320, 640]}
28 |
29 | eval_wide = sys.argv[1] == "wide"
30 |
31 | if eval_wide:
32 | models = models_wide
33 | conf = conf_wide
34 | else:
35 | models = models_slim
36 | conf = conf_slim
37 |
38 | model = Model(conf)
39 | grad = compute_grad(model)
40 | attacks = [PGDAttack(model, a_config, 0.0, 255.0, grad) for a_config in attack_configs]
41 |
42 | saver = tf.train.Saver()
43 | config_tf = tf.ConfigProto()
44 | config_tf.gpu_options.allow_growth = True
45 | config_tf.gpu_options.per_process_gpu_memory_fraction = 1.0
46 |
47 | eval_config = {"data": "cifar10",
48 | "data_path": "cifar10_data",
49 | "num_eval_examples": 1000,
50 | "eval_batch_size": 100}
51 |
52 | nat_accs = np.zeros(len(models))
53 | adv_accs = np.zeros((len(models), len(attacks)))
54 |
55 | any_attack = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
56 | any_rt = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
57 | any_linf = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
58 |
59 | with tf.Session(config=config_tf) as sess:
60 | for m_idx, (model_dir, epoch) in enumerate(models):
61 | ckpt = get_ckpt(model_dir, epoch)
62 | saver.restore(sess, ckpt)
63 |
64 | print("starting...", model_dir, epoch)
65 |
66 | # lp attacks
67 | nat_acc, total_corr_advs = evaluate(model, attacks, sess, eval_config)
68 | nat_accs[m_idx] = nat_acc
69 | adv_acc = np.mean(total_corr_advs, axis=-1)
70 | adv_accs[m_idx, :len(attacks)] = adv_acc
71 | any_attack[m_idx] &= np.bitwise_and.reduce(np.asarray(total_corr_advs), 0)
72 |
73 | print(model_dir, adv_accs[m_idx])
74 | for i, attack in enumerate(attacks):
75 | model_name = models[m_idx][0].split('/')[-1] + "_" + str(models[m_idx][1])
76 | np.save("results/{}/spatial/{}_{}.npy".format(outdir, model_name, attack.name), total_corr_advs[i])
77 |
78 | if attack_configs[i]["type"] == "RT":
79 | any_rt[m_idx] &= total_corr_advs[i]
80 | else:
81 | any_linf[m_idx] &= total_corr_advs[i]
82 |
83 | print(nat_accs)
84 | print(adv_accs)
85 | print("any: ", np.mean(any_attack, axis=-1))
86 | print("rt: ", np.mean(any_rt, axis=-1))
87 | print("linf: ", np.mean(any_linf, axis=-1))
88 |
89 |
--------------------------------------------------------------------------------
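
For reference, the exhaustive grid that `grid_perturb` enumerates for the `random_tries: -1` configuration above can be reproduced directly: with `spatial_limits` [3, 3, 30] and `grid_granularity` [5, 5, 31] it contains 775 candidate translation/rotation triples per image. A sketch of the same `itertools.product` construction used in pgd_attack.py:

```python
import numpy as np
from itertools import product

spatial_limits = [3, 3, 30]     # max |dx|, |dy| in pixels and |rotation| in degrees
grid_granularity = [5, 5, 31]   # grid points per dimension

grid = list(product(*[np.linspace(-l, l, num=g)
                      for l, g in zip(spatial_limits, grid_granularity)]))
print(len(grid))  # 775 = 5 * 5 * 31 candidate transformations
print(grid[0])    # (-3.0, -3.0, -30.0), one corner of the grid
```
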
/scripts/eval_mnist_lps.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | from pgd_attack import PGDAttack, compute_grad
4 | from model import Model
5 | from scripts.utils import get_ckpt
6 | from eval import evaluate
7 | from eval_fb import evaluate_fb
8 | from eval_ch import evaluate_ch, get_model, get_saver
9 | from eval_bapp import evaluate_bapp
10 | from multiprocessing import Pool
11 | import os
12 |
13 |
14 | models = [
15 | ('path_to_model', -1),
16 | ]
17 |
18 | attack_configs = [
19 | {"type": "linf", "epsilon": 0.3, "k": 100, "random_start": True, "reps": 40},
20 | {"type": "l1", "epsilon": 10, "k": 100, "random_start": True, "perc": 99, "a": 0.5, "reps": 40},
21 | {"type": "l2", "epsilon": 2, "k": 100, "random_start": True, "reps": 40},
22 | ]
23 |
24 | model = Model({"model_type": "cnn"})
25 | grad = compute_grad(model)
26 | attacks = [PGDAttack(model, a_config, 0.0, 1.0, grad) for a_config in attack_configs]
27 |
28 | saver = tf.train.Saver()
29 | config_tf = tf.ConfigProto()
30 | config_tf.gpu_options.allow_growth = True
31 | config_tf.gpu_options.per_process_gpu_memory_fraction = 0.5
32 |
33 |
34 | eval_config = {"data": "mnist",
35 | "num_eval_examples": 200,
36 | "eval_batch_size": 200}
37 |
38 | nat_accs = np.zeros(len(models))
39 | adv_accs = np.zeros((len(models), len(attacks) + 5))
40 |
41 | any_attack = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
42 | any_l1 = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
43 | any_l2 = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
44 | any_linf = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
45 |
46 | def worker(model_dir_epoch):
47 |     model_dir, epoch = model_dir_epoch  # unpack explicitly (Python 2 tuple parameters are invalid in Python 3)
48 |     with tf.Graph().as_default() as g:
49 | config_tf = tf.ConfigProto()
50 | config_tf.gpu_options.allow_growth = True
51 | config_tf.gpu_options.per_process_gpu_memory_fraction = 0.2
52 | with tf.Session(graph=g, config=config_tf) as sess:
53 | model = Model({"model_type": "cnn"})
54 |
55 | saver = tf.train.Saver()
56 | ckpt = get_ckpt(model_dir, epoch)
57 | saver.restore(sess, ckpt)
58 |
59 | # FB attacks
60 | print("Foolbox l1 attack")
61 | all_corr_nat1, all_corr_adv1, l1s = evaluate_fb(model, eval_config, 0.0, 1.0, norm='l1', bound=10, verbose=False)
62 | all_corr_adv1 = all_corr_adv1 | ((l1s > 10) & all_corr_nat1)
63 |
64 | print("Foolbox l2 attack")
65 | all_corr_nat2, all_corr_adv2, l2s = evaluate_fb(model, eval_config, 0.0, 1.0, norm='l2', bound=2.0, verbose=False)
66 | all_corr_adv2 = all_corr_adv2 | ((l2s > 2.0) & all_corr_nat2)
67 | return all_corr_adv1, all_corr_adv2
68 |
69 |
70 | pool = Pool(4)
71 | all_models_corr_adv = pool.map(worker, models)
72 |
73 | all_models_corr_adv1 = np.asarray([a[0] for a in all_models_corr_adv])
74 | all_models_corr_adv2 = np.asarray([a[1] for a in all_models_corr_adv])
75 |
76 | adv_accs[:, len(attacks)] = np.mean(all_models_corr_adv1, axis=-1)
77 | any_attack &= all_models_corr_adv1
78 | any_l1 &= all_models_corr_adv1
79 | adv_accs[:, len(attacks) + 1] = np.mean(all_models_corr_adv2, axis=-1)
80 | any_attack &= all_models_corr_adv2
81 | any_l2 &= all_models_corr_adv2
82 |
83 | print("DONE with FB!")
84 |
85 | print(nat_accs)
86 | print(adv_accs)
87 | print("any: ", np.mean(any_attack, axis=-1))
88 | print("l1: ", np.mean(any_l1, axis=-1))
89 | print("l2: ", np.mean(any_l2, axis=-1))
90 | print("linf: ", np.mean(any_linf, axis=-1))
91 |
92 | with tf.Session(config=config_tf) as sess:
93 | for m_idx, (model_dir, epoch) in enumerate(models):
94 | ckpt = get_ckpt(model_dir, epoch)
95 | saver.restore(sess, ckpt)
96 |
97 | print("starting...", model_dir)
98 |
99 | # lp attacks
100 | nat_acc, total_corr_advs = evaluate(model, attacks, sess, eval_config)
101 | nat_accs[m_idx] = nat_acc
102 | adv_acc = np.mean(total_corr_advs, axis=-1)
103 | adv_accs[m_idx, :len(attacks)] = adv_acc
104 | any_attack[m_idx] &= np.bitwise_and.reduce(np.asarray(total_corr_advs), 0)
105 |
106 | print(model_dir, adv_accs[m_idx])
107 | model_name = model_dir.split('/')[-1]
108 | for i, attack in enumerate(attacks):
109 | np.save("results/mnist/{}_{}.npy".format(model_name, attack.name), total_corr_advs[i])
110 |
111 | if attack_configs[i]["type"] == "l1":
112 | any_l1[m_idx] &= total_corr_advs[i]
113 | elif attack_configs[i]["type"] == "l2":
114 | any_l2[m_idx] &= total_corr_advs[i]
115 | else:
116 | any_linf[m_idx] &= total_corr_advs[i]
117 |
118 | print("DONE with PGD!")
119 | print(nat_accs)
120 | print(adv_accs)
121 | print("any: ", np.mean(any_attack, axis=-1))
122 | print("l1: ", np.mean(any_l1, axis=-1))
123 | print("l2: ", np.mean(any_l2, axis=-1))
124 | print("linf: ", np.mean(any_linf, axis=-1))
125 |
126 | with tf.Session(config=config_tf) as sess:
127 | for m_idx, (model_dir, epoch) in enumerate(models):
128 | ckpt = get_ckpt(model_dir, epoch)
129 | saver.restore(sess, ckpt)
130 |
131 | print("starting...", model_dir)
132 | all_corr_nat_inf, all_corr_adv_inf, l_infs = evaluate_bapp(sess, model, eval_config, 0, 1, 0.3, verbose=False)
133 | all_corr_adv_inf = all_corr_adv_inf | ((l_infs > 0.3) & all_corr_nat_inf)
134 | adv_accs[m_idx, len(attacks) + 2] = np.mean(all_corr_adv_inf)
135 | any_attack[m_idx] &= all_corr_adv_inf
136 |
137 | any_linf[m_idx] &= all_corr_adv_inf
138 |
139 | # Cleverhans attacks
140 | g2 = tf.Graph()
141 | with g2.as_default():
142 | with tf.Session(graph=g2, config=config_tf) as sess2:
143 | model2 = get_model(eval_config)
144 | saver2 = get_saver(eval_config)
145 |
146 | for m_idx, (model_dir, epoch) in enumerate(models):
147 | ckpt = get_ckpt(model_dir, epoch)
148 |             saver2.restore(sess2, ckpt)
149 |
150 | print("starting...", model_dir)
151 |
152 | print("EAD")
153 | all_corr_nat1, all_corr_adv1, l1s = evaluate_ch(model2, eval_config, sess2, "l1", 10, verbose=False)
154 | all_corr_adv1 = all_corr_adv1 | ((l1s > 10) & all_corr_nat1)
155 | adv_accs[m_idx, len(attacks) + 3] = np.mean(all_corr_adv1)
156 | any_attack[m_idx] &= all_corr_adv1
157 |
158 | print("C&W")
159 | all_corr_nat2, all_corr_adv2, l2s = evaluate_ch(model2, eval_config, sess2, "l2", 2, verbose=False)
160 | all_corr_adv2 = all_corr_adv2 | ((l2s > 2.0) & all_corr_nat2)
161 | adv_accs[m_idx, len(attacks) + 4] = np.mean(all_corr_adv2)
162 | any_attack[m_idx] &= all_corr_adv2
163 |
164 | print(model_dir, adv_accs[m_idx])
165 |
166 | model_name = model_dir.split('/')[-1]
167 | any_l1[m_idx] &= all_corr_adv1
168 | any_l2[m_idx] &= all_corr_adv2
169 | np.save("results/mnist/{}_l1_ead.npy".format(model_name), all_corr_adv1)
170 | np.save("results/mnist/{}_l2_cw.npy".format(model_name), all_corr_adv2)
171 |
172 | print(nat_accs)
173 | print(adv_accs)
174 | print("any: ", np.mean(any_attack, axis=-1))
175 | print("l1: ", np.mean(any_l1, axis=-1))
176 | print("l2: ", np.mean(any_l2, axis=-1))
177 | print("linf: ", np.mean(any_linf, axis=-1))
178 |
--------------------------------------------------------------------------------
/scripts/eval_mnist_spatial.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 | import numpy as np
3 | from pgd_attack import PGDAttack, compute_grad
4 | from model import Model
5 | from scripts.utils import get_ckpt
6 | from eval import evaluate
7 |
8 |
9 | models = [
10 | ('path_to_model', -1),
11 | ]
12 |
13 | attack_configs = [
14 | {"type": "linf", "epsilon": 0.3, "k": 100, "random_start": True, "reps": 40},
15 | {"type": "linf", "epsilon": 0.3, "k": 1000, "random_start": True},
16 | {"type": "RT", "spatial_limits": [3, 3, 30], "grid_granularity": [5, 5, 31], "random_tries": 10},
17 | {"type": "RT", "spatial_limits": [3, 3, 30], "grid_granularity": [5, 5, 31], "random_tries": -1}
18 | ]
19 |
20 | model = Model({"model_type": "cnn"})
21 | grad = compute_grad(model)
22 | attacks = [PGDAttack(model, a_config, 0.0, 1.0, grad) for a_config in attack_configs]
23 |
24 | saver = tf.train.Saver()
25 | config_tf = tf.ConfigProto()
26 | config_tf.gpu_options.allow_growth = True
27 | config_tf.gpu_options.per_process_gpu_memory_fraction = 0.2
28 |
29 | eval_config = {"data": "mnist",
30 | "model_type": "cnn",
31 | "num_eval_examples": 200,
32 | "eval_batch_size": 100}
33 |
34 | nat_accs = np.zeros(len(models))
35 | adv_accs = np.zeros((len(models), len(attacks)))
36 |
37 | any_attack = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
38 | any_rt = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
39 | any_linf = np.ones((len(models), eval_config["num_eval_examples"])).astype(np.bool)
40 |
41 | with tf.Session(config=config_tf) as sess:
42 | for m_idx, (model_dir, epoch) in enumerate(models):
43 | ckpt = get_ckpt(model_dir, epoch)
44 | saver.restore(sess, ckpt)
45 |
46 | print("starting...", model_dir, epoch)
47 |
48 | # lp attacks
49 | nat_acc, total_corr_advs = evaluate(model, attacks, sess, eval_config)
50 | nat_accs[m_idx] = nat_acc
51 | adv_acc = np.mean(total_corr_advs, axis=-1)
52 | adv_accs[m_idx, :len(attacks)] = adv_acc
53 | any_attack[m_idx] &= np.bitwise_and.reduce(np.asarray(total_corr_advs), 0)
54 |
55 | print(model_dir, adv_accs[m_idx])
56 | model_name = model_dir.split('/')[-1] + "_" + str(epoch)
57 | for i, attack in enumerate(attacks):
58 | np.save("results/mnist/spatial/{}_{}.npy".format(model_name, attack.name), total_corr_advs[i])
59 |
60 | if attack_configs[i]["type"] == "RT":
61 | any_rt[m_idx] &= total_corr_advs[i]
62 | else:
63 | any_linf[m_idx] &= total_corr_advs[i]
64 |
65 | print(nat_accs)
66 | print(adv_accs)
67 | print("any: ", np.mean(any_attack, axis=-1))
68 | print("rt: ", np.mean(any_rt, axis=-1))
69 | print("linf: ", np.mean(any_linf, axis=-1))
70 |
--------------------------------------------------------------------------------
/scripts/utils.py:
--------------------------------------------------------------------------------
1 | import tensorflow as tf
2 |
3 | def get_ckpt(model_dir, epoch):
4 | if epoch is not None and epoch > 0:
5 | ckpts = tf.train.get_checkpoint_state(model_dir).all_model_checkpoint_paths
6 | ckpt = [c for c in ckpts if c.endswith('checkpoint-{}'.format(epoch))]
7 | assert len(ckpt) == 1
8 | cur_checkpoint = ckpt[0]
9 | else:
10 | cur_checkpoint = tf.train.latest_checkpoint(model_dir)
11 | return cur_checkpoint
12 |
13 |
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
1 | """Trains a model, saving checkpoints and tensorboard summaries along
2 | the way."""
3 | from __future__ import absolute_import
4 | from __future__ import division
5 | from __future__ import print_function
6 |
7 | from datetime import datetime
8 | import json
9 | import os
10 | import shutil
11 | from timeit import default_timer as timer
12 |
13 | import tensorflow as tf
14 | import numpy as np
15 |
16 | from pgd_attack import PGDAttack, compute_grad
17 | from eval import evaluate
18 |
19 | import sys
20 | import logging
21 |
22 | logging.getLogger('tensorflow').setLevel(logging.ERROR)
23 |
24 | model_dir = sys.argv[1]
25 |
26 | try:
27 | with open(model_dir + "/config.json") as config_file:
28 | config = json.load(config_file)
29 | print("opened previous config file")
30 | except IOError:
31 | with open("config.json") as config_file:
32 | config = json.load(config_file)
33 |
34 | # Setting up training parameters
35 | tf.set_random_seed(config['random_seed'])
36 |
37 | max_num_training_steps = config['max_num_training_steps']
38 | num_output_steps = config['num_output_steps']
39 | num_summary_steps = config['num_summary_steps']
40 | num_checkpoint_steps = config['num_checkpoint_steps']
41 |
42 | batch_size = config['training_batch_size']
43 |
44 | dataset = config["data"]
45 | assert dataset in ["mnist", "cifar10"]
46 |
47 | num_train_attacks = len(config["train_attacks"])
48 | multi_attack_mode = config["multi_attack_mode"]
49 | print("num_train_attacks", num_train_attacks)
50 | print("multi_attack_mode", multi_attack_mode)
51 |
52 | step_size_schedule = config['step_size_schedule']
53 | step_size_schedule = np.asarray(step_size_schedule)
54 |
55 | # strategies for training with adversarial examples from K attacks:
56 | #
57 | # HALF_LR: Keeps the clean batch size fixed
58 | # (so the effective batch size is multiplied by K) and divides the learning rate by K
59 | #
60 | # HALF_BATCH: Divides the clean batch size by K (so the effective batch size remains unchanged).
61 | # This is necessary to avoid memory overflows with the wide ResNet model on CIFAR10
62 | #
63 | if "HALF_LR" in multi_attack_mode:
64 | step_size_schedule[:, 1] *= 1. / num_train_attacks
65 | if "HALF_BATCH" in multi_attack_mode or "ALTERNATE" in multi_attack_mode:
66 | step_size_schedule[:, 0] *= num_train_attacks
67 | max_num_training_steps *= num_train_attacks
68 | max_num_training_steps = int(max_num_training_steps)
69 |
70 | if "HALF_BATCH" in multi_attack_mode:
71 | batch_size *= 1. / num_train_attacks
72 | batch_size = int(batch_size)
73 | print("batch_size", batch_size)
74 |
75 | boundaries = [int(sss[0]) for sss in step_size_schedule]
76 | boundaries = boundaries[1:]
77 | values = [sss[1] for sss in step_size_schedule]
78 | global_step = tf.contrib.framework.get_or_create_global_step()
79 | learning_rate = tf.train.piecewise_constant(
80 | tf.cast(global_step, tf.int32), boundaries, values)
81 |
82 | if dataset == "mnist":
83 | from tensorflow.examples.tutorials.mnist import input_data
84 | from model import Model
85 |
86 | # Setting up the data and the model
87 | mnist = input_data.read_data_sets('MNIST_data', one_hot=False)
88 | num_train_data = 60000
89 | if config["model_type"] == "linear":
90 | x_train = mnist.train.images
91 | y_train = mnist.train.labels
92 | x_test = mnist.test.images
93 | y_test = mnist.test.labels
94 |
95 | pos_train = (y_train == 5) | (y_train == 7)
96 | x_train = x_train[pos_train]
97 | y_train = y_train[pos_train]
98 | y_train = (y_train == 5).astype(np.int64)
99 | pos_test = (y_test == 5) | (y_test == 7)
100 | x_test = x_test[pos_test]
101 | y_test = y_test[pos_test]
102 | y_test = (y_test == 5).astype(np.int64)
103 |
104 | from tensorflow.contrib.learn.python.learn.datasets.mnist import DataSet
105 | from tensorflow.contrib.learn.python.learn.datasets import base
106 |
107 | options = dict(dtype=tf.uint8, reshape=False, seed=None)
108 | train = DataSet(x_train, y_train, **options)
109 | test = DataSet(x_test, y_test, **options)
110 |
111 | mnist = base.Datasets(train=train, validation=None, test=test)
112 | num_train_data = len(x_train)
113 |
114 | model = Model(config)
115 | x_min, x_max = 0.0, 1.0
116 |
117 | # Setting up the optimizer
118 | opt = tf.train.AdamOptimizer(learning_rate)
119 | gv = opt.compute_gradients(model.xent)
120 | train_step = opt.apply_gradients(gv, global_step=global_step)
121 | else:
122 | import cifar10_input
123 | from cifar10_model import Model
124 |
125 | weight_decay = config['weight_decay']
126 | data_path = config['data_path']
127 | momentum = config['momentum']
128 | raw_cifar = cifar10_input.CIFAR10Data(data_path)
129 | num_train_data = 50000
130 | model = Model(config)
131 | x_min, x_max = 0.0, 255.0
132 |
133 | # Setting up the optimizer
134 | total_loss = model.mean_xent + weight_decay * model.weight_decay_loss
135 | opt = tf.train.MomentumOptimizer(learning_rate, momentum)
136 | gv = opt.compute_gradients(total_loss)
137 | train_step = opt.apply_gradients(gv, global_step=global_step)
138 |
139 | num_epochs = (max_num_training_steps * batch_size) // num_train_data
140 | print("num_epochs: {:d}".format(num_epochs))
141 | print("max_num_training_steps", max_num_training_steps)
142 | print("step_size_schedule", step_size_schedule)
143 |
144 | # Set up adversary
145 | grad = compute_grad(model)
146 | train_attack_configs = [np.asarray(config["attacks"])[i] for i in config["train_attacks"]]
147 | eval_attack_configs = [np.asarray(config["attacks"])[i] for i in config["eval_attacks"]]
148 | train_attacks = [PGDAttack(model, a_config, x_min, x_max, grad) for a_config in train_attack_configs]
149 |
150 | # Optimization that works well on MNIST: do a first epoch with a lower epsilon
151 | start_small = config.get("start_small", False)
152 | if start_small:
153 | train_attack_configs_small = [a.copy() for a in train_attack_configs]
154 | for attack in train_attack_configs_small:
155 | if 'epsilon' in attack:
156 | attack['epsilon'] /= 3.0
157 | else:
158 | attack['spatial_limits'] = [s/3.0 for s in attack['spatial_limits']]
159 | train_attacks_small = [PGDAttack(model, a_config, x_min, x_max, grad) for a_config in train_attack_configs_small]
160 | print('start_small', start_small)
161 |
162 | eval_attacks = [PGDAttack(model, a_config, x_min, x_max, grad) for a_config in eval_attack_configs]
163 |
164 | # Setting up the Tensorboard and checkpoint outputs
165 | if not os.path.exists(model_dir):
166 | os.makedirs(model_dir)
167 | shutil.copy('config.json', model_dir)
168 |
169 | eval_dir = os.path.join(model_dir, 'eval')
170 | if not os.path.exists(eval_dir):
171 | os.makedirs(eval_dir)
172 |
173 | train_dir = os.path.join(model_dir, 'train')
174 | if not os.path.exists(train_dir):
175 | os.makedirs(train_dir)
176 |
177 | saver = tf.train.Saver(max_to_keep=100)
178 | tf.summary.scalar('accuracy adv train', model.accuracy, collections=['adv'])
179 | tf.summary.scalar('xent adv train', model.mean_xent, collections=['adv'])
180 | tf.summary.image('images adv train', model.x_image, collections=['adv'])
181 | adv_summaries = tf.summary.merge_all('adv')
182 |
183 | tf.summary.scalar('accuracy_nat_train', model.accuracy, collections=['nat'])
184 | tf.summary.scalar('xent_nat_train', model.mean_xent, collections=['nat'])
185 | tf.summary.scalar('learning_rate', learning_rate, collections=['nat'])
186 | nat_summaries = tf.summary.merge_all('nat')
187 |
188 | eval_summaries_train = []
189 | for i, attack in enumerate(eval_attacks):
190 | a_type = attack.name
191 | tf.summary.scalar('accuracy adv train {}'.format(a_type), model.accuracy, collections=['adv_{}'.format(i)])
192 | tf.summary.scalar('xent adv train {}'.format(a_type), model.mean_xent, collections=['adv_{}'.format(i)])
193 | tf.summary.image('images adv train {}'.format(a_type), model.x_image, collections=['adv_{}'.format(i)])
194 | eval_summaries_train.append(tf.summary.merge_all('adv_{}'.format(i)))
195 |
196 | config_tf = tf.ConfigProto()
197 | config_tf.gpu_options.allow_growth = True
198 | if dataset == "mnist":
199 | config_tf.gpu_options.per_process_gpu_memory_fraction = 0.2
200 | else:
201 | config_tf.gpu_options.per_process_gpu_memory_fraction = 1.0
202 | config_tf.allow_soft_placement = True
203 |
204 | with tf.Session(config=config_tf) as sess:
205 | if dataset == "cifar10":
206 | # initialize data augmentation
207 | cifar = cifar10_input.AugmentedCIFAR10Data(raw_cifar, sess)
208 |
209 | # Initialize the summary writer, global variables, and our time counter.
210 | summary_writer = tf.summary.FileWriter(train_dir, sess.graph)
211 | test_summary_writer = tf.summary.FileWriter(eval_dir)
212 | sess.run(tf.global_variables_initializer())
213 | training_time = 0.0
214 |
215 | cur_checkpoint = tf.train.latest_checkpoint(model_dir)
216 | if cur_checkpoint is not None:
217 | saver.restore(sess, cur_checkpoint)
218 | else:
219 | print("no checkpoint to load")
220 |
221 | start_step = sess.run(global_step)
222 |
223 | # Main training loop
224 | for ii in range(start_step, max_num_training_steps + 1):
225 | curr_epoch = (ii * batch_size) // num_train_data
226 |
227 | if dataset == "mnist":
228 | x_batch, y_batch = mnist.train.next_batch(batch_size)
229 | x_batch = x_batch.reshape(-1, 28, 28, 1)
230 | x_batch_no_aug = x_batch
231 | else:
232 | x_batch_no_aug, x_batch, y_batch = cifar.train_data.get_next_batch(batch_size, multiple_passes=True)
233 | x_batch_no_aug = x_batch_no_aug.astype(np.float32)
234 | x_batch = x_batch.astype(np.float32)
235 |
236 | noop_trans = np.zeros([len(x_batch), 3])
237 |
238 | if start_small and curr_epoch == 0:
239 | curr_train_attacks = train_attacks_small
240 | else:
241 | curr_train_attacks = train_attacks
242 |
243 | # Compute Adversarial Perturbations
244 | start = timer()
245 | if multi_attack_mode == "ALTERNATE":
246 |             # alternate between attacks each batch (does not work very well)
247 | curr_attack = curr_train_attacks[ii % num_train_attacks]
248 | adv_outputs = [curr_attack.perturb(x_batch, y_batch, sess, x_nat_no_aug=x_batch_no_aug)]
249 |
250 | elif multi_attack_mode == "MAX":
251 | # choose best attack for each input
252 | adv_outputs = [attack.perturb(x_batch, y_batch, sess, x_nat_no_aug=x_batch_no_aug) for attack in curr_train_attacks]
253 | losses = np.zeros((num_train_attacks, len(x_batch)))
254 | for j in range(num_train_attacks):
255 | x = adv_outputs[j][0]
256 | t = adv_outputs[j][1]
257 | losses[j] = sess.run(model.y_xent,
258 | feed_dict={model.x_input: x,
259 | model.y_input: y_batch,
260 | model.is_training: False,
261 | model.transform: t if t is not None else noop_trans})
262 | best_idx = np.argmax(losses, axis=0) # shape (batch_size,)
263 | best_x = np.asarray([adv_outputs[best_idx[j]][0][j] for j in range(len(x_batch))])
264 | best_t = np.asarray([adv_outputs[best_idx[j]][1][j] for j in range(len(x_batch))])
265 | adv_outputs = [(best_x, best_t)]
266 |
267 | else:
268 | # concatenate multiple attacks (default)
269 | adv_outputs = [attack.perturb(x_batch, y_batch, sess, x_nat_no_aug=x_batch_no_aug) for attack in curr_train_attacks]
270 |
271 | x_batch_advs = [a[0] for a in adv_outputs]
272 | all_trans = [a[1] if a[1] is not None else noop_trans for a in adv_outputs]
273 | end = timer()
274 | training_time += end - start
275 |
276 | nat_dict = {model.x_input: x_batch,
277 | model.y_input: y_batch,
278 | model.is_training: False,
279 | model.transform: noop_trans}
280 |
281 | if num_train_attacks > 0:
282 | x_batch_adv = np.concatenate(x_batch_advs)
283 | y_batch_adv = np.concatenate([y_batch for _ in range(len(x_batch_advs))])
284 | trans_adv = np.concatenate(all_trans)
285 |
286 | adv_dict = {model.x_input: x_batch_adv,
287 | model.y_input: y_batch_adv,
288 | model.is_training: False,
289 | model.transform: trans_adv}
290 | else:
291 | adv_dict = nat_dict
292 |
293 | if ii % num_output_steps == 0:
294 | print('Step {} (epoch {}): ({})'.format(ii, curr_epoch, datetime.now()))
295 | if ii > 0:
296 | print(' {} examples per second'.format(num_output_steps * batch_size / training_time))
297 | training_time = 0.0
298 | summary = sess.run(adv_summaries, feed_dict=adv_dict)
299 | summary_writer.add_summary(summary, global_step.eval(sess))
300 | summary = sess.run(nat_summaries, feed_dict=nat_dict)
301 | summary_writer.add_summary(summary, global_step.eval(sess))
302 |
303 | # Output to stdout and tensorboard summaries
304 | if ii % num_summary_steps == 0:
305 | nat_acc = sess.run(model.accuracy, feed_dict=nat_dict)
306 | print(' training nat accuracy {:.4}%'.format(nat_acc * 100))
307 |
308 | for a_idx, attack in enumerate(eval_attacks):
309 | x_batch_adv_eval, trans_eval = attack.perturb(x_batch, y_batch, sess, x_nat_no_aug=x_batch_no_aug)
310 |
311 | adv_dict_eval = {model.x_input: x_batch_adv_eval,
312 | model.y_input: y_batch,
313 | model.is_training: False,
314 | model.transform: trans_eval if trans_eval is not None else noop_trans}
315 |
316 | adv_acc = sess.run(model.accuracy, feed_dict=adv_dict_eval)
317 | print(' training adv accuracy ({}) {:.4}%'.format(attack.name, adv_acc * 100))
318 |
319 | summary = sess.run(eval_summaries_train[a_idx], feed_dict=adv_dict_eval)
320 | summary_writer.add_summary(summary, global_step.eval(sess))
321 |
322 | evaluate(model, eval_attacks, sess, config, plot=False,
323 | summary_writer=test_summary_writer, eval_train=False)
324 |
325 | # Write a checkpoint
326 | if ii % num_checkpoint_steps == 0 and ii > 0:
327 | saver.save(sess, os.path.join(model_dir, 'checkpoint'), global_step=global_step)
328 |
329 | # Actual training step
330 | start = timer()
331 | adv_dict[model.is_training] = True
332 | _, curr_gv = sess.run([train_step, [g for (g, v) in gv]], feed_dict=adv_dict)
333 | end = timer()
334 | training_time += end - start
335 |
--------------------------------------------------------------------------------
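
The `MAX` branch of the training loop above is a per-example worst-case selection: every attack perturbs the full batch, and for each example the version with the highest loss is kept. Stripped of the TensorFlow plumbing, the selection reduces to an argmax plus fancy indexing; a sketch with made-up loss values and shapes:

```python
import numpy as np

num_attacks, batch_size = 3, 4
losses = np.array([[0.2, 1.5, 0.3, 0.9],   # per-example loss under attack 0
                   [1.1, 0.4, 0.2, 2.0],   # ... under attack 1
                   [0.5, 0.6, 1.8, 0.1]])  # ... under attack 2
x_advs = np.random.rand(num_attacks, batch_size, 8, 8, 1)  # stand-in adversarial batches

best_idx = np.argmax(losses, axis=0)              # worst-case attack per example
best_x = x_advs[best_idx, np.arange(batch_size)]  # pick that version of each example
print(best_idx)      # [1 0 2 1]
print(best_x.shape)  # (4, 8, 8, 1)
```
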