├── .github
│   └── ISSUE_TEMPLATE.md
├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── INSTRUCTIONS.md
├── LICENSE
├── README.md
├── adv_model.py
├── inference-example.py
├── main.py
├── nets.py
├── resnet_model.py
├── slurm
│   ├── eval.sh
│   └── train.sh
├── teaser.jpg
├── third_party
│   ├── README.md
│   ├── __init__.py
│   ├── imagenet_utils.py
│   ├── serve-data.py
│   └── utils.py
└── tox.ini

/.github/ISSUE_TEMPLATE.md:
--------------------------------------------------------------------------------
If you meet an unexpected problem when using the code, please include the following in your issue:

1. What you did: the command you ran.

2. What you observed: the full logs and other relevant information.

3. What you expected, if not obvious.

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
train_log
train_log_*
logs
*.npy
*.npz
*.caffemodel
*.tfmodel
*.meta
*.log*
*.bin
*.png
*.jpg
checkpoint
*.json
*.prototxt
*.txt
*.tgz
*.gz


# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/
*.dat

.idea/
*.diff

--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
# Code of Conduct

Facebook has adopted a Code of Conduct that we expect project participants to adhere to.
Please read the [full text](https://code.fb.com/codeofconduct/)
so that you can understand what actions will and will not be tolerated.

--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
# Contributing

We want to make contributing to this project as easy and transparent as
possible.

## Our Development Process
This code is released for the purpose of reproducing research.
We don't expect frequent development on this project unless we have new research results to share.
Minor changes and improvements will be released on an ongoing basis. Larger
changes (e.g., changesets implementing a new paper) will be released on a more
periodic basis.

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: 

## Issues

We welcome the use of GitHub issues for any questions you might have about the code.

## Coding Style

We mainly follow the PEP8 style.

## License

By contributing to this project, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.

--------------------------------------------------------------------------------
/INSTRUCTIONS.md:
--------------------------------------------------------------------------------

## Dependencies:

+ TensorFlow ≥ 1.6 with GPU support
+ Tensorpack ≥ 0.9.8
+ OpenCV ≥ 3
+ horovod ≥ 0.15 with NCCL support;
  horovod has many [installation options](https://github.com/uber/horovod/blob/master/docs/gpus.md)
  to optimize its multi-machine/multi-GPU performance, which you may want to follow.
+ ImageNet data in its standard directory structure.
+ TensorFlow [zmq_ops](https://github.com/tensorpack/zmq_ops) (needed only for training with real data)


## Model Zoo:
In the table below, the "clean images" column reports the error rate (%);
each PGD column reports the error rate / attack success rate (%).

| Model (click for details) | clean images | 10-step PGD | 100-step PGD | 1000-step PGD |
|---|---|---|---|---|
| ResNet152 Baseline `--arch ResNet -d 152` :arrow_down: | 37.7 | 47.5 / 5.5 | 58.3 / 31.0 | 61.0 / 36.1 |
| ResNet152 Denoise `--arch ResNetDenoise -d 152` :arrow_down: | 34.7 | 44.3 / 4.9 | 54.5 / 26.6 | 57.2 / 32.7 |
| ResNeXt101 DenoiseAll `--arch ResNeXtDenoiseAll -d 101` :arrow_down: | 31.6 | 44.0 / 4.9 | 55.6 / 31.5 | 59.6 / 38.1 |

Click the first column to download the model and obtain the flags to be used with the script.

Note:

1. As mentioned in the paper, the threat model is:

   1. __Targeted attack__, with one target label associated with each image. The target label is
      independently generated by uniformly sampling the incorrect labels.
   2. Maximum perturbation per pixel is 16.

   We do not consider untargeted attacks, nor do we let the attacker control the target labels,
   because we think such tasks are not realistic on the ImageNet-1k categories.

2. For each (attacker, model) pair, we provide both the __error rate__ of our model
   and the __attack success rate__ of the attacker, on the ImageNet validation set.
   A targeted attack is considered successful if the image is classified to the target label.

   __For attackers__: if you develop a new targeted attack method against our models,
   *please compare its attack success rate* with PGD.
   Error rate / accuracy is not a reasonable metric, because then the method could cheat by becoming
   close to an untargeted attack.

   __For defenders__: if you develop a new robust model, please compare its accuracy with our models.
   Attack success rate is not a reasonable metric, because then the model could cheat by making random predictions.

3. `ResNeXt101 DenoiseAll` is the submission that won the black-box defense track in the
   [Competition on Adversarial Attacks and Defenses 2018](http://hof.geekpwn.org/caad/en/index.html).
   This model was trained with slightly different training settings,
   so its results are not directly comparable with those of the other models.


## Evaluate White-Box Robustness:

To evaluate on one GPU, run this command:
```
python main.py --eval --load /path/to/model_checkpoint --data /path/to/imagenet \
    --attack-iter [INTEGER] --attack-epsilon 16.0 [--architecture-flags]
```

To reproduce our evaluation results, take the architecture flags from the first
column of the model zoo and set the attack iteration count.
Setting `--attack-iter 0` evaluates the clean-image error rate.
Note that the evaluation results may fluctuate by about ±0.3 due to the
randomly chosen attack target labels and the random attack initialization.

Using a K-step attacker makes the evaluation K times slower.
To speed up evaluation, run it under MPI with multiple GPUs or multiple machines, e.g.:

```
mpirun -np 8 python main.py --eval --load /path/to/model_checkpoint --data /path/to/imagenet \
    --attack-iter [INTEGER] --attack-epsilon 16.0 [--architecture-flags]
```

Evaluating the `Res152 Denoise` model against the 100-step PGD attacker takes about 1 hour with 16 V100s.


## Evaluate Black-Box Robustness:

We provide a command line option to produce predictions for a directory of images, e.g.:
```
python main.py --eval-directory /path/to/image/directory --prediction-file predictions.txt \
    --load X101-DenseDenoise.npz -d 101 --arch ResNeXtDenoiseAll --batch 20
```

This will produce a file "predictions.txt" that contains the filename and
predicted label for each image found in the directory.
You can use it to evaluate the model's black-box robustness.

Our CAAD2018 submission is equivalent to the above command.
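If you want to score such a prediction file offline, a minimal sketch is shown below.
It assumes a hypothetical `labels.txt` that maps each filename to its ground-truth label
and its attack-target label; that helper file, its format, and the script itself are our
own convention for illustration and are not produced by this repo:

```
# score_predictions.py -- a minimal sketch, not part of this repo.
# Assumes predictions.txt lines look like "<filename>,<predicted_label>" (as written
# by main.py), and a hypothetical labels.txt with "<filename>,<true_label>,<target_label>".

def read_rows(path):
    # filename -> list of integer labels
    with open(path) as f:
        rows = (line.strip().split(",") for line in f if line.strip())
        return {r[0]: [int(x) for x in r[1:]] for r in rows}

preds = read_rows("predictions.txt")   # filename -> [predicted_label]
labels = read_rows("labels.txt")       # filename -> [true_label, target_label]

total = len(preds)
errors = sum(p[0] != labels[k][0] for k, p in preds.items())
successes = sum(p[0] == labels[k][1] for k, p in preds.items())
print("error rate: %.2f%%, attack success rate: %.2f%%"
      % (100.0 * errors / total, 100.0 * successes / total))
```

Per the note above, attack success rate (not error rate) is the metric to report for new attacks.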

## Train:

Our code can be used for both standard ImageNet training (with `--attack-iter 0`) and adversarial training.
Adversarial training takes a long time, and we recommend doing it only when you have a lot of GPUs.

To train, first start one data serving process __on each machine__:
```
$ ./third_party/serve-data.py --data /path/to/imagenet/ --batch 32
```

Then, launch a distributed job with MPI. You may need to consult your cluster
administrator for the MPI command line arguments you should use.
On a cluster with InfiniBand, it may look like this:

```
mpirun -np 16 -H host1:8,host2:8 --output-filename train.log \
    -bind-to none -map-by slot -mca pml ob1 \
    -x NCCL_IB_CUDA_SUPPORT=1 -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO \
    -x PATH -x PYTHONPATH -x LD_LIBRARY_PATH \
    python main.py --data /path/to/imagenet \
    --batch 32 --attack-iter [INTEGER] --attack-epsilon 16.0 [--architecture-flags]
```

If your cluster is managed by slurm, we provide some sample [slurm job scripts](slurm/)
for your reference.

The training code will also perform distributed evaluation of white-box robustness.

### Training Speed:

With 30 attack iterations during training,
the `Res152 Baseline` model takes about 52 hours to finish training on 128 V100s.

Under the same setting, the `Res152 Denoise` model takes about 90 hours on 128 V100s.
Note that the denoising blocks themselves add little computation to the baseline;
the slowdown is mainly because the softmax version of the non-local operation
lacks an efficient GPU implementation. The dot-product version, on the other hand, is much faster.

If you use CUDA ≥ 9.2 and TF ≥ 1.12 on Volta GPUs, the flag `--use-fp16xla` will enable an XLA-optimized
FP16 PGD attack, which reduces training time by about 2×, at the cost of about 3% robustness.

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Attribution-NonCommercial 4.0 International
2 | 
3 | =======================================================================
4 | 
5 | Creative Commons Corporation ("Creative Commons") is not a law firm and
6 | does not provide legal services or legal advice. Distribution of
7 | Creative Commons public licenses does not create a lawyer-client or
8 | other relationship. Creative Commons makes its licenses and related
9 | information available on an "as-is" basis. Creative Commons gives no
10 | warranties regarding its licenses, any material licensed under their
11 | terms and conditions, or any related information. Creative Commons
12 | disclaims all liability for damages resulting from their use to the
13 | fullest extent possible.
14 | 
15 | Using Creative Commons Public Licenses
16 | 
17 | Creative Commons public licenses provide a standard set of terms and
18 | conditions that creators and other rights holders may use to share
19 | original works of authorship and other material subject to copyright
20 | and certain other rights specified in the public license below. The
21 | following considerations are for informational purposes only, are not
22 | exhaustive, and do not form part of our licenses.
23 | 24 | Considerations for licensors: Our public licenses are 25 | intended for use by those authorized to give the public 26 | permission to use material in ways otherwise restricted by 27 | copyright and certain other rights. Our licenses are 28 | irrevocable. Licensors should read and understand the terms 29 | and conditions of the license they choose before applying it. 30 | Licensors should also secure all rights necessary before 31 | applying our licenses so that the public can reuse the 32 | material as expected. Licensors should clearly mark any 33 | material not subject to the license. This includes other CC- 34 | licensed material, or material used under an exception or 35 | limitation to copyright. More considerations for licensors: 36 | wiki.creativecommons.org/Considerations_for_licensors 37 | 38 | Considerations for the public: By using one of our public 39 | licenses, a licensor grants the public permission to use the 40 | licensed material under specified terms and conditions. If 41 | the licensor's permission is not necessary for any reason--for 42 | example, because of any applicable exception or limitation to 43 | copyright--then that use is not regulated by the license. Our 44 | licenses grant only permissions under copyright and certain 45 | other rights that a licensor has authority to grant. Use of 46 | the licensed material may still be restricted for other 47 | reasons, including because others have copyright or other 48 | rights in the material. A licensor may make special requests, 49 | such as asking that all changes be marked or described. 50 | Although not required by our licenses, you are encouraged to 51 | respect those requests where reasonable. More_considerations 52 | for the public: 53 | wiki.creativecommons.org/Considerations_for_licensees 54 | 55 | ======================================================================= 56 | 57 | Creative Commons Attribution-NonCommercial 4.0 International Public 58 | License 59 | 60 | By exercising the Licensed Rights (defined below), You accept and agree 61 | to be bound by the terms and conditions of this Creative Commons 62 | Attribution-NonCommercial 4.0 International Public License ("Public 63 | License"). To the extent this Public License may be interpreted as a 64 | contract, You are granted the Licensed Rights in consideration of Your 65 | acceptance of these terms and conditions, and the Licensor grants You 66 | such rights in consideration of benefits the Licensor receives from 67 | making the Licensed Material available under these terms and 68 | conditions. 69 | 70 | Section 1 -- Definitions. 71 | 72 | a. Adapted Material means material subject to Copyright and Similar 73 | Rights that is derived from or based upon the Licensed Material 74 | and in which the Licensed Material is translated, altered, 75 | arranged, transformed, or otherwise modified in a manner requiring 76 | permission under the Copyright and Similar Rights held by the 77 | Licensor. For purposes of this Public License, where the Licensed 78 | Material is a musical work, performance, or sound recording, 79 | Adapted Material is always produced where the Licensed Material is 80 | synched in timed relation with a moving image. 81 | 82 | b. Adapter's License means the license You apply to Your Copyright 83 | and Similar Rights in Your contributions to Adapted Material in 84 | accordance with the terms and conditions of this Public License. 85 | 86 | c. 
Copyright and Similar Rights means copyright and/or similar rights 87 | closely related to copyright including, without limitation, 88 | performance, broadcast, sound recording, and Sui Generis Database 89 | Rights, without regard to how the rights are labeled or 90 | categorized. For purposes of this Public License, the rights 91 | specified in Section 2(b)(1)-(2) are not Copyright and Similar 92 | Rights. 93 | d. Effective Technological Measures means those measures that, in the 94 | absence of proper authority, may not be circumvented under laws 95 | fulfilling obligations under Article 11 of the WIPO Copyright 96 | Treaty adopted on December 20, 1996, and/or similar international 97 | agreements. 98 | 99 | e. Exceptions and Limitations means fair use, fair dealing, and/or 100 | any other exception or limitation to Copyright and Similar Rights 101 | that applies to Your use of the Licensed Material. 102 | 103 | f. Licensed Material means the artistic or literary work, database, 104 | or other material to which the Licensor applied this Public 105 | License. 106 | 107 | g. Licensed Rights means the rights granted to You subject to the 108 | terms and conditions of this Public License, which are limited to 109 | all Copyright and Similar Rights that apply to Your use of the 110 | Licensed Material and that the Licensor has authority to license. 111 | 112 | h. Licensor means the individual(s) or entity(ies) granting rights 113 | under this Public License. 114 | 115 | i. NonCommercial means not primarily intended for or directed towards 116 | commercial advantage or monetary compensation. For purposes of 117 | this Public License, the exchange of the Licensed Material for 118 | other material subject to Copyright and Similar Rights by digital 119 | file-sharing or similar means is NonCommercial provided there is 120 | no payment of monetary compensation in connection with the 121 | exchange. 122 | 123 | j. Share means to provide material to the public by any means or 124 | process that requires permission under the Licensed Rights, such 125 | as reproduction, public display, public performance, distribution, 126 | dissemination, communication, or importation, and to make material 127 | available to the public including in ways that members of the 128 | public may access the material from a place and at a time 129 | individually chosen by them. 130 | 131 | k. Sui Generis Database Rights means rights other than copyright 132 | resulting from Directive 96/9/EC of the European Parliament and of 133 | the Council of 11 March 1996 on the legal protection of databases, 134 | as amended and/or succeeded, as well as other essentially 135 | equivalent rights anywhere in the world. 136 | 137 | l. You means the individual or entity exercising the Licensed Rights 138 | under this Public License. Your has a corresponding meaning. 139 | 140 | Section 2 -- Scope. 141 | 142 | a. License grant. 143 | 144 | 1. Subject to the terms and conditions of this Public License, 145 | the Licensor hereby grants You a worldwide, royalty-free, 146 | non-sublicensable, non-exclusive, irrevocable license to 147 | exercise the Licensed Rights in the Licensed Material to: 148 | 149 | a. reproduce and Share the Licensed Material, in whole or 150 | in part, for NonCommercial purposes only; and 151 | 152 | b. produce, reproduce, and Share Adapted Material for 153 | NonCommercial purposes only. 154 | 155 | 2. Exceptions and Limitations. 
For the avoidance of doubt, where 156 | Exceptions and Limitations apply to Your use, this Public 157 | License does not apply, and You do not need to comply with 158 | its terms and conditions. 159 | 160 | 3. Term. The term of this Public License is specified in Section 161 | 6(a). 162 | 163 | 4. Media and formats; technical modifications allowed. The 164 | Licensor authorizes You to exercise the Licensed Rights in 165 | all media and formats whether now known or hereafter created, 166 | and to make technical modifications necessary to do so. The 167 | Licensor waives and/or agrees not to assert any right or 168 | authority to forbid You from making technical modifications 169 | necessary to exercise the Licensed Rights, including 170 | technical modifications necessary to circumvent Effective 171 | Technological Measures. For purposes of this Public License, 172 | simply making modifications authorized by this Section 2(a) 173 | (4) never produces Adapted Material. 174 | 175 | 5. Downstream recipients. 176 | 177 | a. Offer from the Licensor -- Licensed Material. Every 178 | recipient of the Licensed Material automatically 179 | receives an offer from the Licensor to exercise the 180 | Licensed Rights under the terms and conditions of this 181 | Public License. 182 | 183 | b. No downstream restrictions. You may not offer or impose 184 | any additional or different terms or conditions on, or 185 | apply any Effective Technological Measures to, the 186 | Licensed Material if doing so restricts exercise of the 187 | Licensed Rights by any recipient of the Licensed 188 | Material. 189 | 190 | 6. No endorsement. Nothing in this Public License constitutes or 191 | may be construed as permission to assert or imply that You 192 | are, or that Your use of the Licensed Material is, connected 193 | with, or sponsored, endorsed, or granted official status by, 194 | the Licensor or others designated to receive attribution as 195 | provided in Section 3(a)(1)(A)(i). 196 | 197 | b. Other rights. 198 | 199 | 1. Moral rights, such as the right of integrity, are not 200 | licensed under this Public License, nor are publicity, 201 | privacy, and/or other similar personality rights; however, to 202 | the extent possible, the Licensor waives and/or agrees not to 203 | assert any such rights held by the Licensor to the limited 204 | extent necessary to allow You to exercise the Licensed 205 | Rights, but not otherwise. 206 | 207 | 2. Patent and trademark rights are not licensed under this 208 | Public License. 209 | 210 | 3. To the extent possible, the Licensor waives any right to 211 | collect royalties from You for the exercise of the Licensed 212 | Rights, whether directly or through a collecting society 213 | under any voluntary or waivable statutory or compulsory 214 | licensing scheme. In all other cases the Licensor expressly 215 | reserves any right to collect such royalties, including when 216 | the Licensed Material is used other than for NonCommercial 217 | purposes. 218 | 219 | Section 3 -- License Conditions. 220 | 221 | Your exercise of the Licensed Rights is expressly made subject to the 222 | following conditions. 223 | 224 | a. Attribution. 225 | 226 | 1. If You Share the Licensed Material (including in modified 227 | form), You must: 228 | 229 | a. retain the following if it is supplied by the Licensor 230 | with the Licensed Material: 231 | 232 | i. 
identification of the creator(s) of the Licensed 233 | Material and any others designated to receive 234 | attribution, in any reasonable manner requested by 235 | the Licensor (including by pseudonym if 236 | designated); 237 | 238 | ii. a copyright notice; 239 | 240 | iii. a notice that refers to this Public License; 241 | 242 | iv. a notice that refers to the disclaimer of 243 | warranties; 244 | 245 | v. a URI or hyperlink to the Licensed Material to the 246 | extent reasonably practicable; 247 | 248 | b. indicate if You modified the Licensed Material and 249 | retain an indication of any previous modifications; and 250 | 251 | c. indicate the Licensed Material is licensed under this 252 | Public License, and include the text of, or the URI or 253 | hyperlink to, this Public License. 254 | 255 | 2. You may satisfy the conditions in Section 3(a)(1) in any 256 | reasonable manner based on the medium, means, and context in 257 | which You Share the Licensed Material. For example, it may be 258 | reasonable to satisfy the conditions by providing a URI or 259 | hyperlink to a resource that includes the required 260 | information. 261 | 262 | 3. If requested by the Licensor, You must remove any of the 263 | information required by Section 3(a)(1)(A) to the extent 264 | reasonably practicable. 265 | 266 | 4. If You Share Adapted Material You produce, the Adapter's 267 | License You apply must not prevent recipients of the Adapted 268 | Material from complying with this Public License. 269 | 270 | Section 4 -- Sui Generis Database Rights. 271 | 272 | Where the Licensed Rights include Sui Generis Database Rights that 273 | apply to Your use of the Licensed Material: 274 | 275 | a. for the avoidance of doubt, Section 2(a)(1) grants You the right 276 | to extract, reuse, reproduce, and Share all or a substantial 277 | portion of the contents of the database for NonCommercial purposes 278 | only; 279 | 280 | b. if You include all or a substantial portion of the database 281 | contents in a database in which You have Sui Generis Database 282 | Rights, then the database in which You have Sui Generis Database 283 | Rights (but not its individual contents) is Adapted Material; and 284 | 285 | c. You must comply with the conditions in Section 3(a) if You Share 286 | all or a substantial portion of the contents of the database. 287 | 288 | For the avoidance of doubt, this Section 4 supplements and does not 289 | replace Your obligations under this Public License where the Licensed 290 | Rights include other Copyright and Similar Rights. 291 | 292 | Section 5 -- Disclaimer of Warranties and Limitation of Liability. 293 | 294 | a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE 295 | EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS 296 | AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF 297 | ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, 298 | IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, 299 | WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR 300 | PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, 301 | ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT 302 | KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT 303 | ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. 304 | 305 | b. 
TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE 306 | TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, 307 | NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, 308 | INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, 309 | COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR 310 | USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN 311 | ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR 312 | DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR 313 | IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. 314 | 315 | c. The disclaimer of warranties and limitation of liability provided 316 | above shall be interpreted in a manner that, to the extent 317 | possible, most closely approximates an absolute disclaimer and 318 | waiver of all liability. 319 | 320 | Section 6 -- Term and Termination. 321 | 322 | a. This Public License applies for the term of the Copyright and 323 | Similar Rights licensed here. However, if You fail to comply with 324 | this Public License, then Your rights under this Public License 325 | terminate automatically. 326 | 327 | b. Where Your right to use the Licensed Material has terminated under 328 | Section 6(a), it reinstates: 329 | 330 | 1. automatically as of the date the violation is cured, provided 331 | it is cured within 30 days of Your discovery of the 332 | violation; or 333 | 334 | 2. upon express reinstatement by the Licensor. 335 | 336 | For the avoidance of doubt, this Section 6(b) does not affect any 337 | right the Licensor may have to seek remedies for Your violations 338 | of this Public License. 339 | 340 | c. For the avoidance of doubt, the Licensor may also offer the 341 | Licensed Material under separate terms or conditions or stop 342 | distributing the Licensed Material at any time; however, doing so 343 | will not terminate this Public License. 344 | 345 | d. Sections 1, 5, 6, 7, and 8 survive termination of this Public 346 | License. 347 | 348 | Section 7 -- Other Terms and Conditions. 349 | 350 | a. The Licensor shall not be bound by any additional or different 351 | terms or conditions communicated by You unless expressly agreed. 352 | 353 | b. Any arrangements, understandings, or agreements regarding the 354 | Licensed Material not stated herein are separate from and 355 | independent of the terms and conditions of this Public License. 356 | 357 | Section 8 -- Interpretation. 358 | 359 | a. For the avoidance of doubt, this Public License does not, and 360 | shall not be interpreted to, reduce, limit, restrict, or impose 361 | conditions on any use of the Licensed Material that could lawfully 362 | be made without permission under this Public License. 363 | 364 | b. To the extent possible, if any provision of this Public License is 365 | deemed unenforceable, it shall be automatically reformed to the 366 | minimum extent necessary to make it enforceable. If the provision 367 | cannot be reformed, it shall be severed from this Public License 368 | without affecting the enforceability of the remaining terms and 369 | conditions. 370 | 371 | c. No term or condition of this Public License will be waived and no 372 | failure to comply consented to unless expressly agreed to by the 373 | Licensor. 374 | 375 | d. 
Nothing in this Public License constitutes or may be interpreted 376 | as a limitation upon, or waiver of, any privileges and immunities 377 | that apply to the Licensor or You, including from the legal 378 | processes of any jurisdiction or authority. 379 | 380 | ======================================================================= 381 | 382 | Creative Commons is not a party to its public 383 | licenses. Notwithstanding, Creative Commons may elect to apply one of 384 | its public licenses to material it publishes and in those instances 385 | will be considered the “Licensor.” The text of the Creative Commons 386 | public licenses is dedicated to the public domain under the CC0 Public 387 | Domain Dedication. Except for the limited purpose of indicating that 388 | material is shared under a Creative Commons public license or as 389 | otherwise permitted by the Creative Commons policies published at 390 | creativecommons.org/policies, Creative Commons does not authorize the 391 | use of the trademark "Creative Commons" or any other trademark or logo 392 | of Creative Commons without its prior written consent including, 393 | without limitation, in connection with any unauthorized modifications 394 | to any of its public licenses or any other arrangements, 395 | understandings, or agreements concerning use of licensed material. For 396 | the avoidance of doubt, this paragraph does not form part of the 397 | public licenses. 398 | 399 | Creative Commons may be contacted at creativecommons.org. 400 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # Feature Denoising for Improving Adversarial Robustness 3 | 4 | Code and models for the paper [Feature Denoising for Improving Adversarial Robustness](https://arxiv.org/abs/1812.03411), CVPR2019. 5 | 6 | ## Introduction 7 | 8 |
9 | ![teaser](teaser.jpg)
11 | 12 | By combining large-scale adversarial training and feature-denoising layers, 13 | we developed ImageNet classifiers with strong adversarial robustness. 14 | 15 | Trained on __128 GPUs__, our ImageNet classifier has 42.6% accuracy against an extremely strong 16 | __2000-steps white-box__ PGD targeted attack. 17 | This is a scenario where no previous models have achieved more than 1% accuracy. 18 | 19 | On black-box adversarial defense, our method won the __champion of defense track__ in the 20 | [CAAD (Competition of Adversarial Attacks and Defenses) 2018](http://hof.geekpwn.org/caad/en/index.html). 21 | It also greatly outperforms the [CAAD 2017](https://www.kaggle.com/c/nips-2017-defense-against-adversarial-attack) defense track winner when evaluated 22 | against CAAD 2017 black-box attackers. 23 | 24 | This repo contains: 25 | 26 | 1. Our trained models, together with the evaluation script to verify their robustness. 27 | We welcome attackers to attack our released models and defenders to compare with our released models. 28 | 29 | 2. Our distributed adversarial training code on ImageNet. 30 | 31 | Please see [INSTRUCTIONS.md](INSTRUCTIONS.md) for the usage. 32 | 33 | ## License 34 | 35 | This project is under the CC-BY-NC 4.0 license. See [LICENSE](LICENSE) for details. 36 | 37 | ## Citation 38 | 39 | If you use our code, models or wish to refer to our results, please use the following BibTex entry: 40 | ``` 41 | @InProceedings{Xie_2019_CVPR, 42 | author = {Xie, Cihang and Wu, Yuxin and van der Maaten, Laurens and Yuille, Alan L. and He, Kaiming}, 43 | title = {Feature Denoising for Improving Adversarial Robustness}, 44 | booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 45 | month = {June}, 46 | year = {2019} 47 | } 48 | ``` 49 | -------------------------------------------------------------------------------- /adv_model.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | import tensorflow as tf 8 | 9 | from tensorpack.models import regularize_cost, BatchNorm 10 | from tensorpack.tfutils.summary import add_moving_summary 11 | from tensorpack.tfutils import argscope 12 | from tensorpack.tfutils.tower import TowerFunc 13 | from tensorpack.utils import logger 14 | from tensorpack.utils.argtools import log_once 15 | from tensorpack.tfutils.collection import freeze_collection 16 | from tensorpack.tfutils.varreplace import custom_getter_scope 17 | 18 | from third_party.imagenet_utils import ImageNetModel 19 | 20 | 21 | IMAGE_SCALE = 2.0 / 255 22 | 23 | 24 | class NoOpAttacker(): 25 | """ 26 | A placeholder attacker which does nothing. 27 | """ 28 | def attack(self, image, label, model_func): 29 | return image, -tf.ones_like(label) 30 | 31 | 32 | class PGDAttacker(): 33 | """ 34 | A PGD white-box attacker with random target label. 35 | """ 36 | 37 | USE_FP16 = False 38 | """ 39 | Use FP16 to run PGD iterations. 40 | This has about 2~3x speedup for most types of models 41 | if used together with XLA on Volta GPUs. 42 | """ 43 | 44 | USE_XLA = False 45 | """ 46 | Use XLA to optimize the graph of PGD iterations. 47 | This requires CUDA>=9.2 and TF>=1.12. 
48 |     """
49 | 
50 |     def __init__(self, num_iter, epsilon, step_size, prob_start_from_clean=0.0):
51 |         """
52 |         Args:
53 |             num_iter (int): number of PGD iterations.
54 |             epsilon (float): maximum perturbation per pixel, on the 0-255 pixel scale.
55 |             step_size (float): attack step size per iteration, on the 0-255 pixel scale.
56 |             prob_start_from_clean (float): The probability to initialize with
57 |                 the original image, rather than a randomly perturbed one.
58 |         """
59 |         step_size = max(step_size, epsilon / num_iter)
60 |         """
61 |         Feature Denoising, Sec 6.1:
62 |         We set its step size α = 1, except for 10-iteration attacks where α is set to ε/10=1.6
63 |         """
64 |         self.num_iter = num_iter
65 |         # rescale the attack epsilon and attack step size
66 |         self.epsilon = epsilon * IMAGE_SCALE
67 |         self.step_size = step_size * IMAGE_SCALE
68 |         self.prob_start_from_clean = prob_start_from_clean
69 | 
70 |     def _create_random_target(self, label):
71 |         """
72 |         Feature Denoising Sec 6:
73 |         we consider targeted attacks when
74 |         evaluating under the white-box settings, where the targeted
75 |         class is selected uniformly at random
76 |         """
77 |         label_offset = tf.random_uniform(tf.shape(label), minval=1, maxval=1000, dtype=tf.int32)
78 |         return tf.floormod(label + label_offset, tf.constant(1000, tf.int32))
79 | 
80 |     def attack(self, image_clean, label, model_func):
81 |         target_label = self._create_random_target(label)
82 | 
83 |         def fp16_getter(getter, *args, **kwargs):
84 |             name = args[0] if len(args) else kwargs['name']
85 |             if not name.endswith('/W') and not name.endswith('/b'):
86 |                 """
87 |                 Following convention, convolution & fc are quantized.
88 |                 BatchNorm (gamma & beta) are not quantized.
89 |                 """
90 |                 return getter(*args, **kwargs)
91 |             else:
92 |                 if kwargs['dtype'] == tf.float16:
93 |                     kwargs['dtype'] = tf.float32
94 |                     ret = getter(*args, **kwargs)
95 |                     ret = tf.cast(ret, tf.float16)
96 |                     log_once("Variable {} casted to fp16 ...".format(name))
97 |                     return ret
98 |                 else:
99 |                     return getter(*args, **kwargs)
100 | 
101 |         def one_step_attack(adv):
102 |             if not self.USE_FP16:
103 |                 logits = model_func(adv)
104 |             else:
105 |                 adv16 = tf.cast(adv, tf.float16)
106 |                 with custom_getter_scope(fp16_getter):
107 |                     logits = model_func(adv16)
108 |                     logits = tf.cast(logits, tf.float32)
109 |             # Note we don't add any summaries here when creating losses, because
110 |             # summaries don't work in conditionals.
111 |             losses = tf.nn.sparse_softmax_cross_entropy_with_logits(
112 |                 logits=logits, labels=target_label)  # we want to minimize it in targeted attack
113 |             if not self.USE_FP16:
114 |                 g, = tf.gradients(losses, adv)
115 |             else:
116 |                 """
117 |                 We perform loss scaling to prevent underflow:
118 |                 https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
119 |                 (We have not yet tried training without scaling)
120 |                 """
121 |                 g, = tf.gradients(losses * 128., adv)
122 |                 g = g / 128.
123 | 
124 |             """
125 |             Feature Denoising, Sec 5:
126 |             We use the Projected Gradient Descent (PGD)
127 |             (implemented at https://github.com/MadryLab/cifar10_challenge )
128 |             as the white-box attacker for adversarial training
129 |             """
130 |             adv = tf.clip_by_value(adv - tf.sign(g) * self.step_size, lower_bound, upper_bound)
131 |             return adv
132 | 
133 |         """
134 |         Feature Denoising, Sec 6:
135 |         Adversarial perturbation is considered under L∞ norm (i.e., maximum difference for each pixel).
136 |         """
137 |         lower_bound = tf.clip_by_value(image_clean - self.epsilon, -1., 1.)
138 |         upper_bound = tf.clip_by_value(image_clean + self.epsilon, -1., 1.)
139 | 
140 |         """
141 |         Feature Denoising Sec.
5:
142 |         We randomly choose from both initializations in the
143 |         PGD attacker during adversarial training: 20% of training
144 |         batches use clean images to initialize PGD, and 80% use
145 |         random points within the allowed ε-ball.
146 |         """
147 |         init_start = tf.random_uniform(tf.shape(image_clean), minval=-self.epsilon, maxval=self.epsilon)
148 |         start_from_noise_index = tf.cast(tf.greater(
149 |             tf.random_uniform(shape=[]), self.prob_start_from_clean), tf.float32)
150 |         start_adv = image_clean + start_from_noise_index * init_start
151 | 
152 |         if self.USE_XLA:
153 |             assert tuple(map(int, tf.__version__.split('.')[:2])) >= (1, 12)
154 |             from tensorflow.contrib.compiler import xla
155 |         with tf.name_scope('attack_loop'):
156 |             adv_final = tf.while_loop(
157 |                 lambda _: True,
158 |                 one_step_attack if not self.USE_XLA else
159 |                 lambda adv: xla.compile(lambda: one_step_attack(adv))[0],
160 |                 [start_adv],
161 |                 back_prop=False,
162 |                 maximum_iterations=self.num_iter,
163 |                 parallel_iterations=1)
164 |         return adv_final, target_label
165 | 
166 | 
167 | class AdvImageNetModel(ImageNetModel):
168 | 
169 |     """
170 |     Feature Denoising, Sec 5:
171 |     A label smoothing of 0.1 is used.
172 |     """
173 |     label_smoothing = 0.1
174 | 
175 |     def set_attacker(self, attacker):
176 |         self.attacker = attacker
177 | 
178 |     def build_graph(self, image, label):
179 |         """
180 |         The default tower function.
181 |         """
182 |         image = self.image_preprocess(image)
183 |         assert self.data_format == 'NCHW'
184 |         image = tf.transpose(image, [0, 3, 1, 2])
185 | 
186 |         with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):
187 |             # BatchNorm always comes with trouble. We use its testing mode during the attack.
188 |             with freeze_collection([tf.GraphKeys.UPDATE_OPS]), argscope(BatchNorm, training=False):
189 |                 image, target_label = self.attacker.attack(image, label, self.get_logits)
190 |                 image = tf.stop_gradient(image, name='adv_training_sample')
191 | 
192 |         logits = self.get_logits(image)
193 | 
194 |         loss = ImageNetModel.compute_loss_and_error(
195 |             logits, label, label_smoothing=self.label_smoothing)
196 |         AdvImageNetModel.compute_attack_success(logits, target_label)
197 |         if not self.training:
198 |             return
199 | 
200 |         wd_loss = regularize_cost(self.weight_decay_pattern,
201 |                                   tf.contrib.layers.l2_regularizer(self.weight_decay),
202 |                                   name='l2_regularize_loss')
203 |         add_moving_summary(loss, wd_loss)
204 |         total_cost = tf.add_n([loss, wd_loss], name='cost')
205 | 
206 |         if self.loss_scale != 1.:
207 |             logger.info("Scaling the total loss by {} ...".format(self.loss_scale))
208 |             return total_cost * self.loss_scale
209 |         else:
210 |             return total_cost
211 | 
212 |     def get_inference_func(self, attacker):
213 |         """
214 |         Returns a tower function to be used for inference. It generates adv
215 |         images with the given attacker and runs classification on them.
216 | """ 217 | 218 | def tower_func(image, label): 219 | assert not self.training 220 | image = self.image_preprocess(image) 221 | image = tf.transpose(image, [0, 3, 1, 2]) 222 | image, target_label = attacker.attack(image, label, self.get_logits) 223 | logits = self.get_logits(image) 224 | ImageNetModel.compute_loss_and_error(logits, label) # compute top-1 and top-5 225 | AdvImageNetModel.compute_attack_success(logits, target_label) 226 | 227 | return TowerFunc(tower_func, self.get_input_signature()) 228 | 229 | def image_preprocess(self, image): 230 | with tf.name_scope('image_preprocess'): 231 | if image.dtype.base_dtype != tf.float32: 232 | image = tf.cast(image, tf.float32) 233 | # For the purpose of adversarial training, normalize images to [-1, 1] 234 | image = image * IMAGE_SCALE - 1.0 235 | return image 236 | 237 | @staticmethod 238 | def compute_attack_success(logits, target_label): 239 | """ 240 | Compute the attack success rate. 241 | """ 242 | pred = tf.argmax(logits, axis=1, output_type=tf.int32) 243 | equal_target = tf.equal(pred, target_label) 244 | success = tf.cast(equal_target, tf.float32, name='attack_success') 245 | add_moving_summary(tf.reduce_mean(success, name='attack_success_rate')) 246 | -------------------------------------------------------------------------------- /inference-example.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # Copyright (c) Facebook, Inc. and its affiliates. 4 | # All rights reserved. 5 | # 6 | # This source code is licensed under the license found in the 7 | # LICENSE file in the root directory of this source tree. 8 | 9 | import argparse 10 | import numpy as np 11 | import cv2 12 | import tensorflow as tf 13 | 14 | from tensorpack import TowerContext 15 | from tensorpack.tfutils import get_model_loader 16 | from tensorpack.dataflow.dataset import ILSVRCMeta 17 | 18 | import nets 19 | 20 | """ 21 | A small inference example for attackers to play with. 22 | """ 23 | 24 | 25 | parser = argparse.ArgumentParser() 26 | parser.add_argument('-d', '--depth', help='ResNet depth', 27 | type=int, default=152, choices=[50, 101, 152]) 28 | parser.add_argument('--arch', help='Name of architectures defined in nets.py', 29 | default='ResNetDenoise') 30 | parser.add_argument('--load', help='path to checkpoint') 31 | parser.add_argument('--input', help='path to input image') 32 | args = parser.parse_args() 33 | 34 | model = getattr(nets, args.arch + 'Model')(args) 35 | 36 | input = tf.placeholder(tf.float32, shape=(None, 224, 224, 3)) 37 | image = input / 127.5 - 1.0 38 | image = tf.transpose(image, [0, 3, 1, 2]) 39 | with TowerContext('', is_training=False): 40 | logits = model.get_logits(image) 41 | 42 | sess = tf.Session() 43 | get_model_loader(args.load).init(sess) 44 | 45 | sample = cv2.imread(args.input) # this is a BGR image, not RGB 46 | # imagenet evaluation uses standard imagenet pre-processing 47 | # (resize shortest edge to 256 + center crop 224). 48 | # However, for images of unknown sources, let's just do a naive resize. 
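# A hedged sketch of that standard pre-processing, in case your inputs do come from
# ImageNet-like data (kept commented out; this snippet is ours, not the original script's):
#   h, w = sample.shape[:2]
#   scale = 256.0 / min(h, w)
#   sample = cv2.resize(sample, (int(round(w * scale)), int(round(h * scale))))
#   h, w = sample.shape[:2]
#   y0, x0 = (h - 224) // 2, (w - 224) // 2
#   sample = sample[y0:y0 + 224, x0:x0 + 224]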
49 | sample = cv2.resize(sample, (224, 224)) 50 | 51 | prob = sess.run(logits, feed_dict={input: np.array([sample])}) 52 | print("Prediction: ", prob.argmax()) 53 | 54 | synset = ILSVRCMeta().get_synset_words_1000() 55 | print("Top 5: ", [synset[k] for k in prob[0].argsort()[-5:][::-1]]) 56 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # Copyright (c) Facebook, Inc. and its affiliates. 4 | # All rights reserved. 5 | # 6 | # This source code is licensed under the license found in the 7 | # LICENSE file in the root directory of this source tree. 8 | 9 | 10 | import argparse 11 | import cv2 12 | import glob 13 | import numpy as np 14 | import os 15 | import socket 16 | import sys 17 | 18 | import horovod.tensorflow as hvd 19 | 20 | from tensorpack import * 21 | from tensorpack.tfutils import SmartInit 22 | 23 | import nets 24 | from adv_model import NoOpAttacker, PGDAttacker 25 | from third_party.imagenet_utils import get_val_dataflow, eval_on_ILSVRC12 26 | from third_party.utils import HorovodClassificationError 27 | 28 | 29 | def create_eval_callback(name, tower_func, condition): 30 | """ 31 | Create a distributed evaluation callback. 32 | 33 | Args: 34 | name (str): a prefix 35 | tower_func (TowerFunc): the inference tower function 36 | condition: a function(epoch number) that returns whether this epoch should evaluate or not 37 | """ 38 | dataflow = get_val_dataflow( 39 | args.data, args.batch, 40 | num_splits=hvd.size(), split_index=hvd.rank()) 41 | # We eval both the classification error rate (for comparison with defenders) 42 | # and the attack success rate (for comparison with attackers). 43 | infs = [HorovodClassificationError('wrong-top1', '{}-top1-error'.format(name)), 44 | HorovodClassificationError('wrong-top5', '{}-top5-error'.format(name)), 45 | HorovodClassificationError('attack_success', '{}-attack-success-rate'.format(name)) 46 | ] 47 | cb = InferenceRunner( 48 | QueueInput(dataflow), infs, 49 | tower_name=name, 50 | tower_func=tower_func).set_chief_only(False) 51 | cb = EnableCallbackIf( 52 | cb, lambda self: condition(self.epoch_num)) 53 | return cb 54 | 55 | 56 | def do_train(model): 57 | batch = args.batch 58 | total_batch = batch * hvd.size() 59 | 60 | if args.fake: 61 | data = FakeData( 62 | [[batch, 224, 224, 3], [batch]], 1000, 63 | random=False, dtype=['uint8', 'int32']) 64 | data = StagingInput(QueueInput(data)) 65 | callbacks = [] 66 | steps_per_epoch = 50 67 | else: 68 | logger.info("#Tower: {}; Batch size per tower: {}".format(hvd.size(), batch)) 69 | zmq_addr = 'ipc://@imagenet-train-b{}'.format(batch) 70 | if args.no_zmq_ops: 71 | dataflow = RemoteDataZMQ(zmq_addr, hwm=150, bind=False) 72 | data = QueueInput(dataflow) 73 | else: 74 | data = ZMQInput(zmq_addr, 30, bind=False) 75 | data = StagingInput(data) 76 | 77 | steps_per_epoch = int(np.round(1281167 / total_batch)) 78 | 79 | BASE_LR = 0.1 * (total_batch // 256) 80 | """ 81 | ImageNet in 1 Hour, Sec 2.1: 82 | Linear Scaling Rule: When the minibatch size is 83 | multiplied by k, multiply the learning rate by k. 
84 | """ 85 | logger.info("Base LR: {}".format(BASE_LR)) 86 | callbacks = [ 87 | ModelSaver(max_to_keep=10), 88 | EstimatedTimeLeft(), 89 | ScheduledHyperParamSetter( 90 | 'learning_rate', [(0, BASE_LR), (35, BASE_LR * 1e-1), (70, BASE_LR * 1e-2), 91 | (95, BASE_LR * 1e-3)]) 92 | ] 93 | """ 94 | Feature Denoising, Sec 5: 95 | Our models are trained for a total of 96 | 110 epochs; we decrease the learning rate by 10× at the 35- 97 | th, 70-th, and 95-th epoch 98 | """ 99 | max_epoch = 110 100 | 101 | if BASE_LR > 0.1: 102 | callbacks.append( 103 | ScheduledHyperParamSetter( 104 | 'learning_rate', [(0, 0.1), (5 * steps_per_epoch, BASE_LR)], 105 | interp='linear', step_based=True)) 106 | """ 107 | ImageNet in 1 Hour, Sec 2.2: 108 | we start from a learning rate of η and increment it by a constant amount at 109 | each iteration such that it reaches ηˆ = kη after 5 epochs 110 | """ 111 | 112 | if not args.fake: 113 | # add distributed evaluation, for various attackers that we care. 114 | def add_eval_callback(name, attacker, condition): 115 | cb = create_eval_callback( 116 | name, 117 | model.get_inference_func(attacker), 118 | # always eval in the last 2 epochs no matter what 119 | lambda epoch_num: condition(epoch_num) or epoch_num > max_epoch - 2) 120 | callbacks.append(cb) 121 | 122 | add_eval_callback('eval-clean', NoOpAttacker(), lambda e: True) 123 | add_eval_callback('eval-10step', PGDAttacker(10, args.attack_epsilon, args.attack_step_size), 124 | lambda e: True) 125 | add_eval_callback('eval-50step', PGDAttacker(50, args.attack_epsilon, args.attack_step_size), 126 | lambda e: e % 20 == 0) 127 | add_eval_callback('eval-100step', PGDAttacker(100, args.attack_epsilon, args.attack_step_size), 128 | lambda e: e % 10 == 0 or e > max_epoch - 5) 129 | for k in [20, 30, 40, 60, 70, 80, 90]: 130 | add_eval_callback('eval-{}step'.format(k), 131 | PGDAttacker(k, args.attack_epsilon, args.attack_step_size), 132 | lambda e: False) 133 | 134 | trainer = HorovodTrainer(average=True) 135 | trainer.setup_graph(model.get_input_signature(), data, model.build_graph, model.get_optimizer) 136 | trainer.train_with_defaults( 137 | callbacks=callbacks, 138 | steps_per_epoch=steps_per_epoch, 139 | session_init=SmartInit(args.load), 140 | max_epoch=max_epoch, 141 | starting_epoch=args.starting_epoch) 142 | 143 | 144 | if __name__ == '__main__': 145 | parser = argparse.ArgumentParser() 146 | parser.add_argument('--load', help='Path to a model to load for evaluation or resuming training.') 147 | parser.add_argument('--starting-epoch', help='The epoch to start with. 
Useful when resuming training.', 148 | type=int, default=1) 149 | parser.add_argument('--logdir', help='Directory suffix for models and training stats.') 150 | parser.add_argument('--eval', action='store_true', help='Evaluate a model on ImageNet instead of training.') 151 | 152 | # run on a directory of images: 153 | parser.add_argument('--eval-directory', help='Path to a directory of images to classify.') 154 | parser.add_argument('--prediction-file', help='Path to a txt file to write predictions.', default='predictions.txt') 155 | 156 | parser.add_argument('--data', help='ILSVRC dataset dir') 157 | parser.add_argument('--fake', help='Use fakedata to test or benchmark this model', action='store_true') 158 | parser.add_argument('--no-zmq-ops', help='Use pure python to send/receive data', 159 | action='store_true') 160 | parser.add_argument('--batch', help='Per-GPU batch size', default=32, type=int) 161 | 162 | # attacker flags: 163 | parser.add_argument('--attack-iter', help='Adversarial attack iteration', 164 | type=int, default=30) 165 | parser.add_argument('--attack-epsilon', help='Adversarial attack maximal perturbation', 166 | type=float, default=16.0) 167 | parser.add_argument('--attack-step-size', help='Adversarial attack step size', 168 | type=float, default=1.0) 169 | parser.add_argument('--use-fp16xla', 170 | help='Optimize PGD with fp16+XLA in training or evaluation. ' 171 | '(Evaluation during training will still use FP32, for fair comparison)', 172 | action='store_true') 173 | 174 | # architecture flags: 175 | parser.add_argument('-d', '--depth', help='ResNet depth', 176 | type=int, default=50, choices=[50, 101, 152]) 177 | parser.add_argument('--arch', help='Name of architectures defined in nets.py', 178 | default='ResNet') 179 | args = parser.parse_args() 180 | 181 | # Define model 182 | model = getattr(nets, args.arch + 'Model')(args) 183 | 184 | # Define attacker 185 | if args.attack_iter == 0 or args.eval_directory: 186 | attacker = NoOpAttacker() 187 | else: 188 | attacker = PGDAttacker( 189 | args.attack_iter, args.attack_epsilon, args.attack_step_size, 190 | prob_start_from_clean=0.2 if not args.eval else 0.0) 191 | if args.use_fp16xla: 192 | attacker.USE_FP16 = True 193 | attacker.USE_XLA = True 194 | model.set_attacker(attacker) 195 | 196 | os.system("nvidia-smi") 197 | hvd.init() 198 | 199 | if args.eval: 200 | sessinit = SmartInit(args.load) 201 | if hvd.size() == 1: 202 | # single-GPU eval, slow 203 | ds = get_val_dataflow(args.data, args.batch) 204 | eval_on_ILSVRC12(model, sessinit, ds) 205 | else: 206 | logger.info("CMD: " + " ".join(sys.argv)) 207 | cb = create_eval_callback( 208 | "eval", 209 | model.get_inference_func(attacker), 210 | lambda e: True) 211 | trainer = HorovodTrainer() 212 | trainer.setup_graph(model.get_input_signature(), PlaceholderInput(), model.build_graph, model.get_optimizer) 213 | # train for an empty epoch, to reuse the distributed evaluation code 214 | trainer.train_with_defaults( 215 | callbacks=[cb], 216 | monitors=[ScalarPrinter()] if hvd.rank() == 0 else [], 217 | session_init=sessinit, 218 | steps_per_epoch=0, max_epoch=1) 219 | elif args.eval_directory: 220 | assert hvd.size() == 1 221 | files = glob.glob(os.path.join(args.eval_directory, '*.*')) 222 | ds = ImageFromFile(files) 223 | # Our model expects BGR images instead of RGB. 224 | # Also do a naive resize to 224. 
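# (Added note, not in the original: INTER_CUBIC below differs from inference-example.py,
# which relies on cv2.resize's default INTER_LINEAR, so the two scripts may give slightly
# different predictions for the same image.)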
225 | ds = MapData( 226 | ds, 227 | lambda dp: [cv2.resize(dp[0][:, :, ::-1], (224, 224), interpolation=cv2.INTER_CUBIC)]) 228 | ds = BatchData(ds, args.batch, remainder=True) 229 | 230 | pred_config = PredictConfig( 231 | model=model, 232 | session_init=SmartInit(args.load), 233 | input_names=['input'], 234 | output_names=['linear/output'] # the logits 235 | ) 236 | predictor = SimpleDatasetPredictor(pred_config, ds) 237 | 238 | logger.info("Running inference on {} images in {}".format(len(files), args.eval_directory)) 239 | results = [] 240 | for logits, in predictor.get_result(): 241 | predictions = list(np.argmax(logits, axis=1)) 242 | results.extend(predictions) 243 | assert len(results) == len(files) 244 | with open(args.prediction_file, "w") as f: 245 | for filename, label in zip(files, results): 246 | f.write("{},{}\n".format(filename, label)) 247 | logger.info("Outputs saved to " + args.prediction_file) 248 | else: 249 | logger.info("Training on {}".format(socket.gethostname())) 250 | logdir = os.path.join( 251 | 'train_log', 252 | 'PGD-{}{}-Batch{}-{}GPUs-iter{}-epsilon{}-step{}{}'.format( 253 | args.arch, args.depth, args.batch, hvd.size(), 254 | args.attack_iter, args.attack_epsilon, args.attack_step_size, 255 | '-' + args.logdir if args.logdir else '')) 256 | 257 | if hvd.rank() == 0: 258 | # old log directory will be automatically removed. 259 | logger.set_logger_dir(logdir, 'd') 260 | logger.info("CMD: " + " ".join(sys.argv)) 261 | logger.info("Rank={}, Local Rank={}, Size={}".format(hvd.rank(), hvd.local_rank(), hvd.size())) 262 | 263 | do_train(model) 264 | -------------------------------------------------------------------------------- /nets.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | from adv_model import AdvImageNetModel 8 | from resnet_model import ( 9 | resnet_group, resnet_bottleneck, resnet_backbone) 10 | from resnet_model import denoising 11 | 12 | 13 | NUM_BLOCKS = { 14 | 50: [3, 4, 6, 3], 15 | 101: [3, 4, 23, 3], 16 | 152: [3, 8, 36, 3] 17 | } 18 | 19 | 20 | class ResNetModel(AdvImageNetModel): 21 | def __init__(self, args): 22 | self.num_blocks = NUM_BLOCKS[args.depth] 23 | 24 | def get_logits(self, image): 25 | return resnet_backbone(image, self.num_blocks, resnet_group, resnet_bottleneck) 26 | 27 | 28 | class ResNetDenoiseModel(AdvImageNetModel): 29 | def __init__(self, args): 30 | self.num_blocks = NUM_BLOCKS[args.depth] 31 | 32 | def get_logits(self, image): 33 | 34 | def group_func(name, *args): 35 | """ 36 | Feature Denoising, Sec 6: 37 | we add 4 denoising blocks to a ResNet: each is added after the 38 | last residual block of res2, res3, res4, and res5, respectively. 39 | """ 40 | l = resnet_group(name, *args) 41 | l = denoising(name + '_denoise', l, embed=True, softmax=True) 42 | return l 43 | 44 | return resnet_backbone(image, self.num_blocks, group_func, resnet_bottleneck) 45 | 46 | 47 | class ResNeXtDenoiseAllModel(AdvImageNetModel): 48 | """ 49 | ResNeXt 32x8d that performs denoising after every residual block. 
50 | """ 51 | def __init__(self, args): 52 | self.num_blocks = NUM_BLOCKS[args.depth] 53 | 54 | def get_logits(self, image): 55 | 56 | def block_func(l, ch_out, stride): 57 | """ 58 | Feature Denoising, Sec 6.2: 59 | The winning entry, shown in the blue bar, was based on our method by using 60 | a ResNeXt101-32×8 backbone 61 | with non-local denoising blocks added to all residual blocks. 62 | """ 63 | l = resnet_bottleneck(l, ch_out, stride, group=32, res2_bottleneck=8) 64 | l = denoising('non_local', l, embed=False, softmax=False) 65 | return l 66 | 67 | return resnet_backbone(image, self.num_blocks, resnet_group, block_func) 68 | -------------------------------------------------------------------------------- /resnet_model.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | import tensorflow as tf 8 | from tensorpack.tfutils.argscope import argscope 9 | from tensorpack.models import ( 10 | Conv2D, MaxPooling, AvgPooling, GlobalAvgPooling, BatchNorm, FullyConnected, BNReLU) 11 | 12 | 13 | def resnet_shortcut(l, n_out, stride, activation=tf.identity): 14 | n_in = l.get_shape().as_list()[1] 15 | if n_in != n_out: # change dimension when channel is not the same 16 | return Conv2D('convshortcut', l, n_out, 1, strides=stride, activation=activation) 17 | else: 18 | return l 19 | 20 | 21 | def get_bn(zero_init=False): 22 | if zero_init: 23 | return lambda x, name=None: BatchNorm('bn', x, gamma_initializer=tf.zeros_initializer()) 24 | else: 25 | return lambda x, name=None: BatchNorm('bn', x) 26 | 27 | 28 | def resnet_bottleneck(l, ch_out, stride, group=1, res2_bottleneck=64): 29 | """ 30 | Args: 31 | group (int): the number of groups for resnext 32 | res2_bottleneck (int): the number of channels in res2 bottleneck. 33 | The default corresponds to ResNeXt 1x64d, i.e. vanilla ResNet. 
34 | """ 35 | ch_factor = res2_bottleneck * group // 64 36 | shortcut = l 37 | l = Conv2D('conv1', l, ch_out * ch_factor, 1, strides=1, activation=BNReLU) 38 | l = Conv2D('conv2', l, ch_out * ch_factor, 3, strides=stride, activation=BNReLU, split=group) 39 | """ 40 | ImageNet in 1 Hour, Sec 5.1: 41 | the stride-2 convolutions are on 3×3 layers instead of on 1×1 layers 42 | """ 43 | l = Conv2D('conv3', l, ch_out * 4, 1, activation=get_bn(zero_init=True)) 44 | """ 45 | ImageNet in 1 Hour, Sec 5.1: each residual block's last BN where γ is initialized to be 0 46 | """ 47 | ret = l + resnet_shortcut(shortcut, ch_out * 4, stride, activation=get_bn(zero_init=False)) 48 | return tf.nn.relu(ret, name='block_output') 49 | 50 | 51 | def resnet_group(name, l, block_func, features, count, stride): 52 | with tf.variable_scope(name): 53 | for i in range(0, count): 54 | with tf.variable_scope('block{}'.format(i)): 55 | current_stride = stride if i == 0 else 1 56 | l = block_func(l, features, current_stride) 57 | return l 58 | 59 | 60 | def resnet_backbone(image, num_blocks, group_func, block_func): 61 | with argscope([Conv2D, MaxPooling, AvgPooling, GlobalAvgPooling, BatchNorm], data_format='NCHW'), \ 62 | argscope(Conv2D, use_bias=False, 63 | kernel_initializer=tf.variance_scaling_initializer(scale=2.0, mode='fan_out')): 64 | l = Conv2D('conv0', image, 64, 7, strides=2, activation=BNReLU) 65 | l = MaxPooling('pool0', l, pool_size=3, strides=2, padding='SAME') 66 | l = group_func('group0', l, block_func, 64, num_blocks[0], 1) 67 | l = group_func('group1', l, block_func, 128, num_blocks[1], 2) 68 | l = group_func('group2', l, block_func, 256, num_blocks[2], 2) 69 | l = group_func('group3', l, block_func, 512, num_blocks[3], 2) 70 | l = GlobalAvgPooling('gap', l) 71 | logits = FullyConnected('linear', l, 1000, 72 | kernel_initializer=tf.random_normal_initializer(stddev=0.01)) 73 | """ 74 | ImageNet in 1 Hour, Sec 5.1: 75 | The 1000-way fully-connected layer is initialized by 76 | drawing weights from a zero-mean Gaussian with standard deviation of 0.01 77 | """ 78 | return logits 79 | 80 | 81 | def denoising(name, l, embed=True, softmax=True): 82 | """ 83 | Feature Denoising, Fig 4 & 5. 84 | """ 85 | with tf.variable_scope(name): 86 | f = non_local_op(l, embed=embed, softmax=softmax) 87 | f = Conv2D('conv', f, l.shape[1], 1, strides=1, activation=get_bn(zero_init=True)) 88 | l = l + f 89 | return l 90 | 91 | 92 | def non_local_op(l, embed, softmax): 93 | """ 94 | Feature Denoising, Sec 4.2 & Fig 5. 95 | Args: 96 | embed (bool): whether to use embedding on theta & phi 97 | softmax (bool): whether to use gaussian (softmax) version or the dot-product version. 
98 | """ 99 | n_in, H, W = l.shape.as_list()[1:] 100 | if embed: 101 | theta = Conv2D('embedding_theta', l, n_in / 2, 1, 102 | strides=1, kernel_initializer=tf.random_normal_initializer(stddev=0.01)) 103 | phi = Conv2D('embedding_phi', l, n_in / 2, 1, 104 | strides=1, kernel_initializer=tf.random_normal_initializer(stddev=0.01)) 105 | g = l 106 | else: 107 | theta, phi, g = l, l, l 108 | if n_in > H * W or softmax: 109 | f = tf.einsum('niab,nicd->nabcd', theta, phi) 110 | if softmax: 111 | orig_shape = tf.shape(f) 112 | f = tf.reshape(f, [-1, H * W, H * W]) 113 | f = f / tf.sqrt(tf.cast(theta.shape[1], theta.dtype)) 114 | f = tf.nn.softmax(f) 115 | f = tf.reshape(f, orig_shape) 116 | f = tf.einsum('nabcd,nicd->niab', f, g) 117 | else: 118 | f = tf.einsum('nihw,njhw->nij', phi, g) 119 | f = tf.einsum('nij,nihw->njhw', f, theta) 120 | if not softmax: 121 | f = f / tf.cast(H * W, f.dtype) 122 | return tf.reshape(f, tf.shape(l)) 123 | -------------------------------------------------------------------------------- /slurm/eval.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | # Copyright (c) Facebook, Inc. and its affiliates. 3 | # All rights reserved. 4 | # 5 | # This source code is licensed under the license found in the 6 | # LICENSE file in the root directory of this source tree. 7 | #SBATCH --output=logs/job-%j.%N.out 8 | #SBATCH --error=logs/job-%j.%N.err 9 | #SBATCH --ntasks-per-node=8 # 8 tasks per node 10 | #SBATCH --gres=gpu:8 # 8 GPUs per node 11 | #SBATCH --cpus-per-task=10 # 80/8 cpus per task 12 | #SBATCH --mem=200G # ask for 200G 13 | 14 | # To run on 4 nodes x 8 GPUs: use "mkdir -p logs && sbatch --nodes=4 slurm.script" 15 | 16 | echo "NNODES: $SLURM_NNODES" 17 | echo "JOBID: $SLURM_JOB_ID" 18 | env | grep PATH 19 | 20 | export TENSORPACK_PROGRESS_REFRESH=20 21 | export TENSORPACK_SERIALIZE=msgpack 22 | 23 | DATA_PATH=~/data/imagenet 24 | BATCH=32 25 | CONFIG=$1 26 | 27 | # launch eval 28 | # https://www.open-mpi.org/faq/?category=openfabrics#ib-router has document on IB options 29 | # the queue parameters sometimes can hang the communication (for some MPI versions and some operations) 30 | mpirun -output-filename logs/eval-$SLURM_JOB_ID.log -tag-output \ 31 | -bind-to none -map-by slot \ 32 | -mca pml ob1 -mca btl_openib_receive_queues P,128,32:P,2048,32:P,12288,32:P,65536,32 \ 33 | -x NCCL_IB_CUDA_SUPPORT=1 -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO \ 34 | python ./main.py --eval --data $DATA_PATH --batch $BATCH $CONFIG 35 | -------------------------------------------------------------------------------- /slurm/train.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | # Copyright (c) Facebook, Inc. and its affiliates. 3 | # All rights reserved. 4 | # 5 | # This source code is licensed under the license found in the 6 | # LICENSE file in the root directory of this source tree. 
7 | #SBATCH --output=logs/job-%j.%N.out
8 | #SBATCH --error=logs/job-%j.%N.err
9 | #SBATCH --ntasks-per-node=8    # 8 tasks per node
10 | #SBATCH --gres=gpu:8           # 8 GPUs per node
11 | #SBATCH --cpus-per-task=10     # 80/8 cpus per task
12 | #SBATCH --mem=200G
13 | 
14 | # To run on 4 nodes x 8 GPUs: use "mkdir -p logs && sbatch --nodes=4 slurm/train.sh"
15 | 
16 | echo "NNODES: $SLURM_NNODES"
17 | echo "JOBID: $SLURM_JOB_ID"
18 | env | grep PATH
19 | 
20 | export TENSORPACK_PROGRESS_REFRESH=20
21 | export TENSORPACK_SERIALIZE=msgpack
22 | 
23 | DATA_PATH=~/data/imagenet
24 | BATCH=32
25 | CONFIG=$1
26 | 
27 | # launch the data-serving processes (one per node)
28 | srun --output=logs/data-%J.%N.log \
29 |     --error=logs/data-%J.%N.err \
30 |     --gres=gpu:0 --cpus-per-task=60 --mincpus 60 \
31 |     --ntasks=$SLURM_NNODES --ntasks-per-node=1 \
32 |     python ./third_party/serve-data.py --data $DATA_PATH --batch $BATCH &
33 | DATA_PID=$!
34 | 
35 | # launch training
36 | # https://www.open-mpi.org/faq/?category=openfabrics#ib-router documents the InfiniBand options below
37 | # these queue parameters can sometimes hang the communication (with some MPI versions and some operations)
38 | #-mca btl tcp,self \
39 | mpirun -output-filename logs/train-$SLURM_JOB_ID.log -tag-output \
40 |     -bind-to none -map-by slot \
41 |     -mca pml ob1 -mca btl_openib_receive_queues P,128,32:P,2048,32:P,12288,32:P,65536,32 \
42 |     -x NCCL_IB_CUDA_SUPPORT=1 -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO \
43 |     python ./main.py --data $DATA_PATH --batch $BATCH $CONFIG &
44 | MPI_PID=$!
45 | 
46 | wait $MPI_PID
47 | 
--------------------------------------------------------------------------------
/teaser.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/ImageNet-Adversarial-Training/1ad68f08b8533083b0b8823ac3fd85cede191646/teaser.jpg
--------------------------------------------------------------------------------
/third_party/README.md:
--------------------------------------------------------------------------------
1 | 
2 | Utilities for ImageNet training & distributed evaluation.
3 | 
4 | Copied from https://github.com/tensorpack/benchmarks/tree/master/ResNet-Horovod.
--------------------------------------------------------------------------------
/third_party/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/ImageNet-Adversarial-Training/1ad68f08b8533083b0b8823ac3fd85cede191646/third_party/__init__.py
--------------------------------------------------------------------------------
/third_party/imagenet_utils.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # File: imagenet_utils.py
4 | 
5 | 
6 | import multiprocessing
7 | import numpy as np
8 | from abc import abstractmethod
9 | 
10 | import cv2
11 | import tensorflow as tf
12 | 
13 | from tensorpack import imgaug, dataset, ModelDesc
14 | from tensorpack.dataflow import (
15 |     BatchData, MultiThreadMapData, DataFromList)
16 | from tensorpack.predict import PredictConfig, SimpleDatasetPredictor
17 | from tensorpack.utils.stats import RatioCounter
18 | from tensorpack.models import regularize_cost
19 | from tensorpack.tfutils.summary import add_moving_summary
20 | from tensorpack.utils import logger
21 | 
22 | 
23 | def fbresnet_augmentor(isTrain):
24 |     """
25 |     Augmentor used in fb.resnet.torch, for BGR images in range [0,255].
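    A minimal usage sketch (the input is a BGR uint8 image, e.g. from
    cv2.imread; 'example.jpg' is a placeholder path):

        augs = imgaug.AugmentorList(fbresnet_augmentor(isTrain=False))
        out = augs.augment(cv2.imread('example.jpg', cv2.IMREAD_COLOR))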
26 | """ 27 | if isTrain: 28 | augmentors = [ 29 | imgaug.GoogleNetRandomCropAndResize(interp=cv2.INTER_CUBIC), 30 | # It's OK to remove the following augs if your CPU is not fast enough. 31 | # Removing brightness/contrast/saturation does not have a significant effect on accuracy. 32 | # Removing lighting leads to a tiny drop in accuracy. 33 | imgaug.RandomOrderAug( 34 | [imgaug.BrightnessScale((0.6, 1.4), clip=False), 35 | imgaug.Contrast((0.6, 1.4), clip=False), 36 | imgaug.Saturation(0.4, rgb=False), 37 | # rgb-bgr conversion for the constants copied from fb.resnet.torch 38 | imgaug.Lighting(0.1, 39 | eigval=np.asarray( 40 | [0.2175, 0.0188, 0.0045][::-1]) * 255.0, 41 | eigvec=np.array( 42 | [[-0.5675, 0.7192, 0.4009], 43 | [-0.5808, -0.0045, -0.8140], 44 | [-0.5836, -0.6948, 0.4203]], 45 | dtype='float32')[::-1, ::-1] 46 | )]), 47 | imgaug.Flip(horiz=True), 48 | ] 49 | else: 50 | augmentors = [ 51 | imgaug.ResizeShortestEdge(256, cv2.INTER_CUBIC), 52 | imgaug.CenterCrop((224, 224)), 53 | ] 54 | return augmentors 55 | 56 | 57 | def get_val_dataflow( 58 | datadir, batch_size, 59 | augmentors=None, parallel=None, 60 | num_splits=None, split_index=None): 61 | if augmentors is None: 62 | augmentors = fbresnet_augmentor(False) 63 | assert datadir is not None 64 | assert isinstance(augmentors, list) 65 | if parallel is None: 66 | parallel = min(40, multiprocessing.cpu_count()) 67 | 68 | if num_splits is None: 69 | ds = dataset.ILSVRC12Files(datadir, 'val', shuffle=False) 70 | else: 71 | # shard validation data 72 | assert split_index < num_splits 73 | files = dataset.ILSVRC12Files(datadir, 'val', shuffle=False) 74 | files.reset_state() 75 | files = list(files.get_data()) 76 | logger.info("Number of validation data = {}".format(len(files))) 77 | split_size = len(files) // num_splits 78 | start, end = split_size * split_index, split_size * (split_index + 1) 79 | end = min(end, len(files)) 80 | logger.info("Local validation split = {} - {}".format(start, end)) 81 | files = files[start: end] 82 | ds = DataFromList(files, shuffle=False) 83 | aug = imgaug.AugmentorList(augmentors) 84 | 85 | def mapf(dp): 86 | fname, cls = dp 87 | im = cv2.imread(fname, cv2.IMREAD_COLOR) 88 | im = aug.augment(im) 89 | return im, cls 90 | ds = MultiThreadMapData(ds, parallel, mapf, 91 | buffer_size=min(2000, ds.size()), strict=True) 92 | ds = BatchData(ds, batch_size, remainder=True) 93 | # do not fork() under MPI 94 | return ds 95 | 96 | 97 | def eval_on_ILSVRC12(model, sessinit, dataflow): 98 | pred_config = PredictConfig( 99 | model=model, 100 | session_init=sessinit, 101 | input_names=['input', 'label'], 102 | output_names=['wrong-top1', 'wrong-top5', 'attack_success'] 103 | ) 104 | pred = SimpleDatasetPredictor(pred_config, dataflow) 105 | acc1, acc5, succ = RatioCounter(), RatioCounter(), RatioCounter() 106 | for top1, top5, num_succ in pred.get_result(): 107 | batch_size = top1.shape[0] 108 | acc1.feed(top1.sum(), batch_size) 109 | acc5.feed(top5.sum(), batch_size) 110 | succ.feed(num_succ.sum(), batch_size) 111 | # Uncomment to monitor the metrics during evaluation 112 | # print("Top1 Error: {}".format(acc1.ratio)) 113 | # print("Attack Success Rate: {}".format(succ.ratio)) 114 | print("Top1 Error: {}".format(acc1.ratio)) 115 | print("Attack Success Rate: {}".format(succ.ratio)) 116 | print("Top5 Error: {}".format(acc5.ratio)) 117 | 118 | 119 | class ImageNetModel(ModelDesc): 120 | image_shape = 224 121 | 122 | """ 123 | uint8 instead of float32 is used as input type to reduce copy overhead. 
124 |     Using uint8 might hurt the performance a little bit.
125 |     """
126 |     image_dtype = tf.uint8
127 | 
128 |     """
129 |     Either 'NCHW' or 'NHWC'
130 |     """
131 |     data_format = 'NCHW'
132 | 
133 |     """
134 |     Whether the image is BGR or RGB. If using DataFlow, then it should be BGR.
135 |     """
136 |     image_bgr = True
137 | 
138 |     weight_decay = 1e-4
139 | 
140 |     """
141 |     To also apply weight decay to normalization parameters, use '.*/W|.*/gamma|.*/beta'
142 |     """
143 |     weight_decay_pattern = '.*/W'
144 | 
145 |     """
146 |     Scale the loss, e.g., for gradient averaging or fp16 training
147 |     """
148 |     loss_scale = 1.
149 | 
150 |     """
151 |     Label smoothing (see tf.losses.softmax_cross_entropy)
152 |     """
153 |     label_smoothing = 0.
154 | 
155 |     def inputs(self):
156 |         return [tf.placeholder(self.image_dtype, [None, self.image_shape, self.image_shape, 3], 'input'),
157 |                 tf.placeholder(tf.int32, [None], 'label')]
158 | 
159 |     def build_graph(self, image, label):
160 |         image = self.image_preprocess(image)
161 |         assert self.data_format == 'NCHW'
162 |         image = tf.transpose(image, [0, 3, 1, 2])
163 | 
164 |         logits = self.get_logits(image)
165 |         loss = ImageNetModel.compute_loss_and_error(
166 |             logits, label, label_smoothing=self.label_smoothing)
167 | 
168 |         if self.weight_decay > 0:
169 |             wd_loss = regularize_cost(self.weight_decay_pattern,
170 |                                       tf.contrib.layers.l2_regularizer(self.weight_decay),
171 |                                       name='l2_regularize_loss')
172 |             add_moving_summary(loss, wd_loss)
173 |             total_cost = tf.add_n([loss, wd_loss], name='cost')
174 |         else:
175 |             total_cost = tf.identity(loss, name='cost')
176 |             add_moving_summary(total_cost)
177 | 
178 |         if self.loss_scale != 1.:
179 |             logger.info("Scaling the total loss by {} ...".format(self.loss_scale))
180 |             return total_cost * self.loss_scale
181 |         else:
182 |             return total_cost
183 | 
184 |     @abstractmethod
185 |     def get_logits(self, image):
186 |         """
187 |         Args:
188 |             image: 4D tensor of ``self.image_shape`` in ``self.data_format``
189 | 
190 |         Returns:
191 |             Nx#class logits
192 |         """
193 | 
194 |     def optimizer(self):
195 |         lr = tf.get_variable('learning_rate', initializer=0.1, trainable=False)
196 |         tf.summary.scalar('learning_rate-summary', lr)
197 |         return tf.train.MomentumOptimizer(lr, 0.9, use_nesterov=True)
198 | 
199 |     def image_preprocess(self, image):
200 |         with tf.name_scope('image_preprocess'):
201 |             if image.dtype.base_dtype != tf.float32:
202 |                 image = tf.cast(image, tf.float32)
203 |             mean = [0.485, 0.456, 0.406]  # rgb
204 |             std = [0.229, 0.224, 0.225]
205 |             if self.image_bgr:
206 |                 mean = mean[::-1]
207 |                 std = std[::-1]
208 |             image_mean = tf.constant(mean, dtype=tf.float32) * 255.
209 |             image_std = tf.constant(std, dtype=tf.float32) * 255.
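            # per-channel standardization in the 0-255 range: (x - 255*mean) / (255*std)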
210 |             image = (image - image_mean) / image_std
211 |             return image
212 | 
213 |     @staticmethod
214 |     def compute_loss_and_error(logits, label, label_smoothing=0.):
215 |         if label_smoothing == 0.:
216 |             loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label)
217 |         else:
218 |             nclass = logits.shape[-1]
219 |             loss = tf.losses.softmax_cross_entropy(
220 |                 tf.one_hot(label, nclass),
221 |                 logits, label_smoothing=label_smoothing,
222 |                 reduction=tf.losses.Reduction.NONE)
223 |         loss = tf.reduce_mean(loss, name='xentropy-loss')
224 | 
225 |         def prediction_incorrect(logits, label, topk=1, name='incorrect_vector'):
226 |             with tf.name_scope('prediction_incorrect'):
227 |                 x = tf.logical_not(tf.nn.in_top_k(logits, label, topk))
228 |             return tf.cast(x, tf.float32, name=name)
229 | 
230 |         wrong = prediction_incorrect(logits, label, 1, name='wrong-top1')
231 |         add_moving_summary(tf.reduce_mean(wrong, name='train-error-top1'))
232 | 
233 |         wrong = prediction_incorrect(logits, label, 5, name='wrong-top5')
234 |         add_moving_summary(tf.reduce_mean(wrong, name='train-error-top5'))
235 |         return loss
236 | 
--------------------------------------------------------------------------------
/third_party/serve-data.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # File: serve-data.py
4 | 
5 | import argparse
6 | import os
7 | import multiprocessing as mp
8 | import socket
9 | 
10 | from tensorpack.dataflow import (
11 |     send_dataflow_zmq, MapData, TestDataSpeed, FakeData, dataset,
12 |     AugmentImageComponent, BatchData, PrefetchDataZMQ)
13 | from tensorpack.utils import logger
14 | from imagenet_utils import fbresnet_augmentor
15 | 
16 | 
17 | def get_data(batch, augmentors):
18 |     """
19 |     ImageNet in 1 Hour, Sec 3, Remark 4:
20 |     Use a single random shuffling of the training data (per epoch) that is divided amongst all k workers.
21 | 
22 |     NOTE: Here we do not follow the paper, but it makes little difference.
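    The batched stream built below is pushed over ZMQ in __main__ at the
    bottom of this file (send_dataflow_zmq); the trainer is expected to pull
    from the same ipc://@imagenet-train-b{batch} address.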
23 | """ 24 | ds = dataset.ILSVRC12(args.data, 'train', shuffle=True) 25 | ds = AugmentImageComponent(ds, augmentors, copy=False) 26 | ds = BatchData(ds, batch, remainder=False) 27 | ds = PrefetchDataZMQ(ds, min(50, mp.cpu_count())) 28 | return ds 29 | 30 | 31 | if __name__ == '__main__': 32 | parser = argparse.ArgumentParser() 33 | parser.add_argument('--data', help='ILSVRC dataset dir') 34 | parser.add_argument('--fake', action='store_true') 35 | parser.add_argument('--batch', help='per-GPU batch size', 36 | default=32, type=int) 37 | parser.add_argument('--benchmark', action='store_true') 38 | parser.add_argument('--no-zmq-ops', action='store_true') 39 | args = parser.parse_args() 40 | 41 | os.environ['CUDA_VISIBLE_DEVICES'] = '' 42 | 43 | if args.fake: 44 | ds = FakeData( 45 | [[args.batch, 224, 224, 3], [args.batch]], 46 | 1000, random=False, dtype=['uint8', 'int32']) 47 | else: 48 | augs = fbresnet_augmentor(True) 49 | ds = get_data(args.batch, augs) 50 | 51 | logger.info("Serving data on {}".format(socket.gethostname())) 52 | 53 | if args.benchmark: 54 | from zmq_ops import dump_arrays 55 | ds = MapData(ds, dump_arrays) 56 | TestDataSpeed(ds, warmup=300).start() 57 | else: 58 | format = None if args.no_zmq_ops else 'zmq_ops' 59 | send_dataflow_zmq( 60 | ds, 'ipc://@imagenet-train-b{}'.format(args.batch), 61 | hwm=150, format=format, bind=True) 62 | -------------------------------------------------------------------------------- /third_party/utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import horovod.tensorflow as hvd 3 | import tensorflow as tf 4 | 5 | from tensorpack.callbacks import Inferencer 6 | from tensorpack.utils.stats import RatioCounter 7 | 8 | 9 | class HorovodClassificationError(Inferencer): 10 | """ 11 | Like ClassificationError, it evaluates total samples & count of incorrect or correct samples. 12 | But in the end we aggregate the total&count by horovod. 13 | """ 14 | def __init__(self, wrong_tensor_name, summary_name='validation_error'): 15 | """ 16 | Args: 17 | wrong_tensor_name(str): name of the ``wrong`` binary vector tensor. 18 | summary_name(str): the name to log the error with. 19 | """ 20 | self.wrong_tensor_name = wrong_tensor_name 21 | self.summary_name = summary_name 22 | 23 | def _setup_graph(self): 24 | self._placeholder = tf.placeholder(tf.float32, shape=[2], name='to_be_reduced') 25 | self._reduced = hvd.allreduce(self._placeholder, average=False) 26 | 27 | def _before_inference(self): 28 | self.err_stat = RatioCounter() 29 | 30 | def _get_fetches(self): 31 | return [self.wrong_tensor_name] 32 | 33 | def _on_fetches(self, outputs): 34 | vec = outputs[0] 35 | batch_size = len(vec) 36 | wrong = np.sum(vec) 37 | self.err_stat.feed(wrong, batch_size) 38 | # Uncomment this to monitor the metric during evaluation 39 | # print(self.summary_name, self.err_stat.ratio) 40 | 41 | def _after_inference(self): 42 | tot = self.err_stat.total 43 | cnt = self.err_stat.count 44 | tot, cnt = self._reduced.eval(feed_dict={self._placeholder: [tot, cnt]}) 45 | return {self.summary_name: cnt * 1. / tot} 46 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | 
7 | [flake8]
8 | max-line-length = 120
9 | ignore = F403,F405,E402,E741,E742,E743
--------------------------------------------------------------------------------
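A minimal NumPy sketch (not part of the repository; it assumes nothing beyond NumPy) that checks the channel-space shortcut used by non_local_op in resnet_model.py for the dot-product version (embed=False, softmax=False, n_in <= H*W): two einsums over a CxC Gram matrix give the same result as the direct definition y_p = (1/(H*W)) * sum_q (x_p . x_q) * x_q over all spatial positions q.

import numpy as np

N, C, H, W = 2, 4, 5, 5  # C <= H*W, so non_local_op would take the channel-space path
x = np.random.randn(N, C, H, W).astype(np.float32)

# channel-space path, as in non_local_op (theta = phi = g = x)
gram = np.einsum('nihw,njhw->nij', x, x)            # C x C per sample
y_fast = np.einsum('nij,nihw->njhw', gram, x) / (H * W)

# direct O((H*W)^2) definition over pairwise dot products
xs = x.reshape(N, C, H * W)
sim = np.einsum('nip,niq->npq', xs, xs)             # (H*W) x (H*W) per sample
y_ref = (np.einsum('npq,njq->njp', sim, xs) / (H * W)).reshape(N, C, H, W)

assert np.allclose(y_fast, y_ref, atol=1e-3)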