├── .github
│   └── ISSUE_TEMPLATE.md
├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── INSTRUCTIONS.md
├── LICENSE
├── README.md
├── adv_model.py
├── inference-example.py
├── main.py
├── nets.py
├── resnet_model.py
├── slurm
│   ├── eval.sh
│   └── train.sh
├── teaser.jpg
├── third_party
│   ├── README.md
│   ├── __init__.py
│   ├── imagenet_utils.py
│   ├── serve-data.py
│   └── utils.py
└── tox.ini

/.github/ISSUE_TEMPLATE.md:
--------------------------------------------------------------------------------
If you meet an unexpected problem when using the code, please include the following in your issue:

1. What you did: the command you ran.

2. What you observed: the full logs and other relevant information.

3. What you expected, if not obvious.

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
train_log
train_log_*
logs
*.npy
*.npz
*.caffemodel
*.tfmodel
*.meta
*.log*
*.bin
*.png
*.jpg
checkpoint
*.json
*.prototxt
*.txt
*.tgz
*.gz


# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover

# Translations
*.mo
*.pot

# Django stuff:
*.log

# Sphinx documentation
docs/_build/

# PyBuilder
target/
*.dat

.idea/
*.diff

--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
# Code of Conduct

Facebook has adopted a Code of Conduct that we expect project participants to adhere to.
Please read the [full text](https://code.fb.com/codeofconduct/)
so that you can understand what actions will and will not be tolerated.

--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
# Contributing

We want to make contributing to this project as easy and transparent as
possible.

## Our Development Process
This code is released for the purpose of reproducing research.
We don't expect frequent development on this project unless we have new research results to share.
Minor changes and improvements will be released on an ongoing basis. Larger
changes (e.g., changesets implementing a new paper) will be released on a more
periodic basis.

## Contributor License Agreement ("CLA")
In order to accept your pull request, we need you to submit a CLA. You only need
to do this once to work on any of Facebook's open source projects.

Complete your CLA here: 

## Issues

We welcome the use of GitHub issues for any questions you might have about the code.

## Coding Style

We mainly follow the PEP8 style.

## License

By contributing to this project, you agree that your contributions will be licensed
under the LICENSE file in the root directory of this source tree.

--------------------------------------------------------------------------------
/INSTRUCTIONS.md:
--------------------------------------------------------------------------------

## Dependencies:

+ TensorFlow ≥ 1.6 with GPU support
+ Tensorpack ≥ 0.9.8
+ OpenCV ≥ 3
+ horovod ≥ 0.15 with NCCL support;
  horovod has many [installation options](https://github.com/uber/horovod/blob/master/docs/gpus.md)
  to optimize its multi-machine/multi-GPU performance, which you may want to follow.
+ ImageNet data in its standard directory structure.
+ TensorFlow [zmq_ops](https://github.com/tensorpack/zmq_ops) (needed only for training with real data)


## Model Zoo:
In the table below, the "clean images" column reports the error rate (%);
each PGD column reports the error rate / attack success rate (%).

| Model (click for details) | clean images | 10-step PGD | 100-step PGD | 1000-step PGD |
|---|---|---|---|---|
| ResNet152 Baseline `--arch ResNet -d 152` :arrow_down: | 37.7 | 47.5 / 5.5 | 58.3 / 31.0 | 61.0 / 36.1 |
| ResNet152 Denoise `--arch ResNetDenoise -d 152` :arrow_down: | 34.7 | 44.3 / 4.9 | 54.5 / 26.6 | 57.2 / 32.7 |
| ResNeXt101 DenoiseAll `--arch ResNeXtDenoiseAll -d 101` :arrow_down: | 31.6 | 44.0 / 4.9 | 55.6 / 31.5 | 59.6 / 38.1 |

Click the first column to download the model and obtain the flags to be used with the script.

Note:

1. As mentioned in the paper, the threat model is:

   1. __Targeted attack__, with one target label associated with each image. The target label is
      independently generated by uniformly sampling the incorrect labels.
   2. Maximum perturbation per pixel is 16.

   We do not consider untargeted attacks, nor do we let the attacker control the target labels,
   because we think such tasks are not realistic on the ImageNet-1k categories.

2. For each (attacker, model) pair, we provide both the __error rate__ of our model
   and the __attack success rate__ of the attacker, on the ImageNet validation set.
   A targeted attack is considered successful if the image is classified to the target label.

   __For attackers__: if you develop a new targeted attack method against our models,
   *please compare its attack success rate* with PGD.
   Error rate / accuracy is not a reasonable metric, because then the method could cheat by becoming
   close to an untargeted attack.

   __For defenders__: if you develop a new robust model, please compare its accuracy with our models.
   Attack success rate is not a reasonable metric, because then the model could cheat by making random predictions.

3. `ResNeXt101 DenoiseAll` is the submission that won the black-box defense track in the
   [Competition on Adversarial Attacks and Defenses 2018](http://hof.geekpwn.org/caad/en/index.html).
   This model was trained with slightly different training settings,
   so its results are not directly comparable with those of the other models.


## Evaluate White-Box Robustness:

To evaluate on one GPU, run this command:
```
python main.py --eval --load /path/to/model_checkpoint --data /path/to/imagenet \
    --attack-iter [INTEGER] --attack-epsilon 16.0 [--architecture-flags]
```

To reproduce our evaluation results, take the architecture flags from the first
column of the model zoo and set the attack iteration count.
Setting `--attack-iter 0` evaluates the clean-image error rate.
Note that the evaluation results may fluctuate by about ±0.3 due to the
randomly chosen attack target labels and the random attack initialization.

Using a K-step attacker makes the evaluation K times slower.
To speed up evaluation, run it under MPI with multiple GPUs or multiple machines, e.g.:

```
mpirun -np 8 python main.py --eval --load /path/to/model_checkpoint --data /path/to/imagenet \
    --attack-iter [INTEGER] --attack-epsilon 16.0 [--architecture-flags]
```

Evaluating the `Res152 Denoise` model against the 100-step PGD attacker takes about 1 hour with 16 V100s.


## Evaluate Black-Box Robustness:

We provide a command line option to produce predictions for a directory of images, e.g.:
```
python main.py --eval-directory /path/to/image/directory --prediction-file predictions.txt \
    --load X101-DenseDenoise.npz -d 101 --arch ResNeXtDenoiseAll --batch 20
```

This will produce a file "predictions.txt" that contains the filename and
predicted label for each image found in the directory.
You can use it to evaluate the model's black-box robustness.

Our CAAD2018 submission is equivalent to the above command.
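If you want to score such a prediction file offline, a minimal sketch is shown below.
It assumes a hypothetical `labels.txt` that maps each filename to its ground-truth label
and its attack-target label; that helper file, its format, and the script itself are our
own convention for illustration and are not produced by this repo:

```
# score_predictions.py -- a minimal sketch, not part of this repo.
# Assumes predictions.txt lines look like "<filename>,<predicted_label>" (as written
# by main.py), and a hypothetical labels.txt with "<filename>,<true_label>,<target_label>".

def read_rows(path):
    # filename -> list of integer labels
    with open(path) as f:
        rows = (line.strip().split(",") for line in f if line.strip())
        return {r[0]: [int(x) for x in r[1:]] for r in rows}

preds = read_rows("predictions.txt")   # filename -> [predicted_label]
labels = read_rows("labels.txt")       # filename -> [true_label, target_label]

total = len(preds)
errors = sum(p[0] != labels[k][0] for k, p in preds.items())
successes = sum(p[0] == labels[k][1] for k, p in preds.items())
print("error rate: %.2f%%, attack success rate: %.2f%%"
      % (100.0 * errors / total, 100.0 * successes / total))
```

Per the note above, attack success rate (not error rate) is the metric to report for new attacks.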

## Train:

Our code can be used for both standard ImageNet training (with `--attack-iter 0`) and adversarial training.
Adversarial training takes a long time, and we recommend doing it only when you have a lot of GPUs.

To train, first start one data serving process __on each machine__:
```
$ ./third_party/serve-data.py --data /path/to/imagenet/ --batch 32
```

Then, launch a distributed job with MPI. You may need to consult your cluster
administrator for the MPI command line arguments you should use.
On a cluster with InfiniBand, it may look like this:

```
mpirun -np 16 -H host1:8,host2:8 --output-filename train.log \
    -bind-to none -map-by slot -mca pml ob1 \
    -x NCCL_IB_CUDA_SUPPORT=1 -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO \
    -x PATH -x PYTHONPATH -x LD_LIBRARY_PATH \
    python main.py --data /path/to/imagenet \
    --batch 32 --attack-iter [INTEGER] --attack-epsilon 16.0 [--architecture-flags]
```

If your cluster is managed by slurm, we provide some sample [slurm job scripts](slurm/)
for your reference.

The training code will also perform distributed evaluation of white-box robustness.

### Training Speed:

With 30 attack iterations during training,
the `Res152 Baseline` model takes about 52 hours to finish training on 128 V100s.

Under the same setting, the `Res152 Denoise` model takes about 90 hours on 128 V100s.
Note that the denoising blocks themselves add little computation to the baseline;
the slowdown is mainly because the softmax version of the non-local operation
lacks an efficient GPU implementation. The dot-product version, on the other hand, is much faster.

If you use CUDA ≥ 9.2 and TF ≥ 1.12 on Volta GPUs, the flag `--use-fp16xla` will enable an XLA-optimized
FP16 PGD attack, which reduces training time by about 2×, at the cost of about 3% robustness.

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Attribution-NonCommercial 4.0 International
2 | 
3 | =======================================================================
4 | 
5 | Creative Commons Corporation ("Creative Commons") is not a law firm and
6 | does not provide legal services or legal advice. Distribution of
7 | Creative Commons public licenses does not create a lawyer-client or
8 | other relationship. Creative Commons makes its licenses and related
9 | information available on an "as-is" basis. Creative Commons gives no
10 | warranties regarding its licenses, any material licensed under their
11 | terms and conditions, or any related information. Creative Commons
12 | disclaims all liability for damages resulting from their use to the
13 | fullest extent possible.
14 | 
15 | Using Creative Commons Public Licenses
16 | 
17 | Creative Commons public licenses provide a standard set of terms and
18 | conditions that creators and other rights holders may use to share
19 | original works of authorship and other material subject to copyright
20 | and certain other rights specified in the public license below. The
21 | following considerations are for informational purposes only, are not
22 | exhaustive, and do not form part of our licenses.
23 | 24 | Considerations for licensors: Our public licenses are 25 | intended for use by those authorized to give the public 26 | permission to use material in ways otherwise restricted by 27 | copyright and certain other rights. Our licenses are 28 | irrevocable. Licensors should read and understand the terms 29 | and conditions of the license they choose before applying it. 30 | Licensors should also secure all rights necessary before 31 | applying our licenses so that the public can reuse the 32 | material as expected. Licensors should clearly mark any 33 | material not subject to the license. This includes other CC- 34 | licensed material, or material used under an exception or 35 | limitation to copyright. More considerations for licensors: 36 | wiki.creativecommons.org/Considerations_for_licensors 37 | 38 | Considerations for the public: By using one of our public 39 | licenses, a licensor grants the public permission to use the 40 | licensed material under specified terms and conditions. If 41 | the licensor's permission is not necessary for any reason--for 42 | example, because of any applicable exception or limitation to 43 | copyright--then that use is not regulated by the license. Our 44 | licenses grant only permissions under copyright and certain 45 | other rights that a licensor has authority to grant. Use of 46 | the licensed material may still be restricted for other 47 | reasons, including because others have copyright or other 48 | rights in the material. A licensor may make special requests, 49 | such as asking that all changes be marked or described. 50 | Although not required by our licenses, you are encouraged to 51 | respect those requests where reasonable. More_considerations 52 | for the public: 53 | wiki.creativecommons.org/Considerations_for_licensees 54 | 55 | ======================================================================= 56 | 57 | Creative Commons Attribution-NonCommercial 4.0 International Public 58 | License 59 | 60 | By exercising the Licensed Rights (defined below), You accept and agree 61 | to be bound by the terms and conditions of this Creative Commons 62 | Attribution-NonCommercial 4.0 International Public License ("Public 63 | License"). To the extent this Public License may be interpreted as a 64 | contract, You are granted the Licensed Rights in consideration of Your 65 | acceptance of these terms and conditions, and the Licensor grants You 66 | such rights in consideration of benefits the Licensor receives from 67 | making the Licensed Material available under these terms and 68 | conditions. 69 | 70 | Section 1 -- Definitions. 71 | 72 | a. Adapted Material means material subject to Copyright and Similar 73 | Rights that is derived from or based upon the Licensed Material 74 | and in which the Licensed Material is translated, altered, 75 | arranged, transformed, or otherwise modified in a manner requiring 76 | permission under the Copyright and Similar Rights held by the 77 | Licensor. For purposes of this Public License, where the Licensed 78 | Material is a musical work, performance, or sound recording, 79 | Adapted Material is always produced where the Licensed Material is 80 | synched in timed relation with a moving image. 81 | 82 | b. Adapter's License means the license You apply to Your Copyright 83 | and Similar Rights in Your contributions to Adapted Material in 84 | accordance with the terms and conditions of this Public License. 85 | 86 | c. 
Copyright and Similar Rights means copyright and/or similar rights 87 | closely related to copyright including, without limitation, 88 | performance, broadcast, sound recording, and Sui Generis Database 89 | Rights, without regard to how the rights are labeled or 90 | categorized. For purposes of this Public License, the rights 91 | specified in Section 2(b)(1)-(2) are not Copyright and Similar 92 | Rights. 93 | d. Effective Technological Measures means those measures that, in the 94 | absence of proper authority, may not be circumvented under laws 95 | fulfilling obligations under Article 11 of the WIPO Copyright 96 | Treaty adopted on December 20, 1996, and/or similar international 97 | agreements. 98 | 99 | e. Exceptions and Limitations means fair use, fair dealing, and/or 100 | any other exception or limitation to Copyright and Similar Rights 101 | that applies to Your use of the Licensed Material. 102 | 103 | f. Licensed Material means the artistic or literary work, database, 104 | or other material to which the Licensor applied this Public 105 | License. 106 | 107 | g. Licensed Rights means the rights granted to You subject to the 108 | terms and conditions of this Public License, which are limited to 109 | all Copyright and Similar Rights that apply to Your use of the 110 | Licensed Material and that the Licensor has authority to license. 111 | 112 | h. Licensor means the individual(s) or entity(ies) granting rights 113 | under this Public License. 114 | 115 | i. NonCommercial means not primarily intended for or directed towards 116 | commercial advantage or monetary compensation. For purposes of 117 | this Public License, the exchange of the Licensed Material for 118 | other material subject to Copyright and Similar Rights by digital 119 | file-sharing or similar means is NonCommercial provided there is 120 | no payment of monetary compensation in connection with the 121 | exchange. 122 | 123 | j. Share means to provide material to the public by any means or 124 | process that requires permission under the Licensed Rights, such 125 | as reproduction, public display, public performance, distribution, 126 | dissemination, communication, or importation, and to make material 127 | available to the public including in ways that members of the 128 | public may access the material from a place and at a time 129 | individually chosen by them. 130 | 131 | k. Sui Generis Database Rights means rights other than copyright 132 | resulting from Directive 96/9/EC of the European Parliament and of 133 | the Council of 11 March 1996 on the legal protection of databases, 134 | as amended and/or succeeded, as well as other essentially 135 | equivalent rights anywhere in the world. 136 | 137 | l. You means the individual or entity exercising the Licensed Rights 138 | under this Public License. Your has a corresponding meaning. 139 | 140 | Section 2 -- Scope. 141 | 142 | a. License grant. 143 | 144 | 1. Subject to the terms and conditions of this Public License, 145 | the Licensor hereby grants You a worldwide, royalty-free, 146 | non-sublicensable, non-exclusive, irrevocable license to 147 | exercise the Licensed Rights in the Licensed Material to: 148 | 149 | a. reproduce and Share the Licensed Material, in whole or 150 | in part, for NonCommercial purposes only; and 151 | 152 | b. produce, reproduce, and Share Adapted Material for 153 | NonCommercial purposes only. 154 | 155 | 2. Exceptions and Limitations. 
For the avoidance of doubt, where 156 | Exceptions and Limitations apply to Your use, this Public 157 | License does not apply, and You do not need to comply with 158 | its terms and conditions. 159 | 160 | 3. Term. The term of this Public License is specified in Section 161 | 6(a). 162 | 163 | 4. Media and formats; technical modifications allowed. The 164 | Licensor authorizes You to exercise the Licensed Rights in 165 | all media and formats whether now known or hereafter created, 166 | and to make technical modifications necessary to do so. The 167 | Licensor waives and/or agrees not to assert any right or 168 | authority to forbid You from making technical modifications 169 | necessary to exercise the Licensed Rights, including 170 | technical modifications necessary to circumvent Effective 171 | Technological Measures. For purposes of this Public License, 172 | simply making modifications authorized by this Section 2(a) 173 | (4) never produces Adapted Material. 174 | 175 | 5. Downstream recipients. 176 | 177 | a. Offer from the Licensor -- Licensed Material. Every 178 | recipient of the Licensed Material automatically 179 | receives an offer from the Licensor to exercise the 180 | Licensed Rights under the terms and conditions of this 181 | Public License. 182 | 183 | b. No downstream restrictions. You may not offer or impose 184 | any additional or different terms or conditions on, or 185 | apply any Effective Technological Measures to, the 186 | Licensed Material if doing so restricts exercise of the 187 | Licensed Rights by any recipient of the Licensed 188 | Material. 189 | 190 | 6. No endorsement. Nothing in this Public License constitutes or 191 | may be construed as permission to assert or imply that You 192 | are, or that Your use of the Licensed Material is, connected 193 | with, or sponsored, endorsed, or granted official status by, 194 | the Licensor or others designated to receive attribution as 195 | provided in Section 3(a)(1)(A)(i). 196 | 197 | b. Other rights. 198 | 199 | 1. Moral rights, such as the right of integrity, are not 200 | licensed under this Public License, nor are publicity, 201 | privacy, and/or other similar personality rights; however, to 202 | the extent possible, the Licensor waives and/or agrees not to 203 | assert any such rights held by the Licensor to the limited 204 | extent necessary to allow You to exercise the Licensed 205 | Rights, but not otherwise. 206 | 207 | 2. Patent and trademark rights are not licensed under this 208 | Public License. 209 | 210 | 3. To the extent possible, the Licensor waives any right to 211 | collect royalties from You for the exercise of the Licensed 212 | Rights, whether directly or through a collecting society 213 | under any voluntary or waivable statutory or compulsory 214 | licensing scheme. In all other cases the Licensor expressly 215 | reserves any right to collect such royalties, including when 216 | the Licensed Material is used other than for NonCommercial 217 | purposes. 218 | 219 | Section 3 -- License Conditions. 220 | 221 | Your exercise of the Licensed Rights is expressly made subject to the 222 | following conditions. 223 | 224 | a. Attribution. 225 | 226 | 1. If You Share the Licensed Material (including in modified 227 | form), You must: 228 | 229 | a. retain the following if it is supplied by the Licensor 230 | with the Licensed Material: 231 | 232 | i. 
identification of the creator(s) of the Licensed 233 | Material and any others designated to receive 234 | attribution, in any reasonable manner requested by 235 | the Licensor (including by pseudonym if 236 | designated); 237 | 238 | ii. a copyright notice; 239 | 240 | iii. a notice that refers to this Public License; 241 | 242 | iv. a notice that refers to the disclaimer of 243 | warranties; 244 | 245 | v. a URI or hyperlink to the Licensed Material to the 246 | extent reasonably practicable; 247 | 248 | b. indicate if You modified the Licensed Material and 249 | retain an indication of any previous modifications; and 250 | 251 | c. indicate the Licensed Material is licensed under this 252 | Public License, and include the text of, or the URI or 253 | hyperlink to, this Public License. 254 | 255 | 2. You may satisfy the conditions in Section 3(a)(1) in any 256 | reasonable manner based on the medium, means, and context in 257 | which You Share the Licensed Material. For example, it may be 258 | reasonable to satisfy the conditions by providing a URI or 259 | hyperlink to a resource that includes the required 260 | information. 261 | 262 | 3. If requested by the Licensor, You must remove any of the 263 | information required by Section 3(a)(1)(A) to the extent 264 | reasonably practicable. 265 | 266 | 4. If You Share Adapted Material You produce, the Adapter's 267 | License You apply must not prevent recipients of the Adapted 268 | Material from complying with this Public License. 269 | 270 | Section 4 -- Sui Generis Database Rights. 271 | 272 | Where the Licensed Rights include Sui Generis Database Rights that 273 | apply to Your use of the Licensed Material: 274 | 275 | a. for the avoidance of doubt, Section 2(a)(1) grants You the right 276 | to extract, reuse, reproduce, and Share all or a substantial 277 | portion of the contents of the database for NonCommercial purposes 278 | only; 279 | 280 | b. if You include all or a substantial portion of the database 281 | contents in a database in which You have Sui Generis Database 282 | Rights, then the database in which You have Sui Generis Database 283 | Rights (but not its individual contents) is Adapted Material; and 284 | 285 | c. You must comply with the conditions in Section 3(a) if You Share 286 | all or a substantial portion of the contents of the database. 287 | 288 | For the avoidance of doubt, this Section 4 supplements and does not 289 | replace Your obligations under this Public License where the Licensed 290 | Rights include other Copyright and Similar Rights. 291 | 292 | Section 5 -- Disclaimer of Warranties and Limitation of Liability. 293 | 294 | a. UNLESS OTHERWISE SEPARATELY UNDERTAKEN BY THE LICENSOR, TO THE 295 | EXTENT POSSIBLE, THE LICENSOR OFFERS THE LICENSED MATERIAL AS-IS 296 | AND AS-AVAILABLE, AND MAKES NO REPRESENTATIONS OR WARRANTIES OF 297 | ANY KIND CONCERNING THE LICENSED MATERIAL, WHETHER EXPRESS, 298 | IMPLIED, STATUTORY, OR OTHER. THIS INCLUDES, WITHOUT LIMITATION, 299 | WARRANTIES OF TITLE, MERCHANTABILITY, FITNESS FOR A PARTICULAR 300 | PURPOSE, NON-INFRINGEMENT, ABSENCE OF LATENT OR OTHER DEFECTS, 301 | ACCURACY, OR THE PRESENCE OR ABSENCE OF ERRORS, WHETHER OR NOT 302 | KNOWN OR DISCOVERABLE. WHERE DISCLAIMERS OF WARRANTIES ARE NOT 303 | ALLOWED IN FULL OR IN PART, THIS DISCLAIMER MAY NOT APPLY TO YOU. 304 | 305 | b. 
TO THE EXTENT POSSIBLE, IN NO EVENT WILL THE LICENSOR BE LIABLE 306 | TO YOU ON ANY LEGAL THEORY (INCLUDING, WITHOUT LIMITATION, 307 | NEGLIGENCE) OR OTHERWISE FOR ANY DIRECT, SPECIAL, INDIRECT, 308 | INCIDENTAL, CONSEQUENTIAL, PUNITIVE, EXEMPLARY, OR OTHER LOSSES, 309 | COSTS, EXPENSES, OR DAMAGES ARISING OUT OF THIS PUBLIC LICENSE OR 310 | USE OF THE LICENSED MATERIAL, EVEN IF THE LICENSOR HAS BEEN 311 | ADVISED OF THE POSSIBILITY OF SUCH LOSSES, COSTS, EXPENSES, OR 312 | DAMAGES. WHERE A LIMITATION OF LIABILITY IS NOT ALLOWED IN FULL OR 313 | IN PART, THIS LIMITATION MAY NOT APPLY TO YOU. 314 | 315 | c. The disclaimer of warranties and limitation of liability provided 316 | above shall be interpreted in a manner that, to the extent 317 | possible, most closely approximates an absolute disclaimer and 318 | waiver of all liability. 319 | 320 | Section 6 -- Term and Termination. 321 | 322 | a. This Public License applies for the term of the Copyright and 323 | Similar Rights licensed here. However, if You fail to comply with 324 | this Public License, then Your rights under this Public License 325 | terminate automatically. 326 | 327 | b. Where Your right to use the Licensed Material has terminated under 328 | Section 6(a), it reinstates: 329 | 330 | 1. automatically as of the date the violation is cured, provided 331 | it is cured within 30 days of Your discovery of the 332 | violation; or 333 | 334 | 2. upon express reinstatement by the Licensor. 335 | 336 | For the avoidance of doubt, this Section 6(b) does not affect any 337 | right the Licensor may have to seek remedies for Your violations 338 | of this Public License. 339 | 340 | c. For the avoidance of doubt, the Licensor may also offer the 341 | Licensed Material under separate terms or conditions or stop 342 | distributing the Licensed Material at any time; however, doing so 343 | will not terminate this Public License. 344 | 345 | d. Sections 1, 5, 6, 7, and 8 survive termination of this Public 346 | License. 347 | 348 | Section 7 -- Other Terms and Conditions. 349 | 350 | a. The Licensor shall not be bound by any additional or different 351 | terms or conditions communicated by You unless expressly agreed. 352 | 353 | b. Any arrangements, understandings, or agreements regarding the 354 | Licensed Material not stated herein are separate from and 355 | independent of the terms and conditions of this Public License. 356 | 357 | Section 8 -- Interpretation. 358 | 359 | a. For the avoidance of doubt, this Public License does not, and 360 | shall not be interpreted to, reduce, limit, restrict, or impose 361 | conditions on any use of the Licensed Material that could lawfully 362 | be made without permission under this Public License. 363 | 364 | b. To the extent possible, if any provision of this Public License is 365 | deemed unenforceable, it shall be automatically reformed to the 366 | minimum extent necessary to make it enforceable. If the provision 367 | cannot be reformed, it shall be severed from this Public License 368 | without affecting the enforceability of the remaining terms and 369 | conditions. 370 | 371 | c. No term or condition of this Public License will be waived and no 372 | failure to comply consented to unless expressly agreed to by the 373 | Licensor. 374 | 375 | d. 
Nothing in this Public License constitutes or may be interpreted 376 | as a limitation upon, or waiver of, any privileges and immunities 377 | that apply to the Licensor or You, including from the legal 378 | processes of any jurisdiction or authority. 379 | 380 | ======================================================================= 381 | 382 | Creative Commons is not a party to its public 383 | licenses. Notwithstanding, Creative Commons may elect to apply one of 384 | its public licenses to material it publishes and in those instances 385 | will be considered the “Licensor.” The text of the Creative Commons 386 | public licenses is dedicated to the public domain under the CC0 Public 387 | Domain Dedication. Except for the limited purpose of indicating that 388 | material is shared under a Creative Commons public license or as 389 | otherwise permitted by the Creative Commons policies published at 390 | creativecommons.org/policies, Creative Commons does not authorize the 391 | use of the trademark "Creative Commons" or any other trademark or logo 392 | of Creative Commons without its prior written consent including, 393 | without limitation, in connection with any unauthorized modifications 394 | to any of its public licenses or any other arrangements, 395 | understandings, or agreements concerning use of licensed material. For 396 | the avoidance of doubt, this paragraph does not form part of the 397 | public licenses. 398 | 399 | Creative Commons may be contacted at creativecommons.org. 400 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # Feature Denoising for Improving Adversarial Robustness 3 | 4 | Code and models for the paper [Feature Denoising for Improving Adversarial Robustness](https://arxiv.org/abs/1812.03411), CVPR2019. 5 | 6 | ## Introduction 7 | 8 |
9 | ![teaser](teaser.jpg)
11 | 12 | By combining large-scale adversarial training and feature-denoising layers, 13 | we developed ImageNet classifiers with strong adversarial robustness. 14 | 15 | Trained on __128 GPUs__, our ImageNet classifier has 42.6% accuracy against an extremely strong 16 | __2000-steps white-box__ PGD targeted attack. 17 | This is a scenario where no previous models have achieved more than 1% accuracy. 18 | 19 | On black-box adversarial defense, our method won the __champion of defense track__ in the 20 | [CAAD (Competition of Adversarial Attacks and Defenses) 2018](http://hof.geekpwn.org/caad/en/index.html). 21 | It also greatly outperforms the [CAAD 2017](https://www.kaggle.com/c/nips-2017-defense-against-adversarial-attack) defense track winner when evaluated 22 | against CAAD 2017 black-box attackers. 23 | 24 | This repo contains: 25 | 26 | 1. Our trained models, together with the evaluation script to verify their robustness. 27 | We welcome attackers to attack our released models and defenders to compare with our released models. 28 | 29 | 2. Our distributed adversarial training code on ImageNet. 30 | 31 | Please see [INSTRUCTIONS.md](INSTRUCTIONS.md) for the usage. 32 | 33 | ## License 34 | 35 | This project is under the CC-BY-NC 4.0 license. See [LICENSE](LICENSE) for details. 36 | 37 | ## Citation 38 | 39 | If you use our code, models or wish to refer to our results, please use the following BibTex entry: 40 | ``` 41 | @InProceedings{Xie_2019_CVPR, 42 | author = {Xie, Cihang and Wu, Yuxin and van der Maaten, Laurens and Yuille, Alan L. and He, Kaiming}, 43 | title = {Feature Denoising for Improving Adversarial Robustness}, 44 | booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, 45 | month = {June}, 46 | year = {2019} 47 | } 48 | ``` 49 | -------------------------------------------------------------------------------- /adv_model.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | import tensorflow as tf 8 | 9 | from tensorpack.models import regularize_cost, BatchNorm 10 | from tensorpack.tfutils.summary import add_moving_summary 11 | from tensorpack.tfutils import argscope 12 | from tensorpack.tfutils.tower import TowerFunc 13 | from tensorpack.utils import logger 14 | from tensorpack.utils.argtools import log_once 15 | from tensorpack.tfutils.collection import freeze_collection 16 | from tensorpack.tfutils.varreplace import custom_getter_scope 17 | 18 | from third_party.imagenet_utils import ImageNetModel 19 | 20 | 21 | IMAGE_SCALE = 2.0 / 255 22 | 23 | 24 | class NoOpAttacker(): 25 | """ 26 | A placeholder attacker which does nothing. 27 | """ 28 | def attack(self, image, label, model_func): 29 | return image, -tf.ones_like(label) 30 | 31 | 32 | class PGDAttacker(): 33 | """ 34 | A PGD white-box attacker with random target label. 35 | """ 36 | 37 | USE_FP16 = False 38 | """ 39 | Use FP16 to run PGD iterations. 40 | This has about 2~3x speedup for most types of models 41 | if used together with XLA on Volta GPUs. 42 | """ 43 | 44 | USE_XLA = False 45 | """ 46 | Use XLA to optimize the graph of PGD iterations. 47 | This requires CUDA>=9.2 and TF>=1.12. 
48 |     """
49 | 
50 |     def __init__(self, num_iter, epsilon, step_size, prob_start_from_clean=0.0):
51 |         """
52 |         Args:
53 |             num_iter (int): number of PGD iterations.
54 |             epsilon (float): maximum perturbation per pixel, on the 0-255 pixel scale.
55 |             step_size (float): attack step size per iteration, on the 0-255 pixel scale.
56 |             prob_start_from_clean (float): The probability to initialize with
57 |                 the original image, rather than a randomly perturbed one.
58 |         """
59 |         step_size = max(step_size, epsilon / num_iter)
60 |         """
61 |         Feature Denoising, Sec 6.1:
62 |         We set its step size α = 1, except for 10-iteration attacks where α is set to ε/10=1.6
63 |         """
64 |         self.num_iter = num_iter
65 |         # rescale the attack epsilon and attack step size
66 |         self.epsilon = epsilon * IMAGE_SCALE
67 |         self.step_size = step_size * IMAGE_SCALE
68 |         self.prob_start_from_clean = prob_start_from_clean
69 | 
70 |     def _create_random_target(self, label):
71 |         """
72 |         Feature Denoising Sec 6:
73 |         we consider targeted attacks when
74 |         evaluating under the white-box settings, where the targeted
75 |         class is selected uniformly at random
76 |         """
77 |         label_offset = tf.random_uniform(tf.shape(label), minval=1, maxval=1000, dtype=tf.int32)
78 |         return tf.floormod(label + label_offset, tf.constant(1000, tf.int32))
79 | 
80 |     def attack(self, image_clean, label, model_func):
81 |         target_label = self._create_random_target(label)
82 | 
83 |         def fp16_getter(getter, *args, **kwargs):
84 |             name = args[0] if len(args) else kwargs['name']
85 |             if not name.endswith('/W') and not name.endswith('/b'):
86 |                 """
87 |                 Following convention, convolution & fc are quantized.
88 |                 BatchNorm (gamma & beta) are not quantized.
89 |                 """
90 |                 return getter(*args, **kwargs)
91 |             else:
92 |                 if kwargs['dtype'] == tf.float16:
93 |                     kwargs['dtype'] = tf.float32
94 |                     ret = getter(*args, **kwargs)
95 |                     ret = tf.cast(ret, tf.float16)
96 |                     log_once("Variable {} casted to fp16 ...".format(name))
97 |                     return ret
98 |                 else:
99 |                     return getter(*args, **kwargs)
100 | 
101 |         def one_step_attack(adv):
102 |             if not self.USE_FP16:
103 |                 logits = model_func(adv)
104 |             else:
105 |                 adv16 = tf.cast(adv, tf.float16)
106 |                 with custom_getter_scope(fp16_getter):
107 |                     logits = model_func(adv16)
108 |                     logits = tf.cast(logits, tf.float32)
109 |             # Note we don't add any summaries here when creating losses, because
110 |             # summaries don't work in conditionals.
111 |             losses = tf.nn.sparse_softmax_cross_entropy_with_logits(
112 |                 logits=logits, labels=target_label)  # we want to minimize it in targeted attack
113 |             if not self.USE_FP16:
114 |                 g, = tf.gradients(losses, adv)
115 |             else:
116 |                 """
117 |                 We perform loss scaling to prevent underflow:
118 |                 https://docs.nvidia.com/deeplearning/sdk/mixed-precision-training/index.html
119 |                 (We have not yet tried training without scaling)
120 |                 """
121 |                 g, = tf.gradients(losses * 128., adv)
122 |                 g = g / 128.
123 | 
124 |             """
125 |             Feature Denoising, Sec 5:
126 |             We use the Projected Gradient Descent (PGD)
127 |             (implemented at https://github.com/MadryLab/cifar10_challenge )
128 |             as the white-box attacker for adversarial training
129 |             """
130 |             adv = tf.clip_by_value(adv - tf.sign(g) * self.step_size, lower_bound, upper_bound)
131 |             return adv
132 | 
133 |         """
134 |         Feature Denoising, Sec 6:
135 |         Adversarial perturbation is considered under L∞ norm (i.e., maximum difference for each pixel).
136 |         """
137 |         lower_bound = tf.clip_by_value(image_clean - self.epsilon, -1., 1.)
138 |         upper_bound = tf.clip_by_value(image_clean + self.epsilon, -1., 1.)
139 | 
140 |         """
141 |         Feature Denoising Sec.
5:
142 |         We randomly choose from both initializations in the
143 |         PGD attacker during adversarial training: 20% of training
144 |         batches use clean images to initialize PGD, and 80% use
145 |         random points within the allowed ε-ball.
146 |         """
147 |         init_start = tf.random_uniform(tf.shape(image_clean), minval=-self.epsilon, maxval=self.epsilon)
148 |         start_from_noise_index = tf.cast(tf.greater(
149 |             tf.random_uniform(shape=[]), self.prob_start_from_clean), tf.float32)
150 |         start_adv = image_clean + start_from_noise_index * init_start
151 | 
152 |         if self.USE_XLA:
153 |             assert tuple(map(int, tf.__version__.split('.')[:2])) >= (1, 12)
154 |             from tensorflow.contrib.compiler import xla
155 |         with tf.name_scope('attack_loop'):
156 |             adv_final = tf.while_loop(
157 |                 lambda _: True,
158 |                 one_step_attack if not self.USE_XLA else
159 |                 lambda adv: xla.compile(lambda: one_step_attack(adv))[0],
160 |                 [start_adv],
161 |                 back_prop=False,
162 |                 maximum_iterations=self.num_iter,
163 |                 parallel_iterations=1)
164 |         return adv_final, target_label
165 | 
166 | 
167 | class AdvImageNetModel(ImageNetModel):
168 | 
169 |     """
170 |     Feature Denoising, Sec 5:
171 |     A label smoothing of 0.1 is used.
172 |     """
173 |     label_smoothing = 0.1
174 | 
175 |     def set_attacker(self, attacker):
176 |         self.attacker = attacker
177 | 
178 |     def build_graph(self, image, label):
179 |         """
180 |         The default tower function.
181 |         """
182 |         image = self.image_preprocess(image)
183 |         assert self.data_format == 'NCHW'
184 |         image = tf.transpose(image, [0, 3, 1, 2])
185 | 
186 |         with tf.variable_scope(tf.get_variable_scope(), reuse=tf.AUTO_REUSE):
187 |             # BatchNorm always comes with trouble. We use its testing mode during the attack.
188 |             with freeze_collection([tf.GraphKeys.UPDATE_OPS]), argscope(BatchNorm, training=False):
189 |                 image, target_label = self.attacker.attack(image, label, self.get_logits)
190 |                 image = tf.stop_gradient(image, name='adv_training_sample')
191 | 
192 |         logits = self.get_logits(image)
193 | 
194 |         loss = ImageNetModel.compute_loss_and_error(
195 |             logits, label, label_smoothing=self.label_smoothing)
196 |         AdvImageNetModel.compute_attack_success(logits, target_label)
197 |         if not self.training:
198 |             return
199 | 
200 |         wd_loss = regularize_cost(self.weight_decay_pattern,
201 |                                   tf.contrib.layers.l2_regularizer(self.weight_decay),
202 |                                   name='l2_regularize_loss')
203 |         add_moving_summary(loss, wd_loss)
204 |         total_cost = tf.add_n([loss, wd_loss], name='cost')
205 | 
206 |         if self.loss_scale != 1.:
207 |             logger.info("Scaling the total loss by {} ...".format(self.loss_scale))
208 |             return total_cost * self.loss_scale
209 |         else:
210 |             return total_cost
211 | 
212 |     def get_inference_func(self, attacker):
213 |         """
214 |         Returns a tower function to be used for inference. It generates adv
215 |         images with the given attacker and runs classification on them.
216 | """ 217 | 218 | def tower_func(image, label): 219 | assert not self.training 220 | image = self.image_preprocess(image) 221 | image = tf.transpose(image, [0, 3, 1, 2]) 222 | image, target_label = attacker.attack(image, label, self.get_logits) 223 | logits = self.get_logits(image) 224 | ImageNetModel.compute_loss_and_error(logits, label) # compute top-1 and top-5 225 | AdvImageNetModel.compute_attack_success(logits, target_label) 226 | 227 | return TowerFunc(tower_func, self.get_input_signature()) 228 | 229 | def image_preprocess(self, image): 230 | with tf.name_scope('image_preprocess'): 231 | if image.dtype.base_dtype != tf.float32: 232 | image = tf.cast(image, tf.float32) 233 | # For the purpose of adversarial training, normalize images to [-1, 1] 234 | image = image * IMAGE_SCALE - 1.0 235 | return image 236 | 237 | @staticmethod 238 | def compute_attack_success(logits, target_label): 239 | """ 240 | Compute the attack success rate. 241 | """ 242 | pred = tf.argmax(logits, axis=1, output_type=tf.int32) 243 | equal_target = tf.equal(pred, target_label) 244 | success = tf.cast(equal_target, tf.float32, name='attack_success') 245 | add_moving_summary(tf.reduce_mean(success, name='attack_success_rate')) 246 | -------------------------------------------------------------------------------- /inference-example.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # Copyright (c) Facebook, Inc. and its affiliates. 4 | # All rights reserved. 5 | # 6 | # This source code is licensed under the license found in the 7 | # LICENSE file in the root directory of this source tree. 8 | 9 | import argparse 10 | import numpy as np 11 | import cv2 12 | import tensorflow as tf 13 | 14 | from tensorpack import TowerContext 15 | from tensorpack.tfutils import get_model_loader 16 | from tensorpack.dataflow.dataset import ILSVRCMeta 17 | 18 | import nets 19 | 20 | """ 21 | A small inference example for attackers to play with. 22 | """ 23 | 24 | 25 | parser = argparse.ArgumentParser() 26 | parser.add_argument('-d', '--depth', help='ResNet depth', 27 | type=int, default=152, choices=[50, 101, 152]) 28 | parser.add_argument('--arch', help='Name of architectures defined in nets.py', 29 | default='ResNetDenoise') 30 | parser.add_argument('--load', help='path to checkpoint') 31 | parser.add_argument('--input', help='path to input image') 32 | args = parser.parse_args() 33 | 34 | model = getattr(nets, args.arch + 'Model')(args) 35 | 36 | input = tf.placeholder(tf.float32, shape=(None, 224, 224, 3)) 37 | image = input / 127.5 - 1.0 38 | image = tf.transpose(image, [0, 3, 1, 2]) 39 | with TowerContext('', is_training=False): 40 | logits = model.get_logits(image) 41 | 42 | sess = tf.Session() 43 | get_model_loader(args.load).init(sess) 44 | 45 | sample = cv2.imread(args.input) # this is a BGR image, not RGB 46 | # imagenet evaluation uses standard imagenet pre-processing 47 | # (resize shortest edge to 256 + center crop 224). 48 | # However, for images of unknown sources, let's just do a naive resize. 
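# A hedged sketch of that standard pre-processing, in case your inputs do come from
# ImageNet-like data (kept commented out; this snippet is ours, not the original script's):
#   h, w = sample.shape[:2]
#   scale = 256.0 / min(h, w)
#   sample = cv2.resize(sample, (int(round(w * scale)), int(round(h * scale))))
#   h, w = sample.shape[:2]
#   y0, x0 = (h - 224) // 2, (w - 224) // 2
#   sample = sample[y0:y0 + 224, x0:x0 + 224]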
49 | sample = cv2.resize(sample, (224, 224)) 50 | 51 | prob = sess.run(logits, feed_dict={input: np.array([sample])}) 52 | print("Prediction: ", prob.argmax()) 53 | 54 | synset = ILSVRCMeta().get_synset_words_1000() 55 | print("Top 5: ", [synset[k] for k in prob[0].argsort()[-5:][::-1]]) 56 | -------------------------------------------------------------------------------- /main.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | # -*- coding: utf-8 -*- 3 | # Copyright (c) Facebook, Inc. and its affiliates. 4 | # All rights reserved. 5 | # 6 | # This source code is licensed under the license found in the 7 | # LICENSE file in the root directory of this source tree. 8 | 9 | 10 | import argparse 11 | import cv2 12 | import glob 13 | import numpy as np 14 | import os 15 | import socket 16 | import sys 17 | 18 | import horovod.tensorflow as hvd 19 | 20 | from tensorpack import * 21 | from tensorpack.tfutils import SmartInit 22 | 23 | import nets 24 | from adv_model import NoOpAttacker, PGDAttacker 25 | from third_party.imagenet_utils import get_val_dataflow, eval_on_ILSVRC12 26 | from third_party.utils import HorovodClassificationError 27 | 28 | 29 | def create_eval_callback(name, tower_func, condition): 30 | """ 31 | Create a distributed evaluation callback. 32 | 33 | Args: 34 | name (str): a prefix 35 | tower_func (TowerFunc): the inference tower function 36 | condition: a function(epoch number) that returns whether this epoch should evaluate or not 37 | """ 38 | dataflow = get_val_dataflow( 39 | args.data, args.batch, 40 | num_splits=hvd.size(), split_index=hvd.rank()) 41 | # We eval both the classification error rate (for comparison with defenders) 42 | # and the attack success rate (for comparison with attackers). 43 | infs = [HorovodClassificationError('wrong-top1', '{}-top1-error'.format(name)), 44 | HorovodClassificationError('wrong-top5', '{}-top5-error'.format(name)), 45 | HorovodClassificationError('attack_success', '{}-attack-success-rate'.format(name)) 46 | ] 47 | cb = InferenceRunner( 48 | QueueInput(dataflow), infs, 49 | tower_name=name, 50 | tower_func=tower_func).set_chief_only(False) 51 | cb = EnableCallbackIf( 52 | cb, lambda self: condition(self.epoch_num)) 53 | return cb 54 | 55 | 56 | def do_train(model): 57 | batch = args.batch 58 | total_batch = batch * hvd.size() 59 | 60 | if args.fake: 61 | data = FakeData( 62 | [[batch, 224, 224, 3], [batch]], 1000, 63 | random=False, dtype=['uint8', 'int32']) 64 | data = StagingInput(QueueInput(data)) 65 | callbacks = [] 66 | steps_per_epoch = 50 67 | else: 68 | logger.info("#Tower: {}; Batch size per tower: {}".format(hvd.size(), batch)) 69 | zmq_addr = 'ipc://@imagenet-train-b{}'.format(batch) 70 | if args.no_zmq_ops: 71 | dataflow = RemoteDataZMQ(zmq_addr, hwm=150, bind=False) 72 | data = QueueInput(dataflow) 73 | else: 74 | data = ZMQInput(zmq_addr, 30, bind=False) 75 | data = StagingInput(data) 76 | 77 | steps_per_epoch = int(np.round(1281167 / total_batch)) 78 | 79 | BASE_LR = 0.1 * (total_batch // 256) 80 | """ 81 | ImageNet in 1 Hour, Sec 2.1: 82 | Linear Scaling Rule: When the minibatch size is 83 | multiplied by k, multiply the learning rate by k. 
84 | """ 85 | logger.info("Base LR: {}".format(BASE_LR)) 86 | callbacks = [ 87 | ModelSaver(max_to_keep=10), 88 | EstimatedTimeLeft(), 89 | ScheduledHyperParamSetter( 90 | 'learning_rate', [(0, BASE_LR), (35, BASE_LR * 1e-1), (70, BASE_LR * 1e-2), 91 | (95, BASE_LR * 1e-3)]) 92 | ] 93 | """ 94 | Feature Denoising, Sec 5: 95 | Our models are trained for a total of 96 | 110 epochs; we decrease the learning rate by 10× at the 35- 97 | th, 70-th, and 95-th epoch 98 | """ 99 | max_epoch = 110 100 | 101 | if BASE_LR > 0.1: 102 | callbacks.append( 103 | ScheduledHyperParamSetter( 104 | 'learning_rate', [(0, 0.1), (5 * steps_per_epoch, BASE_LR)], 105 | interp='linear', step_based=True)) 106 | """ 107 | ImageNet in 1 Hour, Sec 2.2: 108 | we start from a learning rate of η and increment it by a constant amount at 109 | each iteration such that it reaches ηˆ = kη after 5 epochs 110 | """ 111 | 112 | if not args.fake: 113 | # add distributed evaluation, for various attackers that we care. 114 | def add_eval_callback(name, attacker, condition): 115 | cb = create_eval_callback( 116 | name, 117 | model.get_inference_func(attacker), 118 | # always eval in the last 2 epochs no matter what 119 | lambda epoch_num: condition(epoch_num) or epoch_num > max_epoch - 2) 120 | callbacks.append(cb) 121 | 122 | add_eval_callback('eval-clean', NoOpAttacker(), lambda e: True) 123 | add_eval_callback('eval-10step', PGDAttacker(10, args.attack_epsilon, args.attack_step_size), 124 | lambda e: True) 125 | add_eval_callback('eval-50step', PGDAttacker(50, args.attack_epsilon, args.attack_step_size), 126 | lambda e: e % 20 == 0) 127 | add_eval_callback('eval-100step', PGDAttacker(100, args.attack_epsilon, args.attack_step_size), 128 | lambda e: e % 10 == 0 or e > max_epoch - 5) 129 | for k in [20, 30, 40, 60, 70, 80, 90]: 130 | add_eval_callback('eval-{}step'.format(k), 131 | PGDAttacker(k, args.attack_epsilon, args.attack_step_size), 132 | lambda e: False) 133 | 134 | trainer = HorovodTrainer(average=True) 135 | trainer.setup_graph(model.get_input_signature(), data, model.build_graph, model.get_optimizer) 136 | trainer.train_with_defaults( 137 | callbacks=callbacks, 138 | steps_per_epoch=steps_per_epoch, 139 | session_init=SmartInit(args.load), 140 | max_epoch=max_epoch, 141 | starting_epoch=args.starting_epoch) 142 | 143 | 144 | if __name__ == '__main__': 145 | parser = argparse.ArgumentParser() 146 | parser.add_argument('--load', help='Path to a model to load for evaluation or resuming training.') 147 | parser.add_argument('--starting-epoch', help='The epoch to start with. 
Useful when resuming training.', 148 | type=int, default=1) 149 | parser.add_argument('--logdir', help='Directory suffix for models and training stats.') 150 | parser.add_argument('--eval', action='store_true', help='Evaluate a model on ImageNet instead of training.') 151 | 152 | # run on a directory of images: 153 | parser.add_argument('--eval-directory', help='Path to a directory of images to classify.') 154 | parser.add_argument('--prediction-file', help='Path to a txt file to write predictions.', default='predictions.txt') 155 | 156 | parser.add_argument('--data', help='ILSVRC dataset dir') 157 | parser.add_argument('--fake', help='Use fakedata to test or benchmark this model', action='store_true') 158 | parser.add_argument('--no-zmq-ops', help='Use pure python to send/receive data', 159 | action='store_true') 160 | parser.add_argument('--batch', help='Per-GPU batch size', default=32, type=int) 161 | 162 | # attacker flags: 163 | parser.add_argument('--attack-iter', help='Adversarial attack iteration', 164 | type=int, default=30) 165 | parser.add_argument('--attack-epsilon', help='Adversarial attack maximal perturbation', 166 | type=float, default=16.0) 167 | parser.add_argument('--attack-step-size', help='Adversarial attack step size', 168 | type=float, default=1.0) 169 | parser.add_argument('--use-fp16xla', 170 | help='Optimize PGD with fp16+XLA in training or evaluation. ' 171 | '(Evaluation during training will still use FP32, for fair comparison)', 172 | action='store_true') 173 | 174 | # architecture flags: 175 | parser.add_argument('-d', '--depth', help='ResNet depth', 176 | type=int, default=50, choices=[50, 101, 152]) 177 | parser.add_argument('--arch', help='Name of architectures defined in nets.py', 178 | default='ResNet') 179 | args = parser.parse_args() 180 | 181 | # Define model 182 | model = getattr(nets, args.arch + 'Model')(args) 183 | 184 | # Define attacker 185 | if args.attack_iter == 0 or args.eval_directory: 186 | attacker = NoOpAttacker() 187 | else: 188 | attacker = PGDAttacker( 189 | args.attack_iter, args.attack_epsilon, args.attack_step_size, 190 | prob_start_from_clean=0.2 if not args.eval else 0.0) 191 | if args.use_fp16xla: 192 | attacker.USE_FP16 = True 193 | attacker.USE_XLA = True 194 | model.set_attacker(attacker) 195 | 196 | os.system("nvidia-smi") 197 | hvd.init() 198 | 199 | if args.eval: 200 | sessinit = SmartInit(args.load) 201 | if hvd.size() == 1: 202 | # single-GPU eval, slow 203 | ds = get_val_dataflow(args.data, args.batch) 204 | eval_on_ILSVRC12(model, sessinit, ds) 205 | else: 206 | logger.info("CMD: " + " ".join(sys.argv)) 207 | cb = create_eval_callback( 208 | "eval", 209 | model.get_inference_func(attacker), 210 | lambda e: True) 211 | trainer = HorovodTrainer() 212 | trainer.setup_graph(model.get_input_signature(), PlaceholderInput(), model.build_graph, model.get_optimizer) 213 | # train for an empty epoch, to reuse the distributed evaluation code 214 | trainer.train_with_defaults( 215 | callbacks=[cb], 216 | monitors=[ScalarPrinter()] if hvd.rank() == 0 else [], 217 | session_init=sessinit, 218 | steps_per_epoch=0, max_epoch=1) 219 | elif args.eval_directory: 220 | assert hvd.size() == 1 221 | files = glob.glob(os.path.join(args.eval_directory, '*.*')) 222 | ds = ImageFromFile(files) 223 | # Our model expects BGR images instead of RGB. 224 | # Also do a naive resize to 224. 
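# (Added note, not in the original: INTER_CUBIC below differs from inference-example.py,
# which relies on cv2.resize's default INTER_LINEAR, so the two scripts may give slightly
# different predictions for the same image.)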
225 | ds = MapData( 226 | ds, 227 | lambda dp: [cv2.resize(dp[0][:, :, ::-1], (224, 224), interpolation=cv2.INTER_CUBIC)]) 228 | ds = BatchData(ds, args.batch, remainder=True) 229 | 230 | pred_config = PredictConfig( 231 | model=model, 232 | session_init=SmartInit(args.load), 233 | input_names=['input'], 234 | output_names=['linear/output'] # the logits 235 | ) 236 | predictor = SimpleDatasetPredictor(pred_config, ds) 237 | 238 | logger.info("Running inference on {} images in {}".format(len(files), args.eval_directory)) 239 | results = [] 240 | for logits, in predictor.get_result(): 241 | predictions = list(np.argmax(logits, axis=1)) 242 | results.extend(predictions) 243 | assert len(results) == len(files) 244 | with open(args.prediction_file, "w") as f: 245 | for filename, label in zip(files, results): 246 | f.write("{},{}\n".format(filename, label)) 247 | logger.info("Outputs saved to " + args.prediction_file) 248 | else: 249 | logger.info("Training on {}".format(socket.gethostname())) 250 | logdir = os.path.join( 251 | 'train_log', 252 | 'PGD-{}{}-Batch{}-{}GPUs-iter{}-epsilon{}-step{}{}'.format( 253 | args.arch, args.depth, args.batch, hvd.size(), 254 | args.attack_iter, args.attack_epsilon, args.attack_step_size, 255 | '-' + args.logdir if args.logdir else '')) 256 | 257 | if hvd.rank() == 0: 258 | # old log directory will be automatically removed. 259 | logger.set_logger_dir(logdir, 'd') 260 | logger.info("CMD: " + " ".join(sys.argv)) 261 | logger.info("Rank={}, Local Rank={}, Size={}".format(hvd.rank(), hvd.local_rank(), hvd.size())) 262 | 263 | do_train(model) 264 | -------------------------------------------------------------------------------- /nets.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | from adv_model import AdvImageNetModel 8 | from resnet_model import ( 9 | resnet_group, resnet_bottleneck, resnet_backbone) 10 | from resnet_model import denoising 11 | 12 | 13 | NUM_BLOCKS = { 14 | 50: [3, 4, 6, 3], 15 | 101: [3, 4, 23, 3], 16 | 152: [3, 8, 36, 3] 17 | } 18 | 19 | 20 | class ResNetModel(AdvImageNetModel): 21 | def __init__(self, args): 22 | self.num_blocks = NUM_BLOCKS[args.depth] 23 | 24 | def get_logits(self, image): 25 | return resnet_backbone(image, self.num_blocks, resnet_group, resnet_bottleneck) 26 | 27 | 28 | class ResNetDenoiseModel(AdvImageNetModel): 29 | def __init__(self, args): 30 | self.num_blocks = NUM_BLOCKS[args.depth] 31 | 32 | def get_logits(self, image): 33 | 34 | def group_func(name, *args): 35 | """ 36 | Feature Denoising, Sec 6: 37 | we add 4 denoising blocks to a ResNet: each is added after the 38 | last residual block of res2, res3, res4, and res5, respectively. 39 | """ 40 | l = resnet_group(name, *args) 41 | l = denoising(name + '_denoise', l, embed=True, softmax=True) 42 | return l 43 | 44 | return resnet_backbone(image, self.num_blocks, group_func, resnet_bottleneck) 45 | 46 | 47 | class ResNeXtDenoiseAllModel(AdvImageNetModel): 48 | """ 49 | ResNeXt 32x8d that performs denoising after every residual block. 
50 | """ 51 | def __init__(self, args): 52 | self.num_blocks = NUM_BLOCKS[args.depth] 53 | 54 | def get_logits(self, image): 55 | 56 | def block_func(l, ch_out, stride): 57 | """ 58 | Feature Denoising, Sec 6.2: 59 | The winning entry, shown in the blue bar, was based on our method by using 60 | a ResNeXt101-32×8 backbone 61 | with non-local denoising blocks added to all residual blocks. 62 | """ 63 | l = resnet_bottleneck(l, ch_out, stride, group=32, res2_bottleneck=8) 64 | l = denoising('non_local', l, embed=False, softmax=False) 65 | return l 66 | 67 | return resnet_backbone(image, self.num_blocks, resnet_group, block_func) 68 | -------------------------------------------------------------------------------- /resnet_model.py: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 6 | 7 | import tensorflow as tf 8 | from tensorpack.tfutils.argscope import argscope 9 | from tensorpack.models import ( 10 | Conv2D, MaxPooling, AvgPooling, GlobalAvgPooling, BatchNorm, FullyConnected, BNReLU) 11 | 12 | 13 | def resnet_shortcut(l, n_out, stride, activation=tf.identity): 14 | n_in = l.get_shape().as_list()[1] 15 | if n_in != n_out: # change dimension when channel is not the same 16 | return Conv2D('convshortcut', l, n_out, 1, strides=stride, activation=activation) 17 | else: 18 | return l 19 | 20 | 21 | def get_bn(zero_init=False): 22 | if zero_init: 23 | return lambda x, name=None: BatchNorm('bn', x, gamma_initializer=tf.zeros_initializer()) 24 | else: 25 | return lambda x, name=None: BatchNorm('bn', x) 26 | 27 | 28 | def resnet_bottleneck(l, ch_out, stride, group=1, res2_bottleneck=64): 29 | """ 30 | Args: 31 | group (int): the number of groups for resnext 32 | res2_bottleneck (int): the number of channels in res2 bottleneck. 33 | The default corresponds to ResNeXt 1x64d, i.e. vanilla ResNet. 
34 | """ 35 | ch_factor = res2_bottleneck * group // 64 36 | shortcut = l 37 | l = Conv2D('conv1', l, ch_out * ch_factor, 1, strides=1, activation=BNReLU) 38 | l = Conv2D('conv2', l, ch_out * ch_factor, 3, strides=stride, activation=BNReLU, split=group) 39 | """ 40 | ImageNet in 1 Hour, Sec 5.1: 41 | the stride-2 convolutions are on 3×3 layers instead of on 1×1 layers 42 | """ 43 | l = Conv2D('conv3', l, ch_out * 4, 1, activation=get_bn(zero_init=True)) 44 | """ 45 | ImageNet in 1 Hour, Sec 5.1: each residual block's last BN where γ is initialized to be 0 46 | """ 47 | ret = l + resnet_shortcut(shortcut, ch_out * 4, stride, activation=get_bn(zero_init=False)) 48 | return tf.nn.relu(ret, name='block_output') 49 | 50 | 51 | def resnet_group(name, l, block_func, features, count, stride): 52 | with tf.variable_scope(name): 53 | for i in range(0, count): 54 | with tf.variable_scope('block{}'.format(i)): 55 | current_stride = stride if i == 0 else 1 56 | l = block_func(l, features, current_stride) 57 | return l 58 | 59 | 60 | def resnet_backbone(image, num_blocks, group_func, block_func): 61 | with argscope([Conv2D, MaxPooling, AvgPooling, GlobalAvgPooling, BatchNorm], data_format='NCHW'), \ 62 | argscope(Conv2D, use_bias=False, 63 | kernel_initializer=tf.variance_scaling_initializer(scale=2.0, mode='fan_out')): 64 | l = Conv2D('conv0', image, 64, 7, strides=2, activation=BNReLU) 65 | l = MaxPooling('pool0', l, pool_size=3, strides=2, padding='SAME') 66 | l = group_func('group0', l, block_func, 64, num_blocks[0], 1) 67 | l = group_func('group1', l, block_func, 128, num_blocks[1], 2) 68 | l = group_func('group2', l, block_func, 256, num_blocks[2], 2) 69 | l = group_func('group3', l, block_func, 512, num_blocks[3], 2) 70 | l = GlobalAvgPooling('gap', l) 71 | logits = FullyConnected('linear', l, 1000, 72 | kernel_initializer=tf.random_normal_initializer(stddev=0.01)) 73 | """ 74 | ImageNet in 1 Hour, Sec 5.1: 75 | The 1000-way fully-connected layer is initialized by 76 | drawing weights from a zero-mean Gaussian with standard deviation of 0.01 77 | """ 78 | return logits 79 | 80 | 81 | def denoising(name, l, embed=True, softmax=True): 82 | """ 83 | Feature Denoising, Fig 4 & 5. 84 | """ 85 | with tf.variable_scope(name): 86 | f = non_local_op(l, embed=embed, softmax=softmax) 87 | f = Conv2D('conv', f, l.shape[1], 1, strides=1, activation=get_bn(zero_init=True)) 88 | l = l + f 89 | return l 90 | 91 | 92 | def non_local_op(l, embed, softmax): 93 | """ 94 | Feature Denoising, Sec 4.2 & Fig 5. 95 | Args: 96 | embed (bool): whether to use embedding on theta & phi 97 | softmax (bool): whether to use gaussian (softmax) version or the dot-product version. 
98 | """ 99 | n_in, H, W = l.shape.as_list()[1:] 100 | if embed: 101 | theta = Conv2D('embedding_theta', l, n_in / 2, 1, 102 | strides=1, kernel_initializer=tf.random_normal_initializer(stddev=0.01)) 103 | phi = Conv2D('embedding_phi', l, n_in / 2, 1, 104 | strides=1, kernel_initializer=tf.random_normal_initializer(stddev=0.01)) 105 | g = l 106 | else: 107 | theta, phi, g = l, l, l 108 | if n_in > H * W or softmax: 109 | f = tf.einsum('niab,nicd->nabcd', theta, phi) 110 | if softmax: 111 | orig_shape = tf.shape(f) 112 | f = tf.reshape(f, [-1, H * W, H * W]) 113 | f = f / tf.sqrt(tf.cast(theta.shape[1], theta.dtype)) 114 | f = tf.nn.softmax(f) 115 | f = tf.reshape(f, orig_shape) 116 | f = tf.einsum('nabcd,nicd->niab', f, g) 117 | else: 118 | f = tf.einsum('nihw,njhw->nij', phi, g) 119 | f = tf.einsum('nij,nihw->njhw', f, theta) 120 | if not softmax: 121 | f = f / tf.cast(H * W, f.dtype) 122 | return tf.reshape(f, tf.shape(l)) 123 | -------------------------------------------------------------------------------- /slurm/eval.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | # Copyright (c) Facebook, Inc. and its affiliates. 3 | # All rights reserved. 4 | # 5 | # This source code is licensed under the license found in the 6 | # LICENSE file in the root directory of this source tree. 7 | #SBATCH --output=logs/job-%j.%N.out 8 | #SBATCH --error=logs/job-%j.%N.err 9 | #SBATCH --ntasks-per-node=8 # 8 tasks per node 10 | #SBATCH --gres=gpu:8 # 8 GPUs per node 11 | #SBATCH --cpus-per-task=10 # 80/8 cpus per task 12 | #SBATCH --mem=200G # ask for 200G 13 | 14 | # To run on 4 nodes x 8 GPUs: use "mkdir -p logs && sbatch --nodes=4 slurm.script" 15 | 16 | echo "NNODES: $SLURM_NNODES" 17 | echo "JOBID: $SLURM_JOB_ID" 18 | env | grep PATH 19 | 20 | export TENSORPACK_PROGRESS_REFRESH=20 21 | export TENSORPACK_SERIALIZE=msgpack 22 | 23 | DATA_PATH=~/data/imagenet 24 | BATCH=32 25 | CONFIG=$1 26 | 27 | # launch eval 28 | # https://www.open-mpi.org/faq/?category=openfabrics#ib-router has document on IB options 29 | # the queue parameters sometimes can hang the communication (for some MPI versions and some operations) 30 | mpirun -output-filename logs/eval-$SLURM_JOB_ID.log -tag-output \ 31 | -bind-to none -map-by slot \ 32 | -mca pml ob1 -mca btl_openib_receive_queues P,128,32:P,2048,32:P,12288,32:P,65536,32 \ 33 | -x NCCL_IB_CUDA_SUPPORT=1 -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO \ 34 | python ./main.py --eval --data $DATA_PATH --batch $BATCH $CONFIG 35 | -------------------------------------------------------------------------------- /slurm/train.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash -e 2 | # Copyright (c) Facebook, Inc. and its affiliates. 3 | # All rights reserved. 4 | # 5 | # This source code is licensed under the license found in the 6 | # LICENSE file in the root directory of this source tree. 
7 | #SBATCH --output=logs/job-%j.%N.out
8 | #SBATCH --error=logs/job-%j.%N.err
9 | #SBATCH --ntasks-per-node=8    # 8 tasks per node
10 | #SBATCH --gres=gpu:8           # 8 GPUs per node
11 | #SBATCH --cpus-per-task=10     # 80/8 cpus per task
12 | #SBATCH --mem=200G
13 | 
14 | # To run on 4 nodes x 8 GPUs: use "mkdir -p logs && sbatch --nodes=4 slurm/train.sh"
15 | 
16 | echo "NNODES: $SLURM_NNODES"
17 | echo "JOBID: $SLURM_JOB_ID"
18 | env | grep PATH
19 | 
20 | export TENSORPACK_PROGRESS_REFRESH=20
21 | export TENSORPACK_SERIALIZE=msgpack
22 | 
23 | DATA_PATH=~/data/imagenet
24 | BATCH=32
25 | CONFIG=$1
26 | 
27 | # launch the data-serving processes (one per node)
28 | srun --output=logs/data-%J.%N.log \
29 |     --error=logs/data-%J.%N.err \
30 |     --gres=gpu:0 --cpus-per-task=60 --mincpus 60 \
31 |     --ntasks=$SLURM_NNODES --ntasks-per-node=1 \
32 |     python ./third_party/serve-data.py --data $DATA_PATH --batch $BATCH &
33 | DATA_PID=$!
34 | 
35 | # launch training
36 | # https://www.open-mpi.org/faq/?category=openfabrics#ib-router documents the InfiniBand options below
37 | # these queue parameters can sometimes hang the communication (with some MPI versions and some operations)
38 | #-mca btl tcp,self \
39 | mpirun -output-filename logs/train-$SLURM_JOB_ID.log -tag-output \
40 |     -bind-to none -map-by slot \
41 |     -mca pml ob1 -mca btl_openib_receive_queues P,128,32:P,2048,32:P,12288,32:P,65536,32 \
42 |     -x NCCL_IB_CUDA_SUPPORT=1 -x NCCL_IB_DISABLE=0 -x NCCL_DEBUG=INFO \
43 |     python ./main.py --data $DATA_PATH --batch $BATCH $CONFIG &
44 | MPI_PID=$!
45 | 
46 | wait $MPI_PID
47 | 
--------------------------------------------------------------------------------
/teaser.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/ImageNet-Adversarial-Training/1ad68f08b8533083b0b8823ac3fd85cede191646/teaser.jpg
--------------------------------------------------------------------------------
/third_party/README.md:
--------------------------------------------------------------------------------
1 | 
2 | Utilities for ImageNet training & distributed evaluation.
3 | 
4 | Copied from https://github.com/tensorpack/benchmarks/tree/master/ResNet-Horovod.
--------------------------------------------------------------------------------
/third_party/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/facebookresearch/ImageNet-Adversarial-Training/1ad68f08b8533083b0b8823ac3fd85cede191646/third_party/__init__.py
--------------------------------------------------------------------------------
/third_party/imagenet_utils.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # File: imagenet_utils.py
4 | 
5 | 
6 | import multiprocessing
7 | import numpy as np
8 | from abc import abstractmethod
9 | 
10 | import cv2
11 | import tensorflow as tf
12 | 
13 | from tensorpack import imgaug, dataset, ModelDesc
14 | from tensorpack.dataflow import (
15 |     BatchData, MultiThreadMapData, DataFromList)
16 | from tensorpack.predict import PredictConfig, SimpleDatasetPredictor
17 | from tensorpack.utils.stats import RatioCounter
18 | from tensorpack.models import regularize_cost
19 | from tensorpack.tfutils.summary import add_moving_summary
20 | from tensorpack.utils import logger
21 | 
22 | 
23 | def fbresnet_augmentor(isTrain):
24 |     """
25 |     Augmentor used in fb.resnet.torch, for BGR images in range [0,255].
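    A minimal usage sketch (the input is a BGR uint8 image, e.g. from
    cv2.imread; 'example.jpg' is a placeholder path):

        augs = imgaug.AugmentorList(fbresnet_augmentor(isTrain=False))
        out = augs.augment(cv2.imread('example.jpg', cv2.IMREAD_COLOR))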
26 | """ 27 | if isTrain: 28 | augmentors = [ 29 | imgaug.GoogleNetRandomCropAndResize(interp=cv2.INTER_CUBIC), 30 | # It's OK to remove the following augs if your CPU is not fast enough. 31 | # Removing brightness/contrast/saturation does not have a significant effect on accuracy. 32 | # Removing lighting leads to a tiny drop in accuracy. 33 | imgaug.RandomOrderAug( 34 | [imgaug.BrightnessScale((0.6, 1.4), clip=False), 35 | imgaug.Contrast((0.6, 1.4), clip=False), 36 | imgaug.Saturation(0.4, rgb=False), 37 | # rgb-bgr conversion for the constants copied from fb.resnet.torch 38 | imgaug.Lighting(0.1, 39 | eigval=np.asarray( 40 | [0.2175, 0.0188, 0.0045][::-1]) * 255.0, 41 | eigvec=np.array( 42 | [[-0.5675, 0.7192, 0.4009], 43 | [-0.5808, -0.0045, -0.8140], 44 | [-0.5836, -0.6948, 0.4203]], 45 | dtype='float32')[::-1, ::-1] 46 | )]), 47 | imgaug.Flip(horiz=True), 48 | ] 49 | else: 50 | augmentors = [ 51 | imgaug.ResizeShortestEdge(256, cv2.INTER_CUBIC), 52 | imgaug.CenterCrop((224, 224)), 53 | ] 54 | return augmentors 55 | 56 | 57 | def get_val_dataflow( 58 | datadir, batch_size, 59 | augmentors=None, parallel=None, 60 | num_splits=None, split_index=None): 61 | if augmentors is None: 62 | augmentors = fbresnet_augmentor(False) 63 | assert datadir is not None 64 | assert isinstance(augmentors, list) 65 | if parallel is None: 66 | parallel = min(40, multiprocessing.cpu_count()) 67 | 68 | if num_splits is None: 69 | ds = dataset.ILSVRC12Files(datadir, 'val', shuffle=False) 70 | else: 71 | # shard validation data 72 | assert split_index < num_splits 73 | files = dataset.ILSVRC12Files(datadir, 'val', shuffle=False) 74 | files.reset_state() 75 | files = list(files.get_data()) 76 | logger.info("Number of validation data = {}".format(len(files))) 77 | split_size = len(files) // num_splits 78 | start, end = split_size * split_index, split_size * (split_index + 1) 79 | end = min(end, len(files)) 80 | logger.info("Local validation split = {} - {}".format(start, end)) 81 | files = files[start: end] 82 | ds = DataFromList(files, shuffle=False) 83 | aug = imgaug.AugmentorList(augmentors) 84 | 85 | def mapf(dp): 86 | fname, cls = dp 87 | im = cv2.imread(fname, cv2.IMREAD_COLOR) 88 | im = aug.augment(im) 89 | return im, cls 90 | ds = MultiThreadMapData(ds, parallel, mapf, 91 | buffer_size=min(2000, ds.size()), strict=True) 92 | ds = BatchData(ds, batch_size, remainder=True) 93 | # do not fork() under MPI 94 | return ds 95 | 96 | 97 | def eval_on_ILSVRC12(model, sessinit, dataflow): 98 | pred_config = PredictConfig( 99 | model=model, 100 | session_init=sessinit, 101 | input_names=['input', 'label'], 102 | output_names=['wrong-top1', 'wrong-top5', 'attack_success'] 103 | ) 104 | pred = SimpleDatasetPredictor(pred_config, dataflow) 105 | acc1, acc5, succ = RatioCounter(), RatioCounter(), RatioCounter() 106 | for top1, top5, num_succ in pred.get_result(): 107 | batch_size = top1.shape[0] 108 | acc1.feed(top1.sum(), batch_size) 109 | acc5.feed(top5.sum(), batch_size) 110 | succ.feed(num_succ.sum(), batch_size) 111 | # Uncomment to monitor the metrics during evaluation 112 | # print("Top1 Error: {}".format(acc1.ratio)) 113 | # print("Attack Success Rate: {}".format(succ.ratio)) 114 | print("Top1 Error: {}".format(acc1.ratio)) 115 | print("Attack Success Rate: {}".format(succ.ratio)) 116 | print("Top5 Error: {}".format(acc5.ratio)) 117 | 118 | 119 | class ImageNetModel(ModelDesc): 120 | image_shape = 224 121 | 122 | """ 123 | uint8 instead of float32 is used as input type to reduce copy overhead. 
124 |     Using uint8 might hurt the performance a little bit.
125 |     """
126 |     image_dtype = tf.uint8
127 | 
128 |     """
129 |     Either 'NCHW' or 'NHWC'
130 |     """
131 |     data_format = 'NCHW'
132 | 
133 |     """
134 |     Whether the image is BGR or RGB. If using DataFlow, then it should be BGR.
135 |     """
136 |     image_bgr = True
137 | 
138 |     weight_decay = 1e-4
139 | 
140 |     """
141 |     To also apply weight decay to normalization parameters, use '.*/W|.*/gamma|.*/beta'
142 |     """
143 |     weight_decay_pattern = '.*/W'
144 | 
145 |     """
146 |     Scale the loss, e.g., for gradient averaging or fp16 training
147 |     """
148 |     loss_scale = 1.
149 | 
150 |     """
151 |     Label smoothing (see tf.losses.softmax_cross_entropy)
152 |     """
153 |     label_smoothing = 0.
154 | 
155 |     def inputs(self):
156 |         return [tf.placeholder(self.image_dtype, [None, self.image_shape, self.image_shape, 3], 'input'),
157 |                 tf.placeholder(tf.int32, [None], 'label')]
158 | 
159 |     def build_graph(self, image, label):
160 |         image = self.image_preprocess(image)
161 |         assert self.data_format == 'NCHW'
162 |         image = tf.transpose(image, [0, 3, 1, 2])
163 | 
164 |         logits = self.get_logits(image)
165 |         loss = ImageNetModel.compute_loss_and_error(
166 |             logits, label, label_smoothing=self.label_smoothing)
167 | 
168 |         if self.weight_decay > 0:
169 |             wd_loss = regularize_cost(self.weight_decay_pattern,
170 |                                       tf.contrib.layers.l2_regularizer(self.weight_decay),
171 |                                       name='l2_regularize_loss')
172 |             add_moving_summary(loss, wd_loss)
173 |             total_cost = tf.add_n([loss, wd_loss], name='cost')
174 |         else:
175 |             total_cost = tf.identity(loss, name='cost')
176 |             add_moving_summary(total_cost)
177 | 
178 |         if self.loss_scale != 1.:
179 |             logger.info("Scaling the total loss by {} ...".format(self.loss_scale))
180 |             return total_cost * self.loss_scale
181 |         else:
182 |             return total_cost
183 | 
184 |     @abstractmethod
185 |     def get_logits(self, image):
186 |         """
187 |         Args:
188 |             image: 4D tensor of ``self.image_shape`` in ``self.data_format``
189 | 
190 |         Returns:
191 |             Nx#class logits
192 |         """
193 | 
194 |     def optimizer(self):
195 |         lr = tf.get_variable('learning_rate', initializer=0.1, trainable=False)
196 |         tf.summary.scalar('learning_rate-summary', lr)
197 |         return tf.train.MomentumOptimizer(lr, 0.9, use_nesterov=True)
198 | 
199 |     def image_preprocess(self, image):
200 |         with tf.name_scope('image_preprocess'):
201 |             if image.dtype.base_dtype != tf.float32:
202 |                 image = tf.cast(image, tf.float32)
203 |             mean = [0.485, 0.456, 0.406]  # rgb
204 |             std = [0.229, 0.224, 0.225]
205 |             if self.image_bgr:
206 |                 mean = mean[::-1]
207 |                 std = std[::-1]
208 |             image_mean = tf.constant(mean, dtype=tf.float32) * 255.
209 |             image_std = tf.constant(std, dtype=tf.float32) * 255.
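            # per-channel standardization in the 0-255 range: (x - 255*mean) / (255*std)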
210 |             image = (image - image_mean) / image_std
211 |             return image
212 | 
213 |     @staticmethod
214 |     def compute_loss_and_error(logits, label, label_smoothing=0.):
215 |         if label_smoothing == 0.:
216 |             loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label)
217 |         else:
218 |             nclass = logits.shape[-1]
219 |             loss = tf.losses.softmax_cross_entropy(
220 |                 tf.one_hot(label, nclass),
221 |                 logits, label_smoothing=label_smoothing,
222 |                 reduction=tf.losses.Reduction.NONE)
223 |         loss = tf.reduce_mean(loss, name='xentropy-loss')
224 | 
225 |         def prediction_incorrect(logits, label, topk=1, name='incorrect_vector'):
226 |             with tf.name_scope('prediction_incorrect'):
227 |                 x = tf.logical_not(tf.nn.in_top_k(logits, label, topk))
228 |             return tf.cast(x, tf.float32, name=name)
229 | 
230 |         wrong = prediction_incorrect(logits, label, 1, name='wrong-top1')
231 |         add_moving_summary(tf.reduce_mean(wrong, name='train-error-top1'))
232 | 
233 |         wrong = prediction_incorrect(logits, label, 5, name='wrong-top5')
234 |         add_moving_summary(tf.reduce_mean(wrong, name='train-error-top5'))
235 |         return loss
236 | 
--------------------------------------------------------------------------------
/third_party/serve-data.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | # -*- coding: utf-8 -*-
3 | # File: serve-data.py
4 | 
5 | import argparse
6 | import os
7 | import multiprocessing as mp
8 | import socket
9 | 
10 | from tensorpack.dataflow import (
11 |     send_dataflow_zmq, MapData, TestDataSpeed, FakeData, dataset,
12 |     AugmentImageComponent, BatchData, PrefetchDataZMQ)
13 | from tensorpack.utils import logger
14 | from imagenet_utils import fbresnet_augmentor
15 | 
16 | 
17 | def get_data(batch, augmentors):
18 |     """
19 |     ImageNet in 1 Hour, Sec 3, Remark 4:
20 |     Use a single random shuffling of the training data (per epoch) that is divided amongst all k workers.
21 | 
22 |     NOTE: Here we do not follow the paper, but it makes little difference.
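    The batched stream built below is pushed over ZMQ in __main__ at the
    bottom of this file (send_dataflow_zmq); the trainer is expected to pull
    from the same ipc://@imagenet-train-b{batch} address.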
23 | """ 24 | ds = dataset.ILSVRC12(args.data, 'train', shuffle=True) 25 | ds = AugmentImageComponent(ds, augmentors, copy=False) 26 | ds = BatchData(ds, batch, remainder=False) 27 | ds = PrefetchDataZMQ(ds, min(50, mp.cpu_count())) 28 | return ds 29 | 30 | 31 | if __name__ == '__main__': 32 | parser = argparse.ArgumentParser() 33 | parser.add_argument('--data', help='ILSVRC dataset dir') 34 | parser.add_argument('--fake', action='store_true') 35 | parser.add_argument('--batch', help='per-GPU batch size', 36 | default=32, type=int) 37 | parser.add_argument('--benchmark', action='store_true') 38 | parser.add_argument('--no-zmq-ops', action='store_true') 39 | args = parser.parse_args() 40 | 41 | os.environ['CUDA_VISIBLE_DEVICES'] = '' 42 | 43 | if args.fake: 44 | ds = FakeData( 45 | [[args.batch, 224, 224, 3], [args.batch]], 46 | 1000, random=False, dtype=['uint8', 'int32']) 47 | else: 48 | augs = fbresnet_augmentor(True) 49 | ds = get_data(args.batch, augs) 50 | 51 | logger.info("Serving data on {}".format(socket.gethostname())) 52 | 53 | if args.benchmark: 54 | from zmq_ops import dump_arrays 55 | ds = MapData(ds, dump_arrays) 56 | TestDataSpeed(ds, warmup=300).start() 57 | else: 58 | format = None if args.no_zmq_ops else 'zmq_ops' 59 | send_dataflow_zmq( 60 | ds, 'ipc://@imagenet-train-b{}'.format(args.batch), 61 | hwm=150, format=format, bind=True) 62 | -------------------------------------------------------------------------------- /third_party/utils.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import horovod.tensorflow as hvd 3 | import tensorflow as tf 4 | 5 | from tensorpack.callbacks import Inferencer 6 | from tensorpack.utils.stats import RatioCounter 7 | 8 | 9 | class HorovodClassificationError(Inferencer): 10 | """ 11 | Like ClassificationError, it evaluates total samples & count of incorrect or correct samples. 12 | But in the end we aggregate the total&count by horovod. 13 | """ 14 | def __init__(self, wrong_tensor_name, summary_name='validation_error'): 15 | """ 16 | Args: 17 | wrong_tensor_name(str): name of the ``wrong`` binary vector tensor. 18 | summary_name(str): the name to log the error with. 19 | """ 20 | self.wrong_tensor_name = wrong_tensor_name 21 | self.summary_name = summary_name 22 | 23 | def _setup_graph(self): 24 | self._placeholder = tf.placeholder(tf.float32, shape=[2], name='to_be_reduced') 25 | self._reduced = hvd.allreduce(self._placeholder, average=False) 26 | 27 | def _before_inference(self): 28 | self.err_stat = RatioCounter() 29 | 30 | def _get_fetches(self): 31 | return [self.wrong_tensor_name] 32 | 33 | def _on_fetches(self, outputs): 34 | vec = outputs[0] 35 | batch_size = len(vec) 36 | wrong = np.sum(vec) 37 | self.err_stat.feed(wrong, batch_size) 38 | # Uncomment this to monitor the metric during evaluation 39 | # print(self.summary_name, self.err_stat.ratio) 40 | 41 | def _after_inference(self): 42 | tot = self.err_stat.total 43 | cnt = self.err_stat.count 44 | tot, cnt = self._reduced.eval(feed_dict={self._placeholder: [tot, cnt]}) 45 | return {self.summary_name: cnt * 1. / tot} 46 | -------------------------------------------------------------------------------- /tox.ini: -------------------------------------------------------------------------------- 1 | # Copyright (c) Facebook, Inc. and its affiliates. 2 | # All rights reserved. 3 | # 4 | # This source code is licensed under the license found in the 5 | # LICENSE file in the root directory of this source tree. 
6 | 
7 | [flake8]
8 | max-line-length = 120
9 | ignore = F403,F405,E402,E741,E742,E743
--------------------------------------------------------------------------------
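A minimal NumPy sketch (not part of the repository; it assumes nothing beyond NumPy) that checks the channel-space shortcut used by non_local_op in resnet_model.py for the dot-product version (embed=False, softmax=False, n_in <= H*W): two einsums over a CxC Gram matrix give the same result as the direct definition y_p = (1/(H*W)) * sum_q (x_p . x_q) * x_q over all spatial positions q.

import numpy as np

N, C, H, W = 2, 4, 5, 5  # C <= H*W, so non_local_op would take the channel-space path
x = np.random.randn(N, C, H, W).astype(np.float32)

# channel-space path, as in non_local_op (theta = phi = g = x)
gram = np.einsum('nihw,njhw->nij', x, x)            # C x C per sample
y_fast = np.einsum('nij,nihw->njhw', gram, x) / (H * W)

# direct O((H*W)^2) definition over pairwise dot products
xs = x.reshape(N, C, H * W)
sim = np.einsum('nip,niq->npq', xs, xs)             # (H*W) x (H*W) per sample
y_ref = (np.einsum('npq,njq->njp', sim, xs) / (H * W)).reshape(N, C, H, W)

assert np.allclose(y_fast, y_ref, atol=1e-3)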