├── .github
    └── workflows
    │   └── build.yml
├── .gitignore
├── .latexmkrc
├── Makefile
├── README.md
├── abstract.tex
├── iclr2019_conference.bst
├── iclr2019_conference.sty
├── paper.bib
├── paper.tex
└── version.sh


/.github/workflows/build.yml:
--------------------------------------------------------------------------------
 1 | name: CI
 2 | on:
 3 |   push:
 4 |     branches:
 5 |       - master
 6 | jobs:
 7 |   build:
 8 |     runs-on: ubuntu-latest
 9 |     steps:
10 |       - run: sudo apt-get update -y
11 |       - run: >-
12 |           sudo apt-get install -y
13 |           texlive-latex-extra
14 |           texlive-fonts-recommended
15 |           latexmk
16 |       - uses: actions/checkout@v2
17 |       - run: make paper.pdf
18 |       - id: create_release
19 |         uses: actions/create-release@v1
20 |         env:
21 |           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
22 |         with:
23 |           tag_name: release-${{ github.sha }}
24 |           release_name: Release ${{ github.sha }}
25 |           draft: false
26 |           prerelease: false
27 |       - uses: actions/upload-release-asset@v1
28 |         env:
29 |           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
30 |         with:
31 |           upload_url: ${{ steps.create_release.outputs.upload_url }}
32 |           asset_path: ./paper.pdf
33 |           asset_name: paper.pdf
34 |           asset_content_type: application/pdf
35 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
 1 | # output
 2 | *.dvi
 3 | *.ps
 4 | *.pdf
 5 | abstract.txt
 6 | 
 7 | # intermediate files
 8 | *.aux
 9 | *.log
10 | *.blg
11 | *.bbl
12 | *.ent
13 | *.out
14 | *.gz
15 | *.fls
16 | *.fdb_latexmk
17 | version.tex
18 | 


--------------------------------------------------------------------------------
/.latexmkrc:
--------------------------------------------------------------------------------
1 | $bibtex_use = 2;
2 | 


--------------------------------------------------------------------------------
/Makefile:
--------------------------------------------------------------------------------
 1 | # phony targets
 2 | 
 3 | all: paper.pdf abstract.txt
 4 | 
 5 | clean:
 6 | 	latexmk -pdf -C
 7 | 	rm -rf *.txt version.tex
 8 | 
 9 | .PHONY: all clean FORCE
10 | 
11 | # main targets
12 | 
13 | arxiv.tar.gz: paper.pdf # just build the paper because we want to build the .bbl
14 | 	tar czvf paper.tar.gz paper.tex paper.bbl $(DEPS)
15 | 
16 | %.txt: %.tex
17 | 	pandoc $< -o $@ -f latex -t plain --wrap=none
18 | 
19 | paper.pdf: FORCE
20 | 	sh version.sh version.tex
21 | 	latexmk -pdflatex='pdflatex -interaction nonstopmode' -pdf paper.tex
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # On Evaluating Adversarial Robustness
 2 | 
 3 | This repository contains the LaTeX source for the paper [On Evaluating Adversarial Robustness](https://github.com/evaluating-adversarial-robustness/adv-eval-paper/releases/latest/). It is a paper written with the intention of helping everyone---from those designing their own neural networks, to those reviewing defense papers, to those just wondering what goes into a defense evaluation---learn more about methods for evaluating adversarial robustness.
 4 | 
 5 | ## This is a Living Document
 6 | 
 7 | We do not intend for this to be a traditional paper where it is written once and never updated. While the fundamentals for how to evaluate adversarial robustness will not change, most of the specific advice we give today on evaluating adversarial robustness may quickly become out of date. We therefore expect to update this document from time to time in order to match the currently accepted best practices in the research community.
 8 | 
 9 | ### Abstract
10 | 
11 | Correctly evaluating defenses against adversarial examples has proven to be extremely difficult. Despite the significant amount of recent work attempting to design defenses that withstand adaptive attacks, few have succeeded; most papers that propose defenses are quickly shown to be incorrect.
12 | 
13 | We believe a large contributing factor is the difficulty of performing security evaluations. In this paper, we discuss the methodological foundations, review commonly accepted best practices, and suggest new methods for evaluating defenses to adversarial examples. We hope that both researchers developing defenses as well as readers and reviewers who wish to understand the completeness of an evaluation consider our advice in order to avoid common pitfalls.
14 | 
15 | 
16 | ## Contributing
17 | 
18 | We welcome any contributions to the paper through both issues and pull requests. Please prefer issues for topics which warrant initial discussion (such as suggesting a new item to be added to the checklist) and pull requests for changes that will require less discussion (fixing typos or writing content for a topic discussed previously in an issue).
19 | 
20 | 
21 | ### Contributors
22 | - Nicholas Carlini (Google Brain)
23 | - Anish Athalye (MIT)
24 | - Nicolas Papernot (Google Brain)
25 | - Wieland Brendel (University of Tubingen)
26 | - Jonas Rauber (University of Tubingen)
27 | - Dimitris Tsipras (MIT)
28 | - Ian Goodfellow (Google Brain)
29 | - Aleksander Madry (MIT)
30 | - Alexey Kurakin (Google Brain)
31 | 
32 | NOTE: contributors are ordered according to the amount of their contribution
33 | to the text of the paper, similar to the
34 | [Cleverhans tech report](https://github.com/tensorflow/cleverhans#authors).
35 | List of contributors may be expanded and order
36 | may change with the new revisions of the paper.
37 | 
38 | ## Changelog
39 | 
40 | 2018-02-20: Explain author order (#5)
41 | 
42 | 2018-02-18: Initial Revision
43 | 
44 | ## Citation
45 | 
46 | If you use this paper in academic research, you may cite the following:
47 | 
48 | ```
49 | @article{carlini2019evaluating,
50 |   title={On Evaluating Adversarial Robustness},
51 |   author={Carlini, Nicholas and Athalye, Anish and Papernot, Nicolas and Brendel, Wieland and Rauber, Jonas and Tsipras, Dimitris and Goodfellow, Ian and Madry, Aleksander and Kurakin, Alexey},
52 |   journal={arXiv preprint arXiv:1902.06705},
53 |   year={2019}
54 | }
55 | ```
56 | 


--------------------------------------------------------------------------------
/abstract.tex:
--------------------------------------------------------------------------------
 1 | Correctly evaluating defenses against adversarial examples has proven
 2 | to be extremely difficult.
 3 | %
 4 | Despite the significant amount of recent work attempting
 5 | to design defenses that withstand adaptive attacks, few have
 6 | succeeded; most papers that propose
 7 | defenses are quickly shown to be incorrect.
 8 | 
 9 | We believe a large contributing factor is the difficulty of performing
10 | security evaluations.
11 | %
12 | In this paper, we discuss the methodological foundations,
13 | review commonly accepted best
14 | practices, and suggest new methods for evaluating defenses to
15 | adversarial examples.
16 | %
17 | We hope that both researchers developing defenses
18 | as well as readers and reviewers who wish to understand the
19 | completeness of an evaluation
20 | consider our advice in order to avoid common pitfalls.
21 | 


--------------------------------------------------------------------------------
/iclr2019_conference.bst:
--------------------------------------------------------------------------------
   1 | %% File: `iclr2017.bst'
   2 | %% A copy of iclm2010.bst, which is a modification of `plainnl.bst' for use with natbib package 
   3 | %%
   4 | %% Copyright 2010 Hal Daum\'e III
   5 | %% Modified by J. F�rnkranz
   6 | %% - Changed labels from (X and Y, 2000) to (X & Y, 2000)
   7 | %%
   8 | %% Copyright 1993-2007 Patrick W Daly
   9 | %% Max-Planck-Institut f\"ur Sonnensystemforschung
  10 | %% Max-Planck-Str. 2
  11 | %% D-37191 Katlenburg-Lindau
  12 | %% Germany
  13 | %% E-mail: daly@mps.mpg.de
  14 | %%
  15 | %% This program can be redistributed and/or modified under the terms
  16 | %% of the LaTeX Project Public License Distributed from CTAN
  17 | %% archives in directory macros/latex/base/lppl.txt; either
  18 | %% version 1 of the License, or any later version.
  19 | %%
  20 |  % Version and source file information:
  21 |  % \ProvidesFile{icml2010.mbs}[2007/11/26 1.93 (PWD)]
  22 |  %
  23 |  % BibTeX `plainnat' family
  24 |  %   version 0.99b for BibTeX versions 0.99a or later,
  25 |  %   for LaTeX versions 2.09 and 2e.
  26 |  %
  27 |  % For use with the `natbib.sty' package; emulates the corresponding
  28 |  %   member of the `plain' family, but with author-year citations.
  29 |  %
  30 |  % With version 6.0 of `natbib.sty', it may also be used for numerical
  31 |  %   citations, while retaining the commands \citeauthor, \citefullauthor,
  32 |  %   and \citeyear to print the corresponding information.
  33 |  %
  34 |  % For version 7.0 of `natbib.sty', the KEY field replaces missing
  35 |  %   authors/editors, and the date is left blank in \bibitem.
  36 |  %
  37 |  % Includes field EID for the sequence/citation number of electronic journals
  38 |  %  which is used instead of page numbers.
  39 |  %
  40 |  % Includes fields ISBN and ISSN.
  41 |  %
  42 |  % Includes field URL for Internet addresses.
  43 |  %
  44 |  % Includes field DOI for Digital Object Idenfifiers.
  45 |  %
  46 |  % Works best with the url.sty package of Donald Arseneau.
  47 |  %
  48 |  % Works with identical authors and year are further sorted by
  49 |  %   citation key, to preserve any natural sequence.
  50 |  %
  51 | ENTRY
  52 |   { address
  53 |     author
  54 |     booktitle
  55 |     chapter
  56 |     doi
  57 |     eid
  58 |     edition
  59 |     editor
  60 |     howpublished
  61 |     institution
  62 |     isbn
  63 |     issn
  64 |     journal
  65 |     key
  66 |     month
  67 |     note
  68 |     number
  69 |     organization
  70 |     pages
  71 |     publisher
  72 |     school
  73 |     series
  74 |     title
  75 |     type
  76 |     url
  77 |     volume
  78 |     year
  79 |   }
  80 |   {}
  81 |   { label extra.label sort.label short.list }
  82 | 
  83 | INTEGERS { output.state before.all mid.sentence after.sentence after.block }
  84 | 
  85 | FUNCTION {init.state.consts}
  86 | { #0 'before.all :=
  87 |   #1 'mid.sentence :=
  88 |   #2 'after.sentence :=
  89 |   #3 'after.block :=
  90 | }
  91 | 
  92 | STRINGS { s t }
  93 | 
  94 | FUNCTION {output.nonnull}
  95 | { 's :=
  96 |   output.state mid.sentence =
  97 |     { ", " * write$ }
  98 |     { output.state after.block =
  99 |         { add.period$ write$
 100 |           newline$
 101 |           "\newblock " write$
 102 |         }
 103 |         { output.state before.all =
 104 |             'write$
 105 |             { add.period$ " " * write$ }
 106 |           if$
 107 |         }
 108 |       if$
 109 |       mid.sentence 'output.state :=
 110 |     }
 111 |   if$
 112 |   s
 113 | }
 114 | 
 115 | FUNCTION {output}
 116 | { duplicate$ empty$
 117 |     'pop$
 118 |     'output.nonnull
 119 |   if$
 120 | }
 121 | 
 122 | FUNCTION {output.check}
 123 | { 't :=
 124 |   duplicate$ empty$
 125 |     { pop$ "empty " t * " in " * cite$ * warning$ }
 126 |     'output.nonnull
 127 |   if$
 128 | }
 129 | 
 130 | FUNCTION {fin.entry}
 131 | { add.period$
 132 |   write$
 133 |   newline$
 134 | }
 135 | 
 136 | FUNCTION {new.block}
 137 | { output.state before.all =
 138 |     'skip$
 139 |     { after.block 'output.state := }
 140 |   if$
 141 | }
 142 | 
 143 | FUNCTION {new.sentence}
 144 | { output.state after.block =
 145 |     'skip$
 146 |     { output.state before.all =
 147 |         'skip$
 148 |         { after.sentence 'output.state := }
 149 |       if$
 150 |     }
 151 |   if$
 152 | }
 153 | 
 154 | FUNCTION {not}
 155 | {   { #0 }
 156 |     { #1 }
 157 |   if$
 158 | }
 159 | 
 160 | FUNCTION {and}
 161 | {   'skip$
 162 |     { pop$ #0 }
 163 |   if$
 164 | }
 165 | 
 166 | FUNCTION {or}
 167 | {   { pop$ #1 }
 168 |     'skip$
 169 |   if$
 170 | }
 171 | 
 172 | FUNCTION {new.block.checka}
 173 | { empty$
 174 |     'skip$
 175 |     'new.block
 176 |   if$
 177 | }
 178 | 
 179 | FUNCTION {new.block.checkb}
 180 | { empty$
 181 |   swap$ empty$
 182 |   and
 183 |     'skip$
 184 |     'new.block
 185 |   if$
 186 | }
 187 | 
 188 | FUNCTION {new.sentence.checka}
 189 | { empty$
 190 |     'skip$
 191 |     'new.sentence
 192 |   if$
 193 | }
 194 | 
 195 | FUNCTION {new.sentence.checkb}
 196 | { empty$
 197 |   swap$ empty$
 198 |   and
 199 |     'skip$
 200 |     'new.sentence
 201 |   if$
 202 | }
 203 | 
 204 | FUNCTION {field.or.null}
 205 | { duplicate$ empty$
 206 |     { pop$ "" }
 207 |     'skip$
 208 |   if$
 209 | }
 210 | 
 211 | FUNCTION {emphasize}
 212 | { duplicate$ empty$
 213 |     { pop$ "" }
 214 |     { "\emph{" swap$ * "}" * }
 215 |   if$
 216 | }
 217 | 
 218 | INTEGERS { nameptr namesleft numnames }
 219 | 
 220 | FUNCTION {format.names}
 221 | { 's :=
 222 |   #1 'nameptr :=
 223 |   s num.names$ 'numnames :=
 224 |   numnames 'namesleft :=
 225 |     { namesleft #0 > }
 226 |     { s nameptr "{ff~}{vv~}{ll}{, jj}" format.name$ 't :=
 227 |       nameptr #1 >
 228 |         { namesleft #1 >
 229 |             { ", " * t * }
 230 |             { numnames #2 >
 231 |                 { "," * }
 232 |                 'skip$
 233 |               if$
 234 |               t "others" =
 235 |                 { " et~al." * }
 236 |                 { " and " * t * }
 237 |               if$
 238 |             }
 239 |           if$
 240 |         }
 241 |         't
 242 |       if$
 243 |       nameptr #1 + 'nameptr :=
 244 |       namesleft #1 - 'namesleft :=
 245 |     }
 246 |   while$
 247 | }
 248 | 
 249 | FUNCTION {format.key}
 250 | { empty$
 251 |     { key field.or.null }
 252 |     { "" }
 253 |   if$
 254 | }
 255 | 
 256 | FUNCTION {format.authors}
 257 | { author empty$
 258 |     { "" }
 259 |     { author format.names }
 260 |   if$
 261 | }
 262 | 
 263 | FUNCTION {format.editors}
 264 | { editor empty$
 265 |     { "" }
 266 |     { editor format.names
 267 |       editor num.names$ #1 >
 268 |         { " (eds.)" * }
 269 |         { " (ed.)" * }
 270 |       if$
 271 |     }
 272 |   if$
 273 | }
 274 | 
 275 | FUNCTION {format.isbn}
 276 | { isbn empty$
 277 |     { "" }
 278 |     { new.block "ISBN " isbn * }
 279 |   if$
 280 | }
 281 | 
 282 | FUNCTION {format.issn}
 283 | { issn empty$
 284 |     { "" }
 285 |     { new.block "ISSN " issn * }
 286 |   if$
 287 | }
 288 | 
 289 | FUNCTION {format.url}
 290 | { url empty$
 291 |     { "" }
 292 |     { new.block "URL \url{" url * "}" * }
 293 |   if$
 294 | }
 295 | 
 296 | FUNCTION {format.doi}
 297 | { doi empty$
 298 |     { "" }
 299 |     { new.block "\doi{" doi * "}" * }
 300 |   if$
 301 | }
 302 | 
 303 | FUNCTION {format.title}
 304 | { title empty$
 305 |     { "" }
 306 |     { title "t" change.case$ }
 307 |   if$
 308 | }
 309 | 
 310 | FUNCTION {format.full.names}
 311 | {'s :=
 312 |   #1 'nameptr :=
 313 |   s num.names$ 'numnames :=
 314 |   numnames 'namesleft :=
 315 |     { namesleft #0 > }
 316 |     { s nameptr
 317 |       "{vv~}{ll}" format.name$ 't :=
 318 |       nameptr #1 >
 319 |         {
 320 |           namesleft #1 >
 321 |             { ", " * t * }
 322 |             {
 323 |               numnames #2 >
 324 |                 { "," * }
 325 |                 'skip$
 326 |               if$
 327 |               t "others" =
 328 |                 { " et~al." * }
 329 |                 { " and " * t * }
 330 |               if$
 331 |             }
 332 |           if$
 333 |         }
 334 |         't
 335 |       if$
 336 |       nameptr #1 + 'nameptr :=
 337 |       namesleft #1 - 'namesleft :=
 338 |     }
 339 |   while$
 340 | }
 341 | 
 342 | FUNCTION {author.editor.full}
 343 | { author empty$
 344 |     { editor empty$
 345 |         { "" }
 346 |         { editor format.full.names }
 347 |       if$
 348 |     }
 349 |     { author format.full.names }
 350 |   if$
 351 | }
 352 | 
 353 | FUNCTION {author.full}
 354 | { author empty$
 355 |     { "" }
 356 |     { author format.full.names }
 357 |   if$
 358 | }
 359 | 
 360 | FUNCTION {editor.full}
 361 | { editor empty$
 362 |     { "" }
 363 |     { editor format.full.names }
 364 |   if$
 365 | }
 366 | 
 367 | FUNCTION {make.full.names}
 368 | { type$ "book" =
 369 |   type$ "inbook" =
 370 |   or
 371 |     'author.editor.full
 372 |     { type$ "proceedings" =
 373 |         'editor.full
 374 |         'author.full
 375 |       if$
 376 |     }
 377 |   if$
 378 | }
 379 | 
 380 | FUNCTION {output.bibitem}
 381 | { newline$
 382 |   "\bibitem[" write$
 383 |   label write$
 384 |   ")" make.full.names duplicate$ short.list =
 385 |      { pop$ }
 386 |      { * }
 387 |    if$
 388 |   "]{" * write$
 389 |   cite$ write$
 390 |   "}" write$
 391 |   newline$
 392 |   ""
 393 |   before.all 'output.state :=
 394 | }
 395 | 
 396 | FUNCTION {n.dashify}
 397 | { 't :=
 398 |   ""
 399 |     { t empty$ not }
 400 |     { t #1 #1 substring$ "-" =
 401 |         { t #1 #2 substring$ "--" = not
 402 |             { "--" *
 403 |               t #2 global.max$ substring$ 't :=
 404 |             }
 405 |             {   { t #1 #1 substring$ "-" = }
 406 |                 { "-" *
 407 |                   t #2 global.max$ substring$ 't :=
 408 |                 }
 409 |               while$
 410 |             }
 411 |           if$
 412 |         }
 413 |         { t #1 #1 substring$ *
 414 |           t #2 global.max$ substring$ 't :=
 415 |         }
 416 |       if$
 417 |     }
 418 |   while$
 419 | }
 420 | 
 421 | FUNCTION {format.date}
 422 | { year duplicate$ empty$
 423 |     { "empty year in " cite$ * warning$
 424 |        pop$ "" }
 425 |     'skip$
 426 |   if$
 427 |   month empty$
 428 |     'skip$
 429 |     { month
 430 |       " " * swap$ *
 431 |     }
 432 |   if$
 433 |   extra.label *
 434 | }
 435 | 
 436 | FUNCTION {format.btitle}
 437 | { title emphasize
 438 | }
 439 | 
 440 | FUNCTION {tie.or.space.connect}
 441 | { duplicate$ text.length$ #3 <
 442 |     { "~" }
 443 |     { " " }
 444 |   if$
 445 |   swap$ * *
 446 | }
 447 | 
 448 | FUNCTION {either.or.check}
 449 | { empty$
 450 |     'pop$
 451 |     { "can't use both " swap$ * " fields in " * cite$ * warning$ }
 452 |   if$
 453 | }
 454 | 
 455 | FUNCTION {format.bvolume}
 456 | { volume empty$
 457 |     { "" }
 458 |     { "volume" volume tie.or.space.connect
 459 |       series empty$
 460 |         'skip$
 461 |         { " of " * series emphasize * }
 462 |       if$
 463 |       "volume and number" number either.or.check
 464 |     }
 465 |   if$
 466 | }
 467 | 
 468 | FUNCTION {format.number.series}
 469 | { volume empty$
 470 |     { number empty$
 471 |         { series field.or.null }
 472 |         { output.state mid.sentence =
 473 |             { "number" }
 474 |             { "Number" }
 475 |           if$
 476 |           number tie.or.space.connect
 477 |           series empty$
 478 |             { "there's a number but no series in " cite$ * warning$ }
 479 |             { " in " * series * }
 480 |           if$
 481 |         }
 482 |       if$
 483 |     }
 484 |     { "" }
 485 |   if$
 486 | }
 487 | 
 488 | FUNCTION {format.edition}
 489 | { edition empty$
 490 |     { "" }
 491 |     { output.state mid.sentence =
 492 |         { edition "l" change.case$ " edition" * }
 493 |         { edition "t" change.case$ " edition" * }
 494 |       if$
 495 |     }
 496 |   if$
 497 | }
 498 | 
 499 | INTEGERS { multiresult }
 500 | 
 501 | FUNCTION {multi.page.check}
 502 | { 't :=
 503 |   #0 'multiresult :=
 504 |     { multiresult not
 505 |       t empty$ not
 506 |       and
 507 |     }
 508 |     { t #1 #1 substring$
 509 |       duplicate$ "-" =
 510 |       swap$ duplicate$ "," =
 511 |       swap$ "+" =
 512 |       or or
 513 |         { #1 'multiresult := }
 514 |         { t #2 global.max$ substring$ 't := }
 515 |       if$
 516 |     }
 517 |   while$
 518 |   multiresult
 519 | }
 520 | 
 521 | FUNCTION {format.pages}
 522 | { pages empty$
 523 |     { "" }
 524 |     { pages multi.page.check
 525 |         { "pp.\ " pages n.dashify tie.or.space.connect }
 526 |         { "pp.\ " pages tie.or.space.connect }
 527 |       if$
 528 |     }
 529 |   if$
 530 | }
 531 | 
 532 | FUNCTION {format.eid}
 533 | { eid empty$
 534 |     { "" }
 535 |     { "art." eid tie.or.space.connect }
 536 |   if$
 537 | }
 538 | 
 539 | FUNCTION {format.vol.num.pages}
 540 | { volume field.or.null
 541 |   number empty$
 542 |     'skip$
 543 |     { "\penalty0 (" number * ")" * *
 544 |       volume empty$
 545 |         { "there's a number but no volume in " cite$ * warning$ }
 546 |         'skip$
 547 |       if$
 548 |     }
 549 |   if$
 550 |   pages empty$
 551 |     'skip$
 552 |     { duplicate$ empty$
 553 |         { pop$ format.pages }
 554 |         { ":\penalty0 " * pages n.dashify * }
 555 |       if$
 556 |     }
 557 |   if$
 558 | }
 559 | 
 560 | FUNCTION {format.vol.num.eid}
 561 | { volume field.or.null
 562 |   number empty$
 563 |     'skip$
 564 |     { "\penalty0 (" number * ")" * *
 565 |       volume empty$
 566 |         { "there's a number but no volume in " cite$ * warning$ }
 567 |         'skip$
 568 |       if$
 569 |     }
 570 |   if$
 571 |   eid empty$
 572 |     'skip$
 573 |     { duplicate$ empty$
 574 |         { pop$ format.eid }
 575 |         { ":\penalty0 " * eid * }
 576 |       if$
 577 |     }
 578 |   if$
 579 | }
 580 | 
 581 | FUNCTION {format.chapter.pages}
 582 | { chapter empty$
 583 |     'format.pages
 584 |     { type empty$
 585 |         { "chapter" }
 586 |         { type "l" change.case$ }
 587 |       if$
 588 |       chapter tie.or.space.connect
 589 |       pages empty$
 590 |         'skip$
 591 |         { ", " * format.pages * }
 592 |       if$
 593 |     }
 594 |   if$
 595 | }
 596 | 
 597 | FUNCTION {format.in.ed.booktitle}
 598 | { booktitle empty$
 599 |     { "" }
 600 |     { editor empty$
 601 |         { "In " booktitle emphasize * }
 602 |         { "In " format.editors * ", " * booktitle emphasize * }
 603 |       if$
 604 |     }
 605 |   if$
 606 | }
 607 | 
 608 | FUNCTION {empty.misc.check}
 609 | { author empty$ title empty$ howpublished empty$
 610 |   month empty$ year empty$ note empty$
 611 |   and and and and and
 612 |   key empty$ not and
 613 |     { "all relevant fields are empty in " cite$ * warning$ }
 614 |     'skip$
 615 |   if$
 616 | }
 617 | 
 618 | FUNCTION {format.thesis.type}
 619 | { type empty$
 620 |     'skip$
 621 |     { pop$
 622 |       type "t" change.case$
 623 |     }
 624 |   if$
 625 | }
 626 | 
 627 | FUNCTION {format.tr.number}
 628 | { type empty$
 629 |     { "Technical Report" }
 630 |     'type
 631 |   if$
 632 |   number empty$
 633 |     { "t" change.case$ }
 634 |     { number tie.or.space.connect }
 635 |   if$
 636 | }
 637 | 
 638 | FUNCTION {format.article.crossref}
 639 | { key empty$
 640 |     { journal empty$
 641 |         { "need key or journal for " cite$ * " to crossref " * crossref *
 642 |           warning$
 643 |           ""
 644 |         }
 645 |         { "In \emph{" journal * "}" * }
 646 |       if$
 647 |     }
 648 |     { "In " }
 649 |   if$
 650 |   " \citet{" * crossref * "}" *
 651 | }
 652 | 
 653 | FUNCTION {format.book.crossref}
 654 | { volume empty$
 655 |     { "empty volume in " cite$ * "'s crossref of " * crossref * warning$
 656 |       "In "
 657 |     }
 658 |     { "Volume" volume tie.or.space.connect
 659 |       " of " *
 660 |     }
 661 |   if$
 662 |   editor empty$
 663 |   editor field.or.null author field.or.null =
 664 |   or
 665 |     { key empty$
 666 |         { series empty$
 667 |             { "need editor, key, or series for " cite$ * " to crossref " *
 668 |               crossref * warning$
 669 |               "" *
 670 |             }
 671 |             { "\emph{" * series * "}" * }
 672 |           if$
 673 |         }
 674 |         'skip$
 675 |       if$
 676 |     }
 677 |     'skip$
 678 |   if$
 679 |   " \citet{" * crossref * "}" *
 680 | }
 681 | 
 682 | FUNCTION {format.incoll.inproc.crossref}
 683 | { editor empty$
 684 |   editor field.or.null author field.or.null =
 685 |   or
 686 |     { key empty$
 687 |         { booktitle empty$
 688 |             { "need editor, key, or booktitle for " cite$ * " to crossref " *
 689 |               crossref * warning$
 690 |               ""
 691 |             }
 692 |             { "In \emph{" booktitle * "}" * }
 693 |           if$
 694 |         }
 695 |         { "In " }
 696 |       if$
 697 |     }
 698 |     { "In " }
 699 |   if$
 700 |   " \citet{" * crossref * "}" *
 701 | }
 702 | 
 703 | FUNCTION {article}
 704 | { output.bibitem
 705 |   format.authors "author" output.check
 706 |   author format.key output
 707 |   new.block
 708 |   format.title "title" output.check
 709 |   new.block
 710 |   crossref missing$
 711 |     { journal emphasize "journal" output.check
 712 |       eid empty$
 713 |         { format.vol.num.pages output }
 714 |         { format.vol.num.eid output }
 715 |       if$
 716 |       format.date "year" output.check
 717 |     }
 718 |     { format.article.crossref output.nonnull
 719 |       eid empty$
 720 |         { format.pages output }
 721 |         { format.eid output }
 722 |       if$
 723 |     }
 724 |   if$
 725 |   format.issn output
 726 |   format.doi output
 727 |   format.url output
 728 |   new.block
 729 |   note output
 730 |   fin.entry
 731 | }
 732 | 
 733 | FUNCTION {book}
 734 | { output.bibitem
 735 |   author empty$
 736 |     { format.editors "author and editor" output.check
 737 |       editor format.key output
 738 |     }
 739 |     { format.authors output.nonnull
 740 |       crossref missing$
 741 |         { "author and editor" editor either.or.check }
 742 |         'skip$
 743 |       if$
 744 |     }
 745 |   if$
 746 |   new.block
 747 |   format.btitle "title" output.check
 748 |   crossref missing$
 749 |     { format.bvolume output
 750 |       new.block
 751 |       format.number.series output
 752 |       new.sentence
 753 |       publisher "publisher" output.check
 754 |       address output
 755 |     }
 756 |     { new.block
 757 |       format.book.crossref output.nonnull
 758 |     }
 759 |   if$
 760 |   format.edition output
 761 |   format.date "year" output.check
 762 |   format.isbn output
 763 |   format.doi output
 764 |   format.url output
 765 |   new.block
 766 |   note output
 767 |   fin.entry
 768 | }
 769 | 
 770 | FUNCTION {booklet}
 771 | { output.bibitem
 772 |   format.authors output
 773 |   author format.key output
 774 |   new.block
 775 |   format.title "title" output.check
 776 |   howpublished address new.block.checkb
 777 |   howpublished output
 778 |   address output
 779 |   format.date output
 780 |   format.isbn output
 781 |   format.doi output
 782 |   format.url output
 783 |   new.block
 784 |   note output
 785 |   fin.entry
 786 | }
 787 | 
 788 | FUNCTION {inbook}
 789 | { output.bibitem
 790 |   author empty$
 791 |     { format.editors "author and editor" output.check
 792 |       editor format.key output
 793 |     }
 794 |     { format.authors output.nonnull
 795 |       crossref missing$
 796 |         { "author and editor" editor either.or.check }
 797 |         'skip$
 798 |       if$
 799 |     }
 800 |   if$
 801 |   new.block
 802 |   format.btitle "title" output.check
 803 |   crossref missing$
 804 |     { format.bvolume output
 805 |       format.chapter.pages "chapter and pages" output.check
 806 |       new.block
 807 |       format.number.series output
 808 |       new.sentence
 809 |       publisher "publisher" output.check
 810 |       address output
 811 |     }
 812 |     { format.chapter.pages "chapter and pages" output.check
 813 |       new.block
 814 |       format.book.crossref output.nonnull
 815 |     }
 816 |   if$
 817 |   format.edition output
 818 |   format.date "year" output.check
 819 |   format.isbn output
 820 |   format.doi output
 821 |   format.url output
 822 |   new.block
 823 |   note output
 824 |   fin.entry
 825 | }
 826 | 
 827 | FUNCTION {incollection}
 828 | { output.bibitem
 829 |   format.authors "author" output.check
 830 |   author format.key output
 831 |   new.block
 832 |   format.title "title" output.check
 833 |   new.block
 834 |   crossref missing$
 835 |     { format.in.ed.booktitle "booktitle" output.check
 836 |       format.bvolume output
 837 |       format.number.series output
 838 |       format.chapter.pages output
 839 |       new.sentence
 840 |       publisher "publisher" output.check
 841 |       address output
 842 |       format.edition output
 843 |       format.date "year" output.check
 844 |     }
 845 |     { format.incoll.inproc.crossref output.nonnull
 846 |       format.chapter.pages output
 847 |     }
 848 |   if$
 849 |   format.isbn output
 850 |   format.doi output
 851 |   format.url output
 852 |   new.block
 853 |   note output
 854 |   fin.entry
 855 | }
 856 | 
 857 | FUNCTION {inproceedings}
 858 | { output.bibitem
 859 |   format.authors "author" output.check
 860 |   author format.key output
 861 |   new.block
 862 |   format.title "title" output.check
 863 |   new.block
 864 |   crossref missing$
 865 |     { format.in.ed.booktitle "booktitle" output.check
 866 |       format.bvolume output
 867 |       format.number.series output
 868 |       format.pages output
 869 |       address empty$
 870 |         { organization publisher new.sentence.checkb
 871 |           organization output
 872 |           publisher output
 873 |           format.date "year" output.check
 874 |         }
 875 |         { address output.nonnull
 876 |           format.date "year" output.check
 877 |           new.sentence
 878 |           organization output
 879 |           publisher output
 880 |         }
 881 |       if$
 882 |     }
 883 |     { format.incoll.inproc.crossref output.nonnull
 884 |       format.pages output
 885 |     }
 886 |   if$
 887 |   format.isbn output
 888 |   format.doi output
 889 |   format.url output
 890 |   new.block
 891 |   note output
 892 |   fin.entry
 893 | }
 894 | 
 895 | FUNCTION {conference} { inproceedings }
 896 | 
 897 | FUNCTION {manual}
 898 | { output.bibitem
 899 |   format.authors output
 900 |   author format.key output
 901 |   new.block
 902 |   format.btitle "title" output.check
 903 |   organization address new.block.checkb
 904 |   organization output
 905 |   address output
 906 |   format.edition output
 907 |   format.date output
 908 |   format.url output
 909 |   new.block
 910 |   note output
 911 |   fin.entry
 912 | }
 913 | 
 914 | FUNCTION {mastersthesis}
 915 | { output.bibitem
 916 |   format.authors "author" output.check
 917 |   author format.key output
 918 |   new.block
 919 |   format.title "title" output.check
 920 |   new.block
 921 |   "Master's thesis" format.thesis.type output.nonnull
 922 |   school "school" output.check
 923 |   address output
 924 |   format.date "year" output.check
 925 |   format.url output
 926 |   new.block
 927 |   note output
 928 |   fin.entry
 929 | }
 930 | 
 931 | FUNCTION {misc}
 932 | { output.bibitem
 933 |   format.authors output
 934 |   author format.key output
 935 |   title howpublished new.block.checkb
 936 |   format.title output
 937 |   howpublished new.block.checka
 938 |   howpublished output
 939 |   format.date output
 940 |   format.issn output
 941 |   format.url output
 942 |   new.block
 943 |   note output
 944 |   fin.entry
 945 |   empty.misc.check
 946 | }
 947 | 
 948 | FUNCTION {phdthesis}
 949 | { output.bibitem
 950 |   format.authors "author" output.check
 951 |   author format.key output
 952 |   new.block
 953 |   format.btitle "title" output.check
 954 |   new.block
 955 |   "PhD thesis" format.thesis.type output.nonnull
 956 |   school "school" output.check
 957 |   address output
 958 |   format.date "year" output.check
 959 |   format.url output
 960 |   new.block
 961 |   note output
 962 |   fin.entry
 963 | }
 964 | 
 965 | FUNCTION {proceedings}
 966 | { output.bibitem
 967 |   format.editors output
 968 |   editor format.key output
 969 |   new.block
 970 |   format.btitle "title" output.check
 971 |   format.bvolume output
 972 |   format.number.series output
 973 |   address output
 974 |   format.date "year" output.check
 975 |   new.sentence
 976 |   organization output
 977 |   publisher output
 978 |   format.isbn output
 979 |   format.doi output
 980 |   format.url output
 981 |   new.block
 982 |   note output
 983 |   fin.entry
 984 | }
 985 | 
 986 | FUNCTION {techreport}
 987 | { output.bibitem
 988 |   format.authors "author" output.check
 989 |   author format.key output
 990 |   new.block
 991 |   format.title "title" output.check
 992 |   new.block
 993 |   format.tr.number output.nonnull
 994 |   institution "institution" output.check
 995 |   address output
 996 |   format.date "year" output.check
 997 |   format.url output
 998 |   new.block
 999 |   note output
1000 |   fin.entry
1001 | }
1002 | 
1003 | FUNCTION {unpublished}
1004 | { output.bibitem
1005 |   format.authors "author" output.check
1006 |   author format.key output
1007 |   new.block
1008 |   format.title "title" output.check
1009 |   new.block
1010 |   note "note" output.check
1011 |   format.date output
1012 |   format.url output
1013 |   fin.entry
1014 | }
1015 | 
1016 | FUNCTION {default.type} { misc }
1017 | 
1018 | 
1019 | MACRO {jan} {"January"}
1020 | 
1021 | MACRO {feb} {"February"}
1022 | 
1023 | MACRO {mar} {"March"}
1024 | 
1025 | MACRO {apr} {"April"}
1026 | 
1027 | MACRO {may} {"May"}
1028 | 
1029 | MACRO {jun} {"June"}
1030 | 
1031 | MACRO {jul} {"July"}
1032 | 
1033 | MACRO {aug} {"August"}
1034 | 
1035 | MACRO {sep} {"September"}
1036 | 
1037 | MACRO {oct} {"October"}
1038 | 
1039 | MACRO {nov} {"November"}
1040 | 
1041 | MACRO {dec} {"December"}
1042 | 
1043 | 
1044 | 
1045 | MACRO {acmcs} {"ACM Computing Surveys"}
1046 | 
1047 | MACRO {acta} {"Acta Informatica"}
1048 | 
1049 | MACRO {cacm} {"Communications of the ACM"}
1050 | 
1051 | MACRO {ibmjrd} {"IBM Journal of Research and Development"}
1052 | 
1053 | MACRO {ibmsj} {"IBM Systems Journal"}
1054 | 
1055 | MACRO {ieeese} {"IEEE Transactions on Software Engineering"}
1056 | 
1057 | MACRO {ieeetc} {"IEEE Transactions on Computers"}
1058 | 
1059 | MACRO {ieeetcad}
1060 |  {"IEEE Transactions on Computer-Aided Design of Integrated Circuits"}
1061 | 
1062 | MACRO {ipl} {"Information Processing Letters"}
1063 | 
1064 | MACRO {jacm} {"Journal of the ACM"}
1065 | 
1066 | MACRO {jcss} {"Journal of Computer and System Sciences"}
1067 | 
1068 | MACRO {scp} {"Science of Computer Programming"}
1069 | 
1070 | MACRO {sicomp} {"SIAM Journal on Computing"}
1071 | 
1072 | MACRO {tocs} {"ACM Transactions on Computer Systems"}
1073 | 
1074 | MACRO {tods} {"ACM Transactions on Database Systems"}
1075 | 
1076 | MACRO {tog} {"ACM Transactions on Graphics"}
1077 | 
1078 | MACRO {toms} {"ACM Transactions on Mathematical Software"}
1079 | 
1080 | MACRO {toois} {"ACM Transactions on Office Information Systems"}
1081 | 
1082 | MACRO {toplas} {"ACM Transactions on Programming Languages and Systems"}
1083 | 
1084 | MACRO {tcs} {"Theoretical Computer Science"}
1085 | 
1086 | 
1087 | READ
1088 | 
1089 | FUNCTION {sortify}
1090 | { purify$
1091 |   "l" change.case$
1092 | }
1093 | 
1094 | INTEGERS { len }
1095 | 
1096 | FUNCTION {chop.word}
1097 | { 's :=
1098 |   'len :=
1099 |   s #1 len substring$ =
1100 |     { s len #1 + global.max$ substring$ }
1101 |     's
1102 |   if$
1103 | }
1104 | 
1105 | FUNCTION {format.lab.names}
1106 | { 's :=
1107 |   s #1 "{vv~}{ll}" format.name$
1108 |   s num.names$ duplicate$
1109 |   #2 >
1110 |     { pop$ " et~al." * }
1111 |     { #2 <
1112 |         'skip$
1113 |         { s #2 "{ff }{vv }{ll}{ jj}" format.name$ "others" =
1114 |             { " et~al." * }
1115 |             { " \& " * s #2 "{vv~}{ll}" format.name$ * }
1116 |           if$
1117 |         }
1118 |       if$
1119 |     }
1120 |   if$
1121 | }
1122 | 
1123 | FUNCTION {author.key.label}
1124 | { author empty$
1125 |     { key empty$
1126 |         { cite$ #1 #3 substring$ }
1127 |         'key
1128 |       if$
1129 |     }
1130 |     { author format.lab.names }
1131 |   if$
1132 | }
1133 | 
1134 | FUNCTION {author.editor.key.label}
1135 | { author empty$
1136 |     { editor empty$
1137 |         { key empty$
1138 |             { cite$ #1 #3 substring$ }
1139 |             'key
1140 |           if$
1141 |         }
1142 |         { editor format.lab.names }
1143 |       if$
1144 |     }
1145 |     { author format.lab.names }
1146 |   if$
1147 | }
1148 | 
1149 | FUNCTION {author.key.organization.label}
1150 | { author empty$
1151 |     { key empty$
1152 |         { organization empty$
1153 |             { cite$ #1 #3 substring$ }
1154 |             { "The " #4 organization chop.word #3 text.prefix$ }
1155 |           if$
1156 |         }
1157 |         'key
1158 |       if$
1159 |     }
1160 |     { author format.lab.names }
1161 |   if$
1162 | }
1163 | 
1164 | FUNCTION {editor.key.organization.label}
1165 | { editor empty$
1166 |     { key empty$
1167 |         { organization empty$
1168 |             { cite$ #1 #3 substring$ }
1169 |             { "The " #4 organization chop.word #3 text.prefix$ }
1170 |           if$
1171 |         }
1172 |         'key
1173 |       if$
1174 |     }
1175 |     { editor format.lab.names }
1176 |   if$
1177 | }
1178 | 
1179 | FUNCTION {calc.short.authors}
1180 | { type$ "book" =
1181 |   type$ "inbook" =
1182 |   or
1183 |     'author.editor.key.label
1184 |     { type$ "proceedings" =
1185 |         'editor.key.organization.label
1186 |         { type$ "manual" =
1187 |             'author.key.organization.label
1188 |             'author.key.label
1189 |           if$
1190 |         }
1191 |       if$
1192 |     }
1193 |   if$
1194 |   'short.list :=
1195 | }
1196 | 
1197 | FUNCTION {calc.label}
1198 | { calc.short.authors
1199 |   short.list
1200 |   "("
1201 |   *
1202 |   year duplicate$ empty$
1203 |   short.list key field.or.null = or
1204 |      { pop$ "" }
1205 |      'skip$
1206 |   if$
1207 |   *
1208 |   'label :=
1209 | }
1210 | 
1211 | FUNCTION {sort.format.names}
1212 | { 's :=
1213 |   #1 'nameptr :=
1214 |   ""
1215 |   s num.names$ 'numnames :=
1216 |   numnames 'namesleft :=
1217 |     { namesleft #0 > }
1218 |     {
1219 |       s nameptr "{vv{ } }{ll{ }}{  ff{ }}{  jj{ }}" format.name$ 't :=
1220 |       nameptr #1 >
1221 |         {
1222 |           "   "  *
1223 |           namesleft #1 = t "others" = and
1224 |             { "zzzzz" * }
1225 |             { numnames #2 > nameptr #2 = and
1226 |                 { "zz" * year field.or.null * "   " * }
1227 |                 'skip$
1228 |               if$
1229 |               t sortify *
1230 |             }
1231 |           if$
1232 |         }
1233 |         { t sortify * }
1234 |       if$
1235 |       nameptr #1 + 'nameptr :=
1236 |       namesleft #1 - 'namesleft :=
1237 |     }
1238 |   while$
1239 | }
1240 | 
1241 | FUNCTION {sort.format.title}
1242 | { 't :=
1243 |   "A " #2
1244 |     "An " #3
1245 |       "The " #4 t chop.word
1246 |     chop.word
1247 |   chop.word
1248 |   sortify
1249 |   #1 global.max$ substring$
1250 | }
1251 | 
1252 | FUNCTION {author.sort}
1253 | { author empty$
1254 |     { key empty$
1255 |         { "to sort, need author or key in " cite$ * warning$
1256 |           ""
1257 |         }
1258 |         { key sortify }
1259 |       if$
1260 |     }
1261 |     { author sort.format.names }
1262 |   if$
1263 | }
1264 | 
1265 | FUNCTION {author.editor.sort}
1266 | { author empty$
1267 |     { editor empty$
1268 |         { key empty$
1269 |             { "to sort, need author, editor, or key in " cite$ * warning$
1270 |               ""
1271 |             }
1272 |             { key sortify }
1273 |           if$
1274 |         }
1275 |         { editor sort.format.names }
1276 |       if$
1277 |     }
1278 |     { author sort.format.names }
1279 |   if$
1280 | }
1281 | 
1282 | FUNCTION {author.organization.sort}
1283 | { author empty$
1284 |     { organization empty$
1285 |         { key empty$
1286 |             { "to sort, need author, organization, or key in " cite$ * warning$
1287 |               ""
1288 |             }
1289 |             { key sortify }
1290 |           if$
1291 |         }
1292 |         { "The " #4 organization chop.word sortify }
1293 |       if$
1294 |     }
1295 |     { author sort.format.names }
1296 |   if$
1297 | }
1298 | 
1299 | FUNCTION {editor.organization.sort}
1300 | { editor empty$
1301 |     { organization empty$
1302 |         { key empty$
1303 |             { "to sort, need editor, organization, or key in " cite$ * warning$
1304 |               ""
1305 |             }
1306 |             { key sortify }
1307 |           if$
1308 |         }
1309 |         { "The " #4 organization chop.word sortify }
1310 |       if$
1311 |     }
1312 |     { editor sort.format.names }
1313 |   if$
1314 | }
1315 | 
1316 | 
1317 | FUNCTION {presort}
1318 | { calc.label
1319 |   label sortify
1320 |   "    "
1321 |   *
1322 |   type$ "book" =
1323 |   type$ "inbook" =
1324 |   or
1325 |     'author.editor.sort
1326 |     { type$ "proceedings" =
1327 |         'editor.organization.sort
1328 |         { type$ "manual" =
1329 |             'author.organization.sort
1330 |             'author.sort
1331 |           if$
1332 |         }
1333 |       if$
1334 |     }
1335 |   if$
1336 |   "    "
1337 |   *
1338 |   year field.or.null sortify
1339 |   *
1340 |   "    "
1341 |   *
1342 |   cite$
1343 |   *
1344 |   #1 entry.max$ substring$
1345 |   'sort.label :=
1346 |   sort.label *
1347 |   #1 entry.max$ substring$
1348 |   'sort.key$ :=
1349 | }
1350 | 
1351 | ITERATE {presort}
1352 | 
1353 | SORT
1354 | 
1355 | STRINGS { longest.label last.label next.extra }
1356 | 
1357 | INTEGERS { longest.label.width last.extra.num number.label }
1358 | 
1359 | FUNCTION {initialize.longest.label}
1360 | { "" 'longest.label :=
1361 |   #0 int.to.chr$ 'last.label :=
1362 |   "" 'next.extra :=
1363 |   #0 'longest.label.width :=
1364 |   #0 'last.extra.num :=
1365 |   #0 'number.label :=
1366 | }
1367 | 
1368 | FUNCTION {forward.pass}
1369 | { last.label label =
1370 |     { last.extra.num #1 + 'last.extra.num :=
1371 |       last.extra.num int.to.chr$ 'extra.label :=
1372 |     }
1373 |     { "a" chr.to.int$ 'last.extra.num :=
1374 |       "" 'extra.label :=
1375 |       label 'last.label :=
1376 |     }
1377 |   if$
1378 |   number.label #1 + 'number.label :=
1379 | }
1380 | 
1381 | FUNCTION {reverse.pass}
1382 | { next.extra "b" =
1383 |     { "a" 'extra.label := }
1384 |     'skip$
1385 |   if$
1386 |   extra.label 'next.extra :=
1387 |   extra.label
1388 |   duplicate$ empty$
1389 |     'skip$
1390 |     { "{\natexlab{" swap$ * "}}" * }
1391 |   if$
1392 |   'extra.label :=
1393 |   label extra.label * 'label :=
1394 | }
1395 | 
1396 | EXECUTE {initialize.longest.label}
1397 | 
1398 | ITERATE {forward.pass}
1399 | 
1400 | REVERSE {reverse.pass}
1401 | 
1402 | FUNCTION {bib.sort.order}
1403 | { sort.label  'sort.key$ :=
1404 | }
1405 | 
1406 | ITERATE {bib.sort.order}
1407 | 
1408 | SORT
1409 | 
1410 | FUNCTION {begin.bib}
1411 | {   preamble$ empty$
1412 |     'skip$
1413 |     { preamble$ write$ newline$ }
1414 |   if$
1415 |   "\begin{thebibliography}{" number.label int.to.str$ * "}" *
1416 |   write$ newline$
1417 |   "\providecommand{\natexlab}[1]{#1}"
1418 |   write$ newline$
1419 |   "\providecommand{\url}[1]{\texttt{#1}}"
1420 |   write$ newline$
1421 |   "\expandafter\ifx\csname urlstyle\endcsname\relax"
1422 |   write$ newline$
1423 |   "  \providecommand{\doi}[1]{doi: #1}\else"
1424 |   write$ newline$
1425 |   "  \providecommand{\doi}{doi: \begingroup \urlstyle{rm}\Url}\fi"
1426 |   write$ newline$
1427 | }
1428 | 
1429 | EXECUTE {begin.bib}
1430 | 
1431 | EXECUTE {init.state.consts}
1432 | 
1433 | ITERATE {call.type$}
1434 | 
1435 | FUNCTION {end.bib}
1436 | { newline$
1437 |   "\end{thebibliography}" write$ newline$
1438 | }
1439 | 
1440 | EXECUTE {end.bib}
1441 | 


--------------------------------------------------------------------------------
/iclr2019_conference.sty:
--------------------------------------------------------------------------------
  1 | %%%% ICLR Macros (LaTex)
  2 | %%%% Adapted by Hugo Larochelle from the NIPS stylefile Macros
  3 | %%%% Style File
  4 | %%%% Dec 12, 1990   Rev Aug 14, 1991; Sept, 1995; April, 1997; April, 1999; October 2014
  5 | 
  6 | % This file can be used with Latex2e whether running in main mode, or
  7 | % 2.09 compatibility mode.
  8 | %
  9 | % If using main mode, you need to include the commands
 10 | %             \documentclass{article}
 11 | %             \usepackage{iclr14submit_e,times}
 12 | %
 13 | 
 14 | % Change the overall width of the page.  If these parameters are
 15 | %       changed, they will require corresponding changes in the
 16 | %       maketitle section.
 17 | %
 18 | \usepackage{eso-pic} % used by \AddToShipoutPicture
 19 | \RequirePackage{fancyhdr}
 20 | \RequirePackage{natbib}
 21 | 
 22 | % modification to natbib citations
 23 | \setcitestyle{authoryear,round,citesep={;},aysep={,},yysep={;}}
 24 | 
 25 | \renewcommand{\topfraction}{0.95}   % let figure take up nearly whole page
 26 | \renewcommand{\textfraction}{0.05}  % let figure take up nearly whole page
 27 | 
 28 | % Define iclrfinal, set to true if iclrfinalcopy is defined
 29 | \newif\ificlrfinal
 30 | \iclrfinalfalse
 31 | \def\iclrfinalcopy{\iclrfinaltrue}
 32 | \font\iclrtenhv  = phvb at 8pt
 33 | 
 34 | % Specify the dimensions of each page
 35 | 
 36 | \setlength{\paperheight}{11in}
 37 | \setlength{\paperwidth}{8.5in}
 38 | 
 39 | 
 40 | \oddsidemargin .5in    %   Note \oddsidemargin = \evensidemargin
 41 | \evensidemargin .5in
 42 | \marginparwidth 0.07 true in
 43 | %\marginparwidth 0.75 true in
 44 | %\topmargin 0 true pt           % Nominal distance from top of page to top of
 45 | %\topmargin 0.125in
 46 | \topmargin -0.625in
 47 | \addtolength{\headsep}{0.25in}
 48 | \textheight 9.0 true in       % Height of text (including footnotes & figures)
 49 | \textwidth 5.5 true in        % Width of text line.
 50 | \widowpenalty=10000
 51 | \clubpenalty=10000
 52 | 
 53 | % \thispagestyle{empty}        \pagestyle{empty}
 54 | \flushbottom \sloppy
 55 | 
 56 | % We're never going to need a table of contents, so just flush it to
 57 | % save space --- suggested by drstrip@sandia-2
 58 | \def\addcontentsline#1#2#3{}
 59 | 
 60 | % Title stuff, taken from deproc.
 61 | \def\maketitle{\par
 62 | \begingroup
 63 |    \def\thefootnote{\fnsymbol{footnote}}
 64 |    \def\@makefnmark{\hbox to 0pt{$^{\@thefnmark}$\hss}} % for perfect author
 65 |                                                         % name centering
 66 | %   The footnote-mark was overlapping the footnote-text,
 67 | %   added the following to fix this problem               (MK)
 68 |    \long\def\@makefntext##1{\parindent 1em\noindent
 69 |                             \hbox to1.8em{\hss $\m@th ^{\@thefnmark}$}##1}
 70 |    \@maketitle \@thanks
 71 | \endgroup
 72 | \setcounter{footnote}{0}
 73 | \let\maketitle\relax \let\@maketitle\relax
 74 | \gdef\@thanks{}\gdef\@author{}\gdef\@title{}\let\thanks\relax}
 75 | 
 76 | % The toptitlebar has been raised to top-justify the first page
 77 | 
 78 | \usepackage{fancyhdr}
 79 | \pagestyle{fancy}
 80 | \fancyhead{}
 81 | 
 82 | % Title (includes both anonimized and non-anonimized versions)
 83 | \def\@maketitle{\vbox{\hsize\textwidth
 84 | %\linewidth\hsize \vskip 0.1in \toptitlebar \centering
 85 | {\LARGE\sc \@title\par}
 86 | %\bottomtitlebar % \vskip 0.1in %  minus
 87 | \ificlrfinal
 88 |     %\lhead{Published as a conference paper at ICLR 2019}
 89 |     \def\And{\end{tabular}\hfil\linebreak[0]\hfil
 90 |             \begin{tabular}[t]{l}\bf\rule{\z@}{24pt}\ignorespaces}%
 91 |   \def\AND{\end{tabular}\hfil\linebreak[4]\hfil
 92 |             \begin{tabular}[t]{l}\bf\rule{\z@}{24pt}\ignorespaces}%
 93 |     \begin{tabular}[t]{l}\bf\rule{\z@}{24pt}\@author\end{tabular}%
 94 | \else
 95 |        \lhead{Under review as a conference paper at ICLR 2019}
 96 |    \def\And{\end{tabular}\hfil\linebreak[0]\hfil
 97 |             \begin{tabular}[t]{l}\bf\rule{\z@}{24pt}\ignorespaces}%
 98 |   \def\AND{\end{tabular}\hfil\linebreak[4]\hfil
 99 |             \begin{tabular}[t]{l}\bf\rule{\z@}{24pt}\ignorespaces}%
100 |     \begin{tabular}[t]{l}\bf\rule{\z@}{24pt}Anonymous authors\\Paper under double-blind review\end{tabular}%
101 | \fi
102 | \vskip 0.3in minus 0.1in}}
103 | 
104 | \renewenvironment{abstract}{\vskip.075in\centerline{\large\sc
105 | Abstract}\vspace{0.5ex}\begin{quote}}{\par\end{quote}\vskip 1ex}
106 | 
107 | % sections with less space
108 | \def\section{\@startsection {section}{1}{\z@}{-2.0ex plus
109 |     -0.5ex minus -.2ex}{1.5ex plus 0.3ex
110 | minus0.2ex}{\large\sc\raggedright}}
111 | 
112 | \def\subsection{\@startsection{subsection}{2}{\z@}{-1.8ex plus
113 | -0.5ex minus -.2ex}{0.8ex plus .2ex}{\normalsize\sc\raggedright}}
114 | \def\subsubsection{\@startsection{subsubsection}{3}{\z@}{-1.5ex
115 | plus      -0.5ex minus -.2ex}{0.5ex plus
116 | .2ex}{\normalsize\sc\raggedright}}
117 | \def\paragraph{\@startsection{paragraph}{4}{\z@}{1.5ex plus
118 | 0.5ex minus .2ex}{-1em}{\normalsize\bf}}
119 | \def\subparagraph{\@startsection{subparagraph}{5}{\z@}{1.5ex plus
120 |   0.5ex minus .2ex}{-1em}{\normalsize\sc}}
121 | \def\subsubsubsection{\vskip
122 | 5pt{\noindent\normalsize\rm\raggedright}}
123 | 
124 | 
125 | % Footnotes
126 | \footnotesep 6.65pt %
127 | \skip\footins 9pt plus 4pt minus 2pt
128 | \def\footnoterule{\kern-3pt \hrule width 12pc \kern 2.6pt }
129 | \setcounter{footnote}{0}
130 | 
131 | % Lists and paragraphs
132 | \parindent 0pt
133 | \topsep 4pt plus 1pt minus 2pt
134 | \partopsep 1pt plus 0.5pt minus 0.5pt
135 | \itemsep 2pt plus 1pt minus 0.5pt
136 | \parsep 2pt plus 1pt minus 0.5pt
137 | \parskip .5pc
138 | 
139 | 
140 | %\leftmargin2em
141 | \leftmargin3pc
142 | \leftmargini\leftmargin \leftmarginii 2em
143 | \leftmarginiii 1.5em \leftmarginiv 1.0em \leftmarginv .5em
144 | 
145 | %\labelsep \labelsep 5pt
146 | 
147 | \def\@listi{\leftmargin\leftmargini}
148 | \def\@listii{\leftmargin\leftmarginii
149 |    \labelwidth\leftmarginii\advance\labelwidth-\labelsep
150 |    \topsep 2pt plus 1pt minus 0.5pt
151 |    \parsep 1pt plus 0.5pt minus 0.5pt
152 |    \itemsep \parsep}
153 | \def\@listiii{\leftmargin\leftmarginiii
154 |     \labelwidth\leftmarginiii\advance\labelwidth-\labelsep
155 |     \topsep 1pt plus 0.5pt minus 0.5pt
156 |     \parsep \z@ \partopsep 0.5pt plus 0pt minus 0.5pt
157 |     \itemsep \topsep}
158 | \def\@listiv{\leftmargin\leftmarginiv
159 |      \labelwidth\leftmarginiv\advance\labelwidth-\labelsep}
160 | \def\@listv{\leftmargin\leftmarginv
161 |      \labelwidth\leftmarginv\advance\labelwidth-\labelsep}
162 | \def\@listvi{\leftmargin\leftmarginvi
163 |      \labelwidth\leftmarginvi\advance\labelwidth-\labelsep}
164 | 
165 | \abovedisplayskip 7pt plus2pt minus5pt%
166 | \belowdisplayskip \abovedisplayskip
167 | \abovedisplayshortskip  0pt plus3pt%
168 | \belowdisplayshortskip  4pt plus3pt minus3pt%
169 | 
170 | % Less leading in most fonts (due to the narrow columns)
171 | % The choices were between 1-pt and 1.5-pt leading
172 | %\def\@normalsize{\@setsize\normalsize{11pt}\xpt\@xpt} % got rid of @ (MK)
173 | \def\normalsize{\@setsize\normalsize{11pt}\xpt\@xpt}
174 | \def\small{\@setsize\small{10pt}\ixpt\@ixpt}
175 | \def\footnotesize{\@setsize\footnotesize{10pt}\ixpt\@ixpt}
176 | \def\scriptsize{\@setsize\scriptsize{8pt}\viipt\@viipt}
177 | \def\tiny{\@setsize\tiny{7pt}\vipt\@vipt}
178 | \def\large{\@setsize\large{14pt}\xiipt\@xiipt}
179 | \def\Large{\@setsize\Large{16pt}\xivpt\@xivpt}
180 | \def\LARGE{\@setsize\LARGE{20pt}\xviipt\@xviipt}
181 | \def\huge{\@setsize\huge{23pt}\xxpt\@xxpt}
182 | \def\Huge{\@setsize\Huge{28pt}\xxvpt\@xxvpt}
183 | 
184 | \def\toptitlebar{\hrule height4pt\vskip .25in\vskip-\parskip}
185 | 
186 | \def\bottomtitlebar{\vskip .29in\vskip-\parskip\hrule height1pt\vskip
187 | .09in} %
188 | %Reduced second vskip to compensate for adding the strut in \@author
189 | 
190 | 
191 | %% % Vertical Ruler
192 | %% % This code is, largely, from the CVPR 2010 conference style file
193 | %% % ----- define vruler
194 | %% \makeatletter
195 | %% \newbox\iclrrulerbox
196 | %% \newcount\iclrrulercount
197 | %% \newdimen\iclrruleroffset
198 | %% \newdimen\cv@lineheight
199 | %% \newdimen\cv@boxheight
200 | %% \newbox\cv@tmpbox
201 | %% \newcount\cv@refno
202 | %% \newcount\cv@tot
203 | %% % NUMBER with left flushed zeros  \fillzeros[<WIDTH>]<NUMBER>
204 | %% \newcount\cv@tmpc@ \newcount\cv@tmpc
205 | %% \def\fillzeros[#1]#2{\cv@tmpc@=#2\relax\ifnum\cv@tmpc@<0\cv@tmpc@=-\cv@tmpc@\fi
206 | %% \cv@tmpc=1 %
207 | %% \loop\ifnum\cv@tmpc@<10 \else \divide\cv@tmpc@ by 10 \advance\cv@tmpc by 1 \fi
208 | %%    \ifnum\cv@tmpc@=10\relax\cv@tmpc@=11\relax\fi \ifnum\cv@tmpc@>10 \repeat
209 | %% \ifnum#2<0\advance\cv@tmpc1\relax-\fi
210 | %% \loop\ifnum\cv@tmpc<#1\relax0\advance\cv@tmpc1\relax\fi \ifnum\cv@tmpc<#1 \repeat
211 | %% \cv@tmpc@=#2\relax\ifnum\cv@tmpc@<0\cv@tmpc@=-\cv@tmpc@\fi \relax\the\cv@tmpc@}%
212 | %% % \makevruler[<SCALE>][<INITIAL_COUNT>][<STEP>][<DIGITS>][<HEIGHT>]
213 | %% \def\makevruler[#1][#2][#3][#4][#5]{\begingroup\offinterlineskip
214 | %% \textheight=#5\vbadness=10000\vfuzz=120ex\overfullrule=0pt%
215 | %% \global\setbox\iclrrulerbox=\vbox to \textheight{%
216 | %% {\parskip=0pt\hfuzz=150em\cv@boxheight=\textheight
217 | %% \cv@lineheight=#1\global\iclrrulercount=#2%
218 | %% \cv@tot\cv@boxheight\divide\cv@tot\cv@lineheight\advance\cv@tot2%
219 | %% \cv@refno1\vskip-\cv@lineheight\vskip1ex%
220 | %% \loop\setbox\cv@tmpbox=\hbox to0cm{{\iclrtenhv\hfil\fillzeros[#4]\iclrrulercount}}%
221 | %% \ht\cv@tmpbox\cv@lineheight\dp\cv@tmpbox0pt\box\cv@tmpbox\break
222 | %% \advance\cv@refno1\global\advance\iclrrulercount#3\relax
223 | %% \ifnum\cv@refno<\cv@tot\repeat}}\endgroup}%
224 | %% \makeatother
225 | %% % ----- end of vruler
226 | 
227 | %% % \makevruler[<SCALE>][<INITIAL_COUNT>][<STEP>][<DIGITS>][<HEIGHT>]
228 | %% \def\iclrruler#1{\makevruler[12pt][#1][1][3][0.993\textheight]\usebox{\iclrrulerbox}}
229 | %% \AddToShipoutPicture{%
230 | %% \ificlrfinal\else
231 | %% \iclrruleroffset=\textheight
232 | %% \advance\iclrruleroffset by -3.7pt
233 | %%   \color[rgb]{.7,.7,.7}
234 | %%   \AtTextUpperLeft{%
235 | %%     \put(\LenToUnit{-35pt},\LenToUnit{-\iclrruleroffset}){%left ruler
236 | %%       \iclrruler{\iclrrulercount}}
237 | %%   }
238 | %% \fi
239 | %% }
240 | %%% To add a vertical bar on the side
241 | %\AddToShipoutPicture{
242 | %\AtTextLowerLeft{
243 | %\hspace*{-1.8cm}
244 | %\colorbox[rgb]{0.7,0.7,0.7}{\small \parbox[b][\textheight]{0.1cm}{}}}
245 | %}
246 | 


--------------------------------------------------------------------------------
/paper.bib:
--------------------------------------------------------------------------------
  1 | @inproceedings{dalvi2004adversarial,
  2 |   title={Adversarial classification},
  3 |   author={Dalvi, Nilesh and Domingos, Pedro and Sanghai, Sumit and Verma, Deepak and others},
  4 |   booktitle={Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining},
  5 |   pages={99--108},
  6 |   year={2004},
  7 |   organization={ACM}
  8 | }
  9 | 
 10 | @inproceedings{lowd2005adversarial,
 11 |   title={Adversarial learning},
 12 |   author={Lowd, Daniel and Meek, Christopher},
 13 |   booktitle={Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining},
 14 |   pages={641--647},
 15 |   year={2005},
 16 |   organization={ACM}
 17 | }
 18 | 
 19 | @inproceedings{barreno2006can,
 20 |   title={Can machine learning be secure?},
 21 |   author={Barreno, Marco and Nelson, Blaine and Sears, Russell and Joseph, Anthony D and Tygar, J Doug},
 22 |   booktitle={Proceedings of the 2006 ACM Symposium on Information, computer and communications security},
 23 |   pages={16--25},
 24 |   year={2006},
 25 |   organization={ACM}
 26 | }
 27 | 
 28 | @inproceedings{globerson2006nightmare,
 29 |   title={Nightmare at test time: robust learning by feature deletion},
 30 |   author={Globerson, Amir and Roweis, Sam},
 31 |   booktitle={Proceedings of the 23rd international conference on Machine learning},
 32 |   pages={353--360},
 33 |   year={2006},
 34 |   organization={ACM}
 35 | }
 36 | 
 37 | @inproceedings{kolcz2009feature,
 38 |   title={Feature weighting for improved classifier robustness},
 39 |   author={Ko{\l}cz, Aleksander and Teo, Choon Hui},
 40 |   booktitle={CEAS’09: sixth conference on email and anti-spam},
 41 |   year={2009}
 42 | }
 43 | 
 44 | @article{barreno2010security,
 45 |   title={The security of machine learning},
 46 |   author={Barreno, Marco and Nelson, Blaine and Joseph, Anthony D and Tygar, JD},
 47 |   journal={Machine Learning},
 48 |   volume={81},
 49 |   number={2},
 50 |   pages={121--148},
 51 |   year={2010},
 52 |   publisher={Springer}
 53 | }
 54 | 
 55 | @article{biggio2010multiple,
 56 |   title={Multiple classifier systems for robust classifier design in adversarial environments},
 57 |   author={Biggio, Battista and Fumera, Giorgio and Roli, Fabio},
 58 |   journal={International Journal of Machine Learning and Cybernetics},
 59 |   volume={1},
 60 |   number={1-4},
 61 |   pages={27--41},
 62 |   year={2010},
 63 |   publisher={Springer}
 64 | }
 65 | 
 66 | @inproceedings{vsrndic2013detection,
 67 |   title={Detection of malicious pdf files based on hierarchical document structure},
 68 |   author={{\v{S}}rndic, Nedim and Laskov, Pavel},
 69 |   booktitle={Proceedings of the 20th Annual Network \& Distributed System Security Symposium},
 70 |   pages={1--16},
 71 |   year={2013},
 72 |   organization={Citeseer}
 73 | }
 74 | 
 75 | @article{silver2016mastering,
 76 |   title={Mastering the game of Go with deep neural networks and tree search},
 77 |   author={Silver, David and Huang, Aja and Maddison, Chris J and Guez, Arthur and Sifre, Laurent and Van Den Driessche, George and Schrittwieser, Julian and Antonoglou, Ioannis and Panneershelvam, Veda and Lanctot, Marc and others},
 78 |   journal={nature},
 79 |   volume={529},
 80 |   number={7587},
 81 |   pages={484},
 82 |   year={2016},
 83 |   publisher={Nature Publishing Group}
 84 | }
 85 | 
 86 | @inproceedings{krizhevsky2012imagenet,
 87 |   title={Imagenet classification with deep convolutional neural networks},
 88 |   author={Krizhevsky, Alex and Sutskever, Ilya and Hinton, Geoffrey E},
 89 |   booktitle={Advances in neural information processing systems},
 90 |   pages={1097--1105},
 91 |   year={2012}
 92 | }
 93 | 
 94 | @article{carlini2016defensive,
 95 |   title={Defensive distillation is not robust to adversarial examples},
 96 |   author={Carlini, Nicholas and Wagner, David},
 97 |   journal={arXiv preprint arXiv:1607.04311},
 98 |   year={2016}
 99 | }
100 | 
101 | @inproceedings{carlini2017towards,
102 |   title={Towards evaluating the robustness of neural networks},
103 |   author={Carlini, Nicholas and Wagner, David},
104 |   booktitle={2017 IEEE Symposium on Security and Privacy (SP)},
105 |   pages={39--57},
106 |   year={2017},
107 |   organization={IEEE}
108 | }
109 | 
110 | @inproceedings{carlini2017adversarial,
111 |   title={Adversarial examples are not easily detected: Bypassing ten detection methods},
112 |   author={Carlini, Nicholas and Wagner, David},
113 |   booktitle={Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security},
114 |   pages={3--14},
115 |   year={2017},
116 |   organization={ACM}
117 | }
118 | 
119 | @article{athalye2018obfuscated,
120 |   title={Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples},
121 |   author={Athalye, Anish and Carlini, Nicholas and Wagner, David},
122 |   journal={arXiv preprint arXiv:1802.00420},
123 |   year={2018}
124 | }
125 | 
126 | @article{engstrom2018evaluating,
127 |   title={Evaluating and understanding the robustness of adversarial logit pairing},
128 |   author={Engstrom, Logan and Ilyas, Andrew and Athalye, Anish},
129 |   journal={arXiv preprint arXiv:1807.10272},
130 |   year={2018}
131 | }
132 | 
133 | @article{athalye2018robustness,
134 |   title={On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses},
135 |   author={Athalye, Anish and Carlini, Nicholas},
136 |   journal={arXiv preprint arXiv:1804.03286},
137 |   year={2018}
138 | }
139 | 
140 | @article{mosbach2018logit,
141 |   title={Logit Pairing Methods Can Fool Gradient-Based Attacks},
142 |   author={Mosbach, Marius and Andriushchenko, Maksym and Trost, Thomas and Hein, Matthias and Klakow, Dietrich},
143 |   journal={arXiv preprint arXiv:1810.12042},
144 |   year={2018}
145 | }
146 | 
147 | @article{brendel2017comment,
148 |   title={Comment on "Biologically inspired protection of deep networks from adversarial attacks"},
149 |   author={Brendel, Wieland and Bethge, Matthias},
150 |   journal={arXiv preprint arXiv:1704.01547},
151 |   year={2017}
152 | }
153 | 
154 | @article{he2018decision,
155 |   title={Decision boundary analysis of adversarial examples},
156 |   author={He, Warren and Li, Bo and Song, Dawn},
157 |   journal={International Conference on Learning Representations}, 
158 |   year={2018}
159 | }
160 | 
161 | @article{carlini2017magnet,
162 |   title={Magnet and "efficient defenses against adversarial attacks" are not robust to adversarial examples},
163 |   author={Carlini, Nicholas and Wagner, David},
164 |   journal={arXiv preprint arXiv:1711.08478},
165 |   year={2017}
166 | }
167 | 
168 | @article{sharma2017breaking,
169 |   title={Breaking the Madry Defense Model with $ L\_1 $-based Adversarial Examples},
170 |   author={Sharma, Yash and Chen, Pin-Yu},
171 |   journal={arXiv preprint arXiv:1710.10733},
172 |   year={2017}
173 | }
174 | 
175 | @article{sharma2018bypassing,
176 |   title={Bypassing Feature Squeezing by Increasing Adversary Strength},
177 |   author={Sharma, Yash and Chen, Pin-Yu},
178 |   journal={arXiv preprint arXiv:1803.09868},
179 |   year={2018}
180 | }
181 | 
182 | @inproceedings{lu2018limitation,
183 |   title={On the limitation of magnet defense against L1-based adversarial examples},
184 |   author={Lu, Pei-Hsuan and Chen, Pin-Yu and Chen, Kang-Cheng and Yu, Chia-Mu},
185 |   booktitle={2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W)},
186 |   pages={200--214},
187 |   year={2018},
188 |   organization={IEEE}
189 | }
190 | 
191 | @article{lu2018blimitation,
192 |   title={On the Limitation of Local Intrinsic Dimensionality for Characterizing the Subspaces of Adversarial Examples},
193 |   author={Lu, Pei-Hsuan and Chen, Pin-Yu and Yu, Chia-Mu},
194 |   journal={arXiv preprint arXiv:1803.09638},
195 |   year={2018}
196 | }
197 | 
198 | @article{song2018generative,
199 |   title={Generative Adversarial Examples},
200 |   author={Song, Yang and Shu, Rui and Kushman, Nate and Ermon, Stefano},
201 |   journal={arXiv preprint arXiv:1805.07894},
202 |   year={2018}
203 | }
204 | 
205 | @article{szegedy2013intriguing,
206 |   title={Intriguing properties of neural networks},
207 |   author={Szegedy, Christian and Zaremba, Wojciech and Sutskever, Ilya and Bruna, Joan and Erhan, Dumitru and Goodfellow, Ian and Fergus, Rob},
208 |   journal={arXiv preprint arXiv:1312.6199},
209 |   year={2013}
210 | }
211 | 
212 | @inproceedings{biggio2013evasion,
213 |   title={Evasion attacks against machine learning at test time},
214 |   author={Biggio, Battista and Corona, Igino and Maiorca, Davide and Nelson, Blaine and {\v{S}}rndi{\'c}, Nedim and Laskov, Pavel and Giacinto, Giorgio and Roli, Fabio},
215 |   booktitle={Joint European conference on machine learning and knowledge discovery in databases},
216 |   pages={387--402},
217 |   year={2013},
218 |   organization={Springer}
219 | }
220 | 
221 | @article{madry2017towards,
222 |   title={Towards deep learning models resistant to adversarial attacks},
223 |   author={Madry, Aleksander and Makelov, Aleksandar and Schmidt, Ludwig and Tsipras, Dimitris and Vladu, Adrian},
224 |   journal={arXiv preprint arXiv:1706.06083},
225 |   year={2017}
226 | }
227 | 
228 | @article{goodfellow2014explaining,
229 |   title={Explaining and harnessing adversarial examples (2014)},
230 |   author={Goodfellow, Ian J and Shlens, Jonathon and Szegedy, Christian},
231 |   year={2014},
232 |   journal={arXiv preprint arXiv:1412.6572}
233 | }
234 | 
235 | @article{tramer2017ensemble,
236 |   title={Ensemble adversarial training: Attacks and defenses},
237 |   author={Tram{\`e}r, Florian and Kurakin, Alexey and Papernot, Nicolas and Goodfellow, Ian and Boneh, Dan and McDaniel, Patrick},
238 |   journal={arXiv preprint arXiv:1705.07204},
239 |   year={2017}
240 | }
241 | 
242 | @article{qian2018l2,
243 |   title={L2-Nonexpansive Neural Networks},
244 |   author={Qian, Haifeng and Wegman, Mark N},
245 |   journal={arXiv preprint arXiv:1802.07896},
246 |   year={2018}
247 | }
248 | 
249 | @inproceedings{moosavi2016deepfool,
250 |   title={Deepfool: a simple and accurate method to fool deep neural networks},
251 |   author={Moosavi-Dezfooli, Seyed-Mohsen and Fawzi, Alhussein and Frossard, Pascal},
252 |   booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
253 |   pages={2574--2582},
254 |   year={2016}
255 | }
256 | 
257 | @inproceedings{chen2017zoo,
258 |   title={Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models},
259 |   author={Chen, Pin-Yu and Zhang, Huan and Sharma, Yash and Yi, Jinfeng and Hsieh, Cho-Jui},
260 |   booktitle={Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security},
261 |   pages={15--26},
262 |   year={2017}
263 | }
264 | 
265 | @article{chen2017ead,
266 |   title={EAD: elastic-net attacks to deep neural networks via adversarial examples},
267 |   author={Chen, Pin-Yu and Sharma, Yash and Zhang, Huan and Yi, Jinfeng and Hsieh, Cho-Jui},
268 |   journal={arXiv preprint arXiv:1709.04114},
269 |   year={2017}
270 | }
271 | 
272 | @article{uesato2018adversarial,
273 |   title={Adversarial risk and the dangers of evaluating against weak attacks},
274 |   author={Uesato, Jonathan and O'Donoghue, Brendan and Oord, Aaron van den and Kohli, Pushmeet},
275 |   journal={arXiv preprint arXiv:1802.05666},
276 |   year={2018}
277 | }
278 | 
279 | @article{ilyas2018black,
280 |   title={Black-box Adversarial Attacks with Limited Queries and Information},
281 |   author={Ilyas, Andrew and Engstrom, Logan and Athalye, Anish and Lin, Jessy},
282 |  	journal	= {International Conference on Machine Learning (ICML)},
283 |    	year		= {2018}
284 | }
285 | 
286 | @article{brendel2017decision,
287 |   title={Decision-based adversarial attacks: Reliable attacks against black-box machine learning models},
288 |   author={Brendel, Wieland and Rauber, Jonas and Bethge, Matthias},
289 |   journal={arXiv preprint arXiv:1712.04248},
290 |   year={2017}
291 | }
292 | 
293 | @article{papernot2016transferability,
294 |   title={Transferability in machine learning: from phenomena to black-box attacks using adversarial samples},
295 |   author={Papernot, Nicolas and McDaniel, Patrick and Goodfellow, Ian},
296 |   journal={arXiv preprint arXiv:1605.07277},
297 |   year={2016}
298 | }
299 | 
300 | @inproceedings{papernot2016limitations,
301 | 	title={The limitations of deep learning in adversarial settings},
302 | 	author={Papernot, Nicolas and McDaniel, Patrick and Jha, Somesh and Fredrikson, Matt and Celik, Z Berkay and Swami, Ananthram},
303 | 	booktitle={Security and Privacy (EuroS\&P), 2016 IEEE European Symposium on},
304 | 	pages={372--387},
305 | 	year={2016},
306 | 	organization={IEEE}
307 | }
308 | 
309 | @article{liu2016delving,
310 |   title={Delving into transferable adversarial examples and black-box attacks},
311 |   author={Liu, Yanpei and Chen, Xinyun and Liu, Chang and Song, Dawn},
312 |   journal={arXiv preprint arXiv:1611.02770},
313 |   year={2016}
314 | }
315 | 
316 | @article{he2017adversarial,
317 |   title={Adversarial example defenses: Ensembles of weak defenses are not strong},
318 |   author={He, Warren and Wei, James and Chen, Xinyun and Carlini, Nicholas and Song, Dawn},
319 |   journal={arXiv preprint arXiv:1706.04701},
320 |   year={2017},
321 |   publisher={arXivpreprint}
322 | }
323 | 
324 | @inproceedings{herley2017sok,
325 |   title={Sok: Science, security and the elusive goal of security as a scientific pursuit},
326 |   author={Herley, Cormac and van Oorschot, Paul C},
327 |   booktitle={Security and Privacy (SP), 2017 IEEE Symposium on},
328 |   pages={99--120},
329 |   year={2017},
330 |   organization={IEEE}
331 | }
332 | 
333 | @article{papernot2018cleverhans,
334 |   title={Technical Report on the CleverHans v2.1.0 Adversarial Examples Library},
335 |   author={Nicolas Papernot and Fartash Faghri and Nicholas Carlini and
336 |   Ian Goodfellow and Reuben Feinman and Alexey Kurakin and Cihang Xie and
337 |   Yash Sharma and Tom Brown and Aurko Roy and Alexander Matyasko and
338 |   Vahid Behzadan and Karen Hambardzumyan and Zhishuai Zhang and
339 |   Yi-Lin Juang and Zhi Li and Ryan Sheatsley and Abhibhav Garg and
340 |   Jonathan Uesato and Willi Gierke and Yinpeng Dong and David Berthelot and
341 |   Paul Hendricks and Jonas Rauber and Rujun Long},
342 |   journal={arXiv preprint arXiv:1610.00768},
343 |   year={2018}
344 | }
345 | 
346 | @article{rauber2017foolbox,
347 |   title={Foolbox v0.8.0: A python toolbox to benchmark the robustness of machine learning models},
348 |   author={Rauber, Jonas and Brendel, Wieland and Bethge, Matthias},
349 |   journal={arXiv preprint arXiv:1707.04131},
350 |   year={2017}
351 | }
352 | 
353 | @inproceedings{dahl2013large,
354 |   title={Large-scale malware classification using random projections and neural networks},
355 |   author={Dahl, George E and Stokes, Jack W and Deng, Li and Yu, Dong},
356 |   booktitle={Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on},
357 |   pages={3422--3426},
358 |   year={2013},
359 |   organization={IEEE}
360 | }
361 | 
362 | @article{tramer2018ad,
363 |   title={Ad-versarial: Defeating Perceptual Ad-Blocking},
364 |   author={Tram{\`e}r, Florian and Dupr{\'e}, Pascal and Rusak, Gili and Pellegrino, Giancarlo and Boneh, Dan},
365 |   journal={arXiv preprint arXiv:1811.03194},
366 |   year={2018}
367 | }
368 | 
369 | @inproceedings{carlini2016hidden,
370 |   title={Hidden Voice Commands.},
371 |   author={Carlini, Nicholas and Mishra, Pratyush and Vaidya, Tavish and Zhang, Yuankai and Sherr, Micah and Shields, Clay and Wagner, David and Zhou, Wenchao},
372 |   booktitle={USENIX Security Symposium},
373 |   pages={513--530},
374 |   year={2016}
375 | }
376 | 
377 | @article{kolter2017provable,
378 |   title={Provable defenses against adversarial examples via the convex outer adversarial polytope},
379 |   author={Kolter, J Zico and Wong, Eric},
380 |   journal={arXiv preprint arXiv:1711.00851},
381 |   volume={1},
382 |   number={2},
383 |   pages={3},
384 |   year={2017}
385 | }
386 | 
387 | @article{raghunathan2018certified,
388 |   title={Certified defenses against adversarial examples},
389 |   author={Raghunathan, Aditi and Steinhardt, Jacob and Liang, Percy},
390 |   journal={arXiv preprint arXiv:1801.09344},
391 |   year={2018}
392 | }
393 | 
394 | @article{sinha2018certifying,
395 |   title={Certifying some distributional robustness with principled adversarial training},
396 |   author={Sinha, Aman and Namkoong, Hongseok and Duchi, John},
397 |   journal={International Conference on Learning Representations},
398 |   year={2018}
399 | }
400 | 
401 | @inproceedings{lecuyer2018certified,
402 |   title={Certified robustness to adversarial examples with differential privacy},
403 |   author={Lecuyer, Mathias and Atlidakis, Vaggelis and Geambasu, Roxana and Hsu, Daniel and Jana, Suman},
404 |   booktitle={Certified Robustness to Adversarial Examples with Differential Privacy},
405 |   pages={0},
406 |   year={2018},
407 |   organization={IEEE}
408 | }
409 | 
410 | @inproceedings{katz2017reluplex,
411 |   title={Reluplex: An efficient SMT solver for verifying deep neural networks},
412 |   author={Katz, Guy and Barrett, Clark and Dill, David L and Julian, Kyle and Kochenderfer, Mykel J},
413 |   booktitle={International Conference on Computer Aided Verification},
414 |   pages={97--117},
415 |   year={2017},
416 |   organization={Springer}
417 | }
418 | 
419 | @inproceedings{elsayed2018adversarial,
420 |   title={Adversarial examples that fool both computer vision and time-limited humans},
421 |   author={Elsayed, Gamaleldin and Shankar, Shreya and Cheung, Brian and Papernot, Nicolas and Kurakin, Alexey and Goodfellow, Ian and Sohl-Dickstein, Jascha},
422 |   booktitle={Advances in Neural Information Processing Systems},
423 |   pages={3914--3924},
424 |   year={2018}
425 | }
426 | 
427 | @inproceedings{bhagoji2018practical,
428 |   title={Practical Black-Box Attacks on Deep Neural Networks Using Efficient Query Mechanisms},
429 |   author={Bhagoji, Arjun Nitin and He, Warren and Li, Bo and Song, Dawn},
430 |   booktitle={European Conference on Computer Vision},
431 |   pages={158--174},
432 |   year={2018},
433 |   organization={Springer}
434 | }
435 | 
436 | @article{gilmer2018motivating,
437 |   title={Motivating the rules of the game for adversarial example research},
438 |   author={Gilmer, Justin and Adams, Ryan P and Goodfellow, Ian and Andersen, David and Dahl, George E},
439 |   journal={arXiv preprint arXiv:1807.06732},
440 |   year={2018}
441 | }
442 | 
443 | @article{engstrom2017rotation,
444 |   title={A rotation and a translation suffice: Fooling {CNNs} with simple transformations},
445 |   author={Engstrom, Logan and Tran, Brandon and Tsipras, Dimitris and Schmidt, Ludwig and Madry, Aleksander},
446 |   journal={arXiv preprint arXiv:1712.02779},
447 |   year={2017}
448 | }
449 | 
450 | @article{kurakin2016adversarial,
451 |   title={Adversarial examples in the physical world},
452 |   author={Kurakin, Alexey and Goodfellow, Ian and Bengio, Samy},
453 |   journal={arXiv preprint arXiv:1607.02533},
454 |   year={2016}
455 | }
456 | 
457 | @article{saltzer1975protection,
458 |        title={The protection of information in computer systems},
459 |        author={Saltzer, Jerome H and Schroeder, Michael D},
460 |        journal={Proceedings of the IEEE},
461 |        volume={63},
462 |        number={9},
463 |        pages={1278--1308},
464 |        year={1975},
465 |        publisher={IEEE}
466 | }
467 | 
468 | @article{kerckhoffs1883cryptographic,
469 |        title={La cryptographie militaire},
470 |        author={Kerckhoffs, Auguste},
471 |        journal={Journal des sciences militaires},
472 |        pages={5--38},
473 |        year={1883}
474 | }
475 | 
476 | @inproceedings{papernot2017practical,
477 |        title={Practical black-box attacks against machine learning},
478 |        author={Papernot, Nicolas and McDaniel, Patrick and Goodfellow, Ian and Jha, Somesh and Celik, Z Berkay and Swami, Ananthram},
479 |        booktitle={Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security},
480 |        pages={506--519},
481 |        year={2017},
482 |        organization={ACM}
483 | }
484 | 
485 | @article{cornelius2019efficacy,
486 |   title={The Efficacy of SHIELD under Different Threat Models},
487 |   author={Cornelius, Cory},
488 |   journal={arXiv preprint arXiv:1902.00541},
489 |   year={2019}
490 | }
491 | 
492 | @article{carlini2019ami,
493 |   title={Is AmI (Attacks Meet Interpretability) Robust to Adversarial Examples?},
494 |   author={Carlini, Nicholas},
495 |   journal={arXiv preprint arXiv:1902.02322},
496 |   year={2019}
497 | }
498 | 
499 | @article{schott2018,
500 |   title={Towards the first adversarially robust neural network model on MNIST},
501 |   author={Schott, Lukas and Rauber, Jonas and Bethge, Matthias and Brendel, Wieland},
502 |   journal={International Conference for Learning Representations 2019},
503 |   year={2019}
504 | }
505 | 
506 | @article{ford2019adversarial,
507 |   title={Adversarial Examples Are a Natural Consequence of Test Error in Noise},
508 |   author={Ford, Nic and Gilmer, Justin and Carlini, Nicolas and Cubuk, Dogus},
509 |   journal={arXiv preprint arXiv:1901.10513},
510 |   year={2019}
511 | }
512 | 
513 | @article{hendrycks2018benchmarking,
514 |   title={Benchmarking Neural Network Robustness to Common Corruptions and Perturbations},
515 |   author={Hendrycks, Dan and Dietterich, Thomas},
516 |   journal={International Conference on Learning Representations (ICLR)},
517 |   year={2019}
518 | }
519 | 
520 | @inproceedings{fawzi2016robustness,
521 |   title={Robustness of classifiers: from adversarial to random noise},
522 |   author={Fawzi, Alhussein and Moosavi-Dezfooli, Seyed-Mohsen and Frossard, Pascal},
523 |   booktitle={Advances in Neural Information Processing Systems},
524 |   pages={1632--1640},
525 |   year={2016}
526 | }
527 | 
528 | @inproceedings{jetley2018friends,
529 |   title={With friends like these, who needs adversaries?},
530 |   author={Jetley, Saumya and Lord, Nicholas and Torr, Philip},
531 |   booktitle={Advances in Neural Information Processing Systems},
532 |   pages={10772--10782},
533 |   year={2018}
534 | }
535 | 
536 | @inproceedings{TjengXT19,
537 | 	author={Vincent Tjeng and Kai Xiao and Russ Tedrake},
538 |   	title={Evaluating Robustness of Neural Networks with Mixed Integer Programming},
539 | 	Booktitle	= {International Conference on Learning Representations (ICLR)},
540 |   	Year		= {2019}
541 | }
542 | 
543 | @inproceedings{TuTCLZYHC18,
544 |     title={AutoZOOM: Autoencoder-based Zeroth Order Optimization Method for Attacking Black-box Neural Networks},
545 |     author={Chun-Chen Tu and Paishun Ting and Pin-Yu Chen and Sijia Liu and Huan Zhang and Jinfeng Yi and Cho-Jui Hsieh and Shin-Ming Cheng},
546 |     year={2018},
547 |    Booktitle	= {ArXiv preprint arXiv:1805.11770}
548 | }
549 | 
550 | @inproceedings{IlyasEM18,
551 |   title={Prior Convictions: Black-box Adversarial Attacks with Bandits and Priors},
552 |   author={Ilyas, Andrew and Engstrom, Logan and Madry, Aleksander},
553 |   booktitle	= {International Conference on Learning Representations (ICLR)},
554 |   year={2018}
555 | }
556 | 
557 | @inproceedings{LiuCLS17,
558 |    title={Delving into Transferable Adversarial Examples and Black-box Attacks},
559 |     author={Yanpei Liu and Xinyun Chen and Chang Liu and Dawn Song},
560 |   booktitle	= {International Conference on Learning Representations (ICLR)},
561 |   year={2017}
562 | }
563 | 
564 | @inproceedings{GowalDSBQUAMK18,
565 |  title={On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models},
566 |  author={Sven Gowal and Krishnamurthy Dvijotham and Robert Stanforth and Rudy Bunel and Chongli Qin and Jonathan Uesato and Relja Arandjelovic and Timothy Mann and Pushmeet Kohli},
567 |  booktitle	= {ArXiv preprint arXiv:1810.12715},
568 |   year={2018}
569 | }
570 | 
571 | @inproceedings{WengZCSHBDD18,
572 |   title={Towards fast computation of certified robustness for {ReLU} networks},
573 |   author={Weng, Tsui-Wei and Zhang, Huan and Chen, Hongge and Song, Zhao and Hsieh, Cho-Jui and Boning, Duane and Dhillon, Inderjit S and Daniel, Luca},
574 |   booktitle={International Conference on Machine Learning (ICML)},
575 |   year={2018}
576 | }
577 | 
578 | @inproceedings{XiaoTSM19,
579 | 	  author    = {Kai Y. Xiao and
580 | 	               Vincent Tjeng and
581 | 	               Nur Muhammad Shafiullah and
582 | 	               Aleksander Madry},
583 |   	Title		= {Training for Faster Adversarial Robustness Verification via Inducing
584 |   	               {ReLU} Stability},
585 | 	Booktitle	= {International Conference on Learning Representations (ICLR)},
586 |   	Year		= {2019}
587 | }
588 | 
589 | @InProceedings{AthalyeEIK18,
590 |   title = 	 {Synthesizing Robust Adversarial Examples},
591 |   author = 	 {Athalye, Anish and Engstrom, Logan and Ilyas, Andrew and Kwok, Kevin},
592 |   booktitle = 	 {International Conference on Machine Learning (ICML)},
593 |   pages = 	 {284--293},
594 |   year = 	 {2018}
595 | }
596 | 
597 | 


--------------------------------------------------------------------------------
/paper.tex:
--------------------------------------------------------------------------------
   1 | \documentclass{article} % For LaTeX2e
   2 | \usepackage{iclr2019_conference}
   3 | \usepackage{natbib}
   4 | \usepackage{enumitem,amssymb}
   5 | \usepackage{amsmath}
   6 | \usepackage{amsfonts}
   7 | \usepackage{marginnote}
   8 | \usepackage{graphicx}
   9 | \usepackage[T1]{fontenc}
  10 | \usepackage{ifthen}
  11 | \usepackage{url}
  12 | %\usepackage{titling}
  13 | \usepackage{fancyhdr}
  14 | \usepackage{hyperref}
  15 | 
  16 | \newcommand\hr{\par\vspace{-.5\ht\strutbox}\noindent\hrulefill\par}
  17 | 
  18 | \input{version}
  19 | \pdfinfo{
  20 |     /Revision (\version)
  21 | }
  22 | 
  23 | %\title{Recommendations for Evaluating \\ Adversarial Example Defenses}
  24 | %\title{Advice on Evaluating \\ Adversarial Example Defenses}
  25 | %\title{Advice on Adversarial Example \\ Defense Evaluations}
  26 | %\title{Recommendations and a Checklist (v0) for Evaluating Adversarial Example Defenses}
  27 | 
  28 | \iclrfinalcopy % Uncomment for camera-ready version, but NOT for submission.
  29 | \begin{document}
  30 | 
  31 | 
  32 | \title{\vspace{5em}On Evaluating Adversarial Robustness}
  33 | \fancyhead[R]{On Evaluating Adversarial Robustness}
  34 | \thispagestyle{empty}
  35 | \maketitle
  36 | 
  37 |   Nicholas Carlini\textsuperscript{1},
  38 |   Anish Athalye\textsuperscript{2},
  39 |   Nicolas Papernot\textsuperscript{1},
  40 |   Wieland Brendel\textsuperscript{3},
  41 |   Jonas Rauber\textsuperscript{3},
  42 |   Dimitris Tsipras\textsuperscript{2},
  43 |   Ian Goodfellow\textsuperscript{1},
  44 |   Aleksander M\k{a}dry\textsuperscript{2},
  45 |   Alexey Kurakin\textsuperscript{1}\,\textsuperscript{*}
  46 |   \\
  47 |   \noindent \\
  48 |   \noindent \\
  49 |   \textsuperscript{1} Google Brain
  50 |   \textsuperscript{2} MIT 
  51 |   \textsuperscript{3} University of T\"ubingen
  52 |   \\
  53 |   \noindent \\
  54 |   \noindent \\
  55 |   \textsuperscript{*} List of authors is dynamic and subject to change.
  56 |   Authors are ordered according to the amount of their contribution
  57 |   to the text of the paper.
  58 | 
  59 | 
  60 | \vspace{20em}
  61 | Please direct correspondence to the GitHub repository \\
  62 | \url{https://github.com/evaluating-adversarial-robustness/adv-eval-paper} \\
  63 | \noindent \\
  64 | Last Update: \today\ (revision \texttt{\version}).
  65 | \newpage
  66 | 
  67 | \reversemarginpar
  68 | 
  69 |   \vspace{-1em}
  70 | \begin{abstract}
  71 |     \input{abstract.tex}
  72 | \end{abstract}
  73 | 
  74 | 
  75 | \section{Introduction}
  76 | 
  77 | Adversarial examples \citep{szegedy2013intriguing,biggio2013evasion},
  78 | inputs that are specifically designed by an
  79 | adversary to force a machine learning system to produce erroneous outputs, 
  80 | have seen significant study in
  81 | recent years.
  82 | %
  83 | This long line of research
  84 | \citep{dalvi2004adversarial,lowd2005adversarial,barreno2006can,barreno2010security,globerson2006nightmare,kolcz2009feature,barreno2010security,biggio2010multiple,vsrndic2013detection}
  85 | has recently begun seeing significant study as machine learning
  86 | becomes more widely used.
  87 | %
  88 | While attack research (the study of
  89 | adversarial examples on new domains or under new threat
  90 | models) has flourished,
  91 | progress on defense%
  92 | \footnote{This paper uses the word ``defense'' with the
  93 | understanding that there are non-security
  94 | motivations for constructing machine learning
  95 | algorithms that are robust to attacks
  96 | (see Section~\ref{sec:defense_research_motivation});
  97 | we use this consistent terminology for simplicity.}
  98 | research (i.e., building systems that
  99 | are robust to adversarial examples) has been comparatively slow.
 100 | 
 101 | More concerning than the fact that progress is slow is the fact that
 102 | most proposed defenses are quickly shown to have performed
 103 | incorrect or incomplete evaluations
 104 | \citep{carlini2016defensive,carlini2017towards,brendel2017comment,carlini2017adversarial,he2017adversarial,carlini2017magnet,athalye2018obfuscated,engstrom2018evaluating,athalye2018robustness,uesato2018adversarial,mosbach2018logit,he2018decision,sharma2018bypassing,lu2018limitation,lu2018blimitation,cornelius2019efficacy,carlini2019ami}.
 105 | %
 106 | As a result, navigating the field and identifying genuine progress becomes particularly hard.
 107 | 
 108 | Informed by these recent results, this paper provides practical advice
 109 | for evaluating defenses that are intended to be robust to adversarial examples.
 110 | %
 111 | This paper is split roughly in two:
 112 | \begin{itemize}
 113 | \item \S\ref{sec:doing_good_science}: \emph{Principles for performing defense evaluations.}
 114 |   We begin with a discussion of the basic principles and methodologies that should guide defense evaluations.
 115 | \item \S\ref{sec:dont_do_bad_science}--\S\ref{sec:analysis}: \emph{A specific checklist for avoiding common evaluation pitfalls.}
 116 |   We have seen evaluations fail for many reasons; this checklist outlines
 117 |   the most common errors we have seen in defense evaluations so they can be
 118 |   avoided.
 119 | \end{itemize}
 120 | %
 121 | We hope this advice will be useful to both
 122 | those building defenses (by proposing evaluation methodology and
 123 | suggesting experiments that should be run)
 124 | as well as readers or reviewers of defense papers (to identify potential
 125 | oversights in a paper's evaluation).
 126 | 
 127 | We intend for this to be a living document.
 128 | %
 129 | The LaTeX source for the paper is available at
 130 | \url{https://github.com/evaluating-adversarial-robustness/adv-eval-paper} and we encourage researchers to participate
 131 | and further improve this paper.
 132 | 
 133 | \newpage
 134 | 
 135 | \section{Principles of Rigorous Evaluations}
 136 | \label{sec:doing_good_science}
 137 | 
 138 | \subsection{Defense Research Motivation}
 139 | \label{sec:defense_research_motivation}
 140 | 
 141 | Before we begin discussing our recommendations for performing defense evaluations,
 142 | it is useful to briefly consider \emph{why} we are performing the evaluation in
 143 | the first place.
 144 | %
 145 | While there are many valid reasons to study defenses to adversarial examples, below are the
 146 | three common reasons why one might be interested in evaluating the
 147 | robustness of a machine learning model.
 148 | 
 149 | \begin{itemize}
 150 | 
 151 | \item \textbf{To defend against an adversary who will attack the system.}
 152 |   %
 153 |   Adversarial examples are a security concern.
 154 |   %
 155 |   Just like any new technology not designed with security in mind,
 156 |   when deploying a machine learning system in the real-world,
 157 |   there will be adversaries who wish to cause harm as long as there
 158 |   exist incentives (i.e., they benefit from the system misbehaving).
 159 |   %
 160 |   Exactly what this
 161 |   harm is and how the adversary will go about causing it depends on the details of
 162 |   the domain and the adversary considered.
 163 |   %
 164 |   For example, an attacker may wish to cause a self-driving car to
 165 |   incorrectly recognize road signs\footnote{While this threat model is often repeated
 166 |   	in the literature, it may have limited impact for
 167 |   	real-world adversaries, who in practice may have
 168 |         have little financial motivation to
 169 |   	cause harm to self-driving cars.}~\citep{papernot2016limitations},
 170 |   cause an NSFW detector to incorrectly
 171 |   recognize an image as safe-for-work~\citep{bhagoji2018practical},
 172 |   cause a malware (or spam) classifier
 173 |   to identify a malicious file (or spam email) as benign~\citep{dahl2013large}, cause an
 174 |   ad-blocker to incorrectly identify an advertisement as natural content~\citep{tramer2018ad},
 175 |   or cause a digital assistant to incorrectly recognize commands it is given \citep{carlini2016hidden}.
 176 | 
 177 | \item \textbf{To test the worst-case robustness of machine learning algorithms.}
 178 |   %
 179 |   Many real-world environments have inherent randomness that is difficult to
 180 |   predict.
 181 |   %
 182 |   By analyzing the robustness of a model from the perspective of an
 183 |   adversary, we can estimate the \emph{worst-case} robustness
 184 |   in a real-world setting.
 185 |   %
 186 |   Through random testing,
 187 |   it can be difficult to distinguish a system that fails
 188 |   one time in a billion from a system that never fails: even when evaluating
 189 |   such a system on a million choices of randomness, there is just under $0.1\%$ chance
 190 |   to detect a failure case.
 191 | 
 192 |   However, analyzing the worst-case robustness can discover a difference.
 193 |   %
 194 |   If a powerful adversary who is intentionally trying to cause a system to misbehave (according
 195 |   to some definition) cannot
 196 |   succeed, then we have strong evidence that the system will not misbehave due to
 197 |   \emph{any} unforeseen randomness.
 198 | 
 199 | \item \textbf{To measure progress of machine learning algorithms
 200 |   towards human-level abilities.}
 201 |   To advance machine learning algorithms it is important to understand where they fail.
 202 |   %
 203 |   In terms of performance,
 204 |   the gap between humans and machines is quite small on many widely
 205 |   studied problem domains, including reinforcement
 206 |   learning (e.g., Go and Chess \citep{silver2016mastering})
 207 |   or natural image classification \citep{krizhevsky2012imagenet}.
 208 |   In terms of adversarial robustness, however, the gap
 209 |   between humans and machines is astonishingly large: even in
 210 |   settings where machine learning achieves
 211 |   super-human accuracy, an adversary can often introduce perturbations that
 212 |   reduce their accuracy to levels of random guessing and far below the
 213 |   accuracy of even the most uninformed human.%
 214 |   \footnote{Note that time-limited humans appear
 215 |     vulnerable to some forms of adversarial examples~\citep{elsayed2018adversarial}.}
 216 |   %
 217 |   This suggests a fundamental difference of the
 218 |   decision-making process of humans and machines.
 219 |   %
 220 |   From this point of
 221 |   view, adversarial robustness is a measure of progress in machine
 222 |   learning that is orthogonal to performance.
 223 | 
 224 | \end{itemize}
 225 | 
 226 | The motivation for why the research was conducted informs the
 227 | methodology through which it should be evaluated:
 228 | a paper that sets out to prevent a real-world adversary from fooling a
 229 | specific spam detector assuming the adversary can not directly access
 230 | the underlying model will have a very different evaluation than one that
 231 | sets out to measure the worst-case robustness of a self-driving car's vision system.
 232 | 
 233 | This paper therefore does not (and could not) set out to provide a definitive
 234 | answer for how all evaluations should be performed.
 235 | %
 236 | Rather, we discuss methodology that we believe is common to most evaluations.
 237 | %
 238 | Whenever we provide recommendations that may not apply to some class of
 239 | evaluations, we state this fact explicitly.
 240 | %
 241 | Similarly, for
 242 | advice we believe holds true universally, we discuss why this is the case,
 243 | especially when it may not be obvious at first.
 244 | 
 245 | The remainder of this section provides an overview of the basic methodology
 246 | for a defense evaluation.
 247 | 
 248 | 
 249 | \subsection{Threat Models}
 250 | \label{sec:threatmodel}
 251 | A threat model specifies the conditions under which a defense
 252 | is designed to be secure and the precise security guarantees provided;
 253 | it is an integral component of the defense itself.
 254 | 
 255 | Why is it important to have a threat model? In the context of a defense where
 256 | the purpose is motivated by security, the threat model outlines what type of
 257 | actual attacker the defense intends to defend against, guiding the evaluation of the defense.
 258 | 
 259 | However, even in the context of a defense motivated by reasons beyond
 260 | security, a threat model is necessary for evaluating the performance of the defense.
 261 | %
 262 | One of the defining
 263 | properties of scientific research is that it is \emph{falsifiable}:
 264 | there must exist an experiment that can contradict its claims.
 265 | %
 266 | Without a threat model, defense proposals are often either not falsifiable
 267 | or trivially falsifiable.
 268 | 
 269 | Typically, a threat model includes a set of assumptions about the adversary's \textit{goals},
 270 | \textit{knowledge}, and  \textit{capabilities}.
 271 | %
 272 | Next, we briefly describe each.
 273 | 
 274 | \subsubsection{Adversary goals}
 275 | How should we define an \textbf{adversarial example}?
 276 | %
 277 | At a high level, adversarial examples can be defined as inputs
 278 | specifically designed to force a machine learning system to
 279 | produce erroneous outputs.
 280 | %
 281 | However, the precise goal of an adversary can vary significantly
 282 | across different settings.
 283 | 
 284 | For example, in some cases the adversary's goal may be to simply
 285 | cause misclassification---any input being misclassified
 286 | represents a successful attack.
 287 | %
 288 | Alternatively,
 289 | the adversary may be interested in having the model misclassify certain
 290 | examples from a \textit{source} class into a \textit{target} class of their choice.
 291 | %
 292 | This has been referred to a \textit{source-target} misclassification attack~\citep{papernot2016limitations}
 293 | or \textit{targeted} attack~\citep{carlini2017towards}.
 294 | 
 295 | In other settings, only specific types of misclassification may be interesting.
 296 | %
 297 | In the space of malware detection, defenders
 298 | may only care about the specific source-target class pair where an adversary
 299 | causes a malicious program to be misclassified as benign; causing a benign program
 300 | to be misclassified as malware may be uninteresting.
 301 | 
 302 | \subsubsection{Adversarial capabilities}
 303 | In order to build meaningful defenses, we need to impose reasonable constraints to the attacker.
 304 | %
 305 | An unconstrained attacker who wished to cause harm may, for example,
 306 | cause bit-flips on the
 307 | weights of the neural network, cause errors in the data processing pipeline,
 308 | backdoor the machine learning model, or (perhaps more relevant) introduce
 309 | large perturbations to an image that would alter its semantics.
 310 | %
 311 | Since such attacks are outside the scope of defenses adversarial examples,
 312 | restricting the adversary is necessary for designing defenses that are
 313 | not trivially bypassed by unconstrained adversaries.
 314 | 
 315 | To date, most defenses to adversarial examples typically restrict the adversary to
 316 | making ``small'' changes to inputs from the data-generating
 317 | distribution (e.g. inputs from the test set).
 318 | %
 319 | Formally, for some natural input $x$ and similarity metric
 320 | $\mathcal{D}$, $x'$ is considered a valid adversarial example if
 321 | $\mathcal{D}(x, x')\leq \epsilon$ for some small $\epsilon$ and $x'$ is
 322 | misclassified\footnote{
 323 |   It is often required that the original input $x$ is classified
 324 |   correctly, but this requirement can vary across papers.
 325 |   Some papers consider $x'$ an adversarial example as long as it
 326 |   is classified \emph{differently} from $x$.}.
 327 | %
 328 | This definition is motivated by the assumption that small changes under
 329 | the metric $\mathcal{D}$ do not change the true class of the input and
 330 | thus should not cause the classifier to predict an erroneous class.
 331 | 
 332 | A common choice for $\mathcal{D}$, especially for the case of image
 333 | classification, is defining it as the $\ell_p$-norm between two
 334 | inputs for some $p$.
 335 | %
 336 | (For instance, an $\ell_\infty$-norm constraint of $\epsilon$ for image
 337 | classification implies that the adversary cannot modify any individual
 338 | pixel by more than $\epsilon$.)
 339 | However, a suitable choice of $\mathcal{D}$ and $\epsilon$
 340 | may vary significantly based on the particular task.
 341 | %
 342 | For example, for a task with binary features one may wish to
 343 | study $\ell_0$-bounded adversarial examples more closely
 344 | than $\ell_\infty$-bounded ones.
 345 | %
 346 | Moreover, restricting adversarial perturbations to be small may not
 347 | always be important: in the case of malware detection, what is
 348 | required is that the adversarial program preserves the malware
 349 | behavior while evading ML detection.
 350 | 
 351 | Nevertheless, such a rigorous and precise definition of the
 352 | adversary's capability, leads to well-defined measures of
 353 | adversarial robustness that are, in principle, computable.
 354 | %
 355 | For example, given a model $f(\cdot)$, one common way to define
 356 | robustness is the worst-case loss $L$ for a given perturbation budget,
 357 | \[
 358 | \mathop{\mathbb{E}}_{(x,y) \sim \mathcal{X}}\bigg[
 359 |   \max_{x' : \mathcal{D}(x,x') < \epsilon} L\big(f(x'),y\big) \bigg].
 360 | \]
 361 | Another commonly adopted definition is the average (or median)
 362 | minimum-distance of the adversarial perturbation,
 363 | \[
 364 | \mathop{\mathbb{E}}_{(x,y) \sim \mathcal{X}}\bigg[
 365 |   \min_{x' \in A_{x,y}} \mathcal{D}(x,x') \bigg],
 366 | \]
 367 | where $A_{x,y}$ depends on the definition of \emph{adversarial example},
 368 | e.g. $A_{x,y} = \{x' \mid f(x') \ne y\}$ for misclassification
 369 | or $A_{x,y} = \{x \mid f(x') = t\}$ for some target class $t$.
 370 | 
 371 | A key challenge of security evaluations is that while this
 372 | \emph{adversarial risk}~\citep{madry2017towards,uesato2018adversarial}
 373 | is often computable in theory (e.g. with optimal attacks or brute force enumeration of the considered perturbations),
 374 | it is usually intractable to compute exactly, and therefore
 375 |  in practice we must approximate this quantity.
 376 | %
 377 | This difficulty is at the heart of why evaluating worst-case robustness is difficult:
 378 | while evaluating average-case robustness is often as simple as sampling a few
 379 | hundred (or thousand) times from the distribution and computing the mean, such
 380 | an approach is not possible for worst-case robustness.
 381 | 
 382 | It is customary for defenses not to impose any computational bounds on an attacker 
 383 | (i.e., the above definitions of adversarial risk consider only the \emph{existence} of adversarial 
 384 | examples, and not the difficulty of finding them). We believe that restricting an adversary's 
 385 | computational power could be interesting if we could make formal statements about the 
 386 | computational cost of finding adversarial examples (e.g., via a reduction to a concrete hardness 
 387 | assumption as is done in cryptography). We are not aware of any defense that currently achieves 
 388 | this. Stronger restrictions (e.g., the adversary is limited to attacks with $100$ iterations) are 
 389 | usually uninteresting in that they cannot be meaningfully enforced and there is often no 
 390 | economic reason for an adversary not to spend a little more time to succeed.
 391 | 
 392 | Finally, a common, often implicit, assumption in adversarial example
 393 | research is that the adversary has direct access to the model's input
 394 | features: e.g., in the image domain, the adversary directly manipulates the image pixels.
 395 | %
 396 | However, in certain domains, such as malware detection or language
 397 | modeling, these features can be difficult to reverse-engineer.
 398 | %
 399 | As a result, different assumptions on the capabilities of the
 400 | adversary can significantly impact the evaluation of a defense's effectiveness.
 401 | 
 402 | \paragraph{Comment on $\ell_p$-norm-constrained threat models.}
 403 | %
 404 | A large body of work studies a threat model where the adversary is 
 405 | constrained to $\ell_p$-bounded perturbations.
 406 | %
 407 | This threat model is highly limited and does not perfectly match real-world
 408 | threats~\citep{engstrom2017rotation,gilmer2018motivating}.
 409 | %
 410 | However, the well-defined nature of this threat model is helpful
 411 | for performing principled work towards building strong defenses.
 412 | %
 413 | While $\ell_p$-robustness does not imply robustness in more realistic
 414 | threat models, it is almost certainly the case that lack of robustness
 415 | against $\ell_p$-bounded perturbation will imply lack of robustness in more
 416 | realistic threat models.
 417 | %
 418 | Thus, working towards solving robustness
 419 | for these well-defined $\ell_p$-bounded threat models
 420 | is a useful exercise.
 421 | 
 422 | \subsubsection{Adversary knowledge.}
 423 | A threat model clearly describes what knowledge the adversary
 424 | is assumed to have.
 425 | %
 426 | Typically, works assume either white-box  (complete knowledge
 427 | of the model and its parameters) or black-box access (no knowledge of the model)
 428 | with varying degrees of black-box access
 429 | (e.g., a limited number of queries to the model, access to the
 430 | predicted probabilities or just the predicted class,
 431 | or access to the training data).
 432 | 
 433 | In general, the guiding principle of a defense's threat model
 434 | is to assume that the adversary has complete 
 435 | knowledge of the inner workings of the
 436 | defense.
 437 | %
 438 | It is not reasonable to assume the defense
 439 | algorithm can be held secret, even in black-box threat models.
 440 | %
 441 | This widely-held principle is known in the field of security as Kerckhoffs'
 442 | principle~\citep{kerckhoffs1883cryptographic}, and the opposite is known as
 443 | ``security through obscurity''.
 444 | %
 445 | The open design of security mechanisms is
 446 | a cornerstone of the field of cryptography~\citep{saltzer1975protection}.
 447 | %
 448 | This paper discusses only how to perform white-box evaluations, which implies
 449 | robustness to black-box adversaries,
 450 | but not the other way around. 
 451 | 
 452 | \paragraph{Holding Data Secret.}
 453 | %
 454 | While it can be acceptable to hold some limited amount of information
 455 | secret, the defining characteristic of a white-box evaluation (as
 456 | we discuss in this paper) is that the threat model assumes
 457 | the attacker has \textbf{full knowledge} of the underlying system.
 458 | 
 459 | That does not mean that all information has to be available to the
 460 | adversary---it can be acceptable for the defender to hold a small
 461 | amount of information secret.
 462 | %
 463 | The field of cryptography, for example, is built around the idea that
 464 | one \emph{can} keep secret
 465 | the encryption keys, but the underlying algorithm is be assumed to be public.
 466 | 
 467 | A defense that holds values secret should justify that it is reasonable to do so.
 468 | %
 469 | In particular, secret information generally
 470 | satisfies at least the following two properties:
 471 | \begin{enumerate}
 472 | \item \emph{The secret must be easily replaceable.} 
 473 |   That is, there should be an efficient algorithm to generate a new secret
 474 |   if the prior one happened to be leaked.
 475 | \item \emph{The secret must be nonextractable.} An adversary who is
 476 |   allowed to query the system should not be able to extract any information
 477 |   about the secret.
 478 | \end{enumerate}
 479 | 
 480 | For example, a defense that includes randomness (chosen fresh) at inference
 481 | time is using secret information not available to the adversary.
 482 | %
 483 | As long as the distribution is known, this follows Kerckhoffs' principle.
 484 | %
 485 | On the other hand, if a single fixed random vector was added to the output
 486 | of the neural network after classifying an input, this would not be a good
 487 | candidate for a secret.
 488 | %
 489 | By subtracting the observed output of the model with
 490 | the expected output, the secret can be easily determined.
 491 | 
 492 | 
 493 | \subsection{Restrict Attacks to the Defense's Threat Model}
 494 | Attack work should always evaluate defenses under the
 495 | threat model the defense states.
 496 | %
 497 | For example, if a defense paper explicitly states
 498 | ``we intend to be robust to $L_2$ attacks of norm no greater than
 499 | 1.5'', an attack paper must restrict its demonstration of vulnerabilities
 500 | in the defense to the generation of adversarial
 501 | examples with $L_2$ norm less than 1.5. Showing something different,
 502 | e.g., adversarial examples with $L_\infty$ norm less than $0.1$,
 503 | is important and useful research%
 504 | \footnote{See for example the work of \cite{sharma2017breaking,song2018generative}
 505 |   who explicitly step outside of the threat model of the original defenses
 506 |   to evaluate their robustness.} (because it teaches the research community
 507 | something that was not previously known, namely, that this system may have
 508 | limited utility in practice), but is not a
 509 | \emph{break} of the defense: the defense never claimed to be robust to
 510 | this type of attack.
 511 | 
 512 | 
 513 | \subsection{Skepticism of Results}
 514 | When performing scientific research one must be skeptical of
 515 | all results.
 516 | %
 517 | As Feynman concisely put it, ``the first principle [of research] is that you
 518 | must not fool yourself---and you are the easiest person to fool.''
 519 | %
 520 | This is never more true than when considering security evaluations.
 521 | %
 522 | After spending significant effort to try and develop a defense
 523 | that is robust against attacks, it is easy to assume that the
 524 | defense is indeed robust, especially when baseline attacks
 525 | fail to break the defense.
 526 | %
 527 | However, at this time the authors need to completely switch
 528 | their frame of mind and try as hard as possible to show their
 529 | proposed defense is ineffective.%
 530 | \footnote{One of the reasons it is so easy to accidentally fool oneself in security
 531 |   is that mistakes are very difficult to catch. Very often attacks only fail
 532 |   because of a (correctable) error in how they are being applied. It has to be the
 533 |   objective of the defense researcher to ensure that, when attacks fail, it is
 534 |   because the defense is correct, and not because of an error in applying
 535 |   the attacks.}
 536 | 
 537 | Adversarial robustness is a negative goal -- for a
 538 | defense to be truly effective, one needs to show that \emph{no attack} can bypass it.
 539 | %
 540 | It is only that by failing to show the defense is ineffective to
 541 | adaptive attacks (see below) that we can
 542 | believe it will withstand future attack by a motivated adversary (or,
 543 | depending on the motivation of the research, that the claimed lower bound is
 544 | in fact an actual lower bound).
 545 | 
 546 | 
 547 | 
 548 | \subsection{Adaptive Adversaries}
 549 | \label{sec:adaptive}
 550 | 
 551 | After a specific threat model has been defined, the remainder of the evaluation
 552 | focuses on \emph{adaptive adversaries}\footnote{We use the word ``adaptive
 553 |   adversary'' (and ``adaptive attack'') to refer to the general notion in
 554 |   security of an adversary (or attack, respectively)
 555 |   that \emph{adapts} to what the defender has done \citep{herley2017sok,carlini2017adversarial}.}
 556 | which are adapted to the specific details of the defense and attempt to invalidate
 557 | the robustness claims that are made.
 558 | 
 559 | This evaluation is the most important section of any paper that develops a
 560 | defense.
 561 | %
 562 | After the defense has been defined, ask: \emph{what attack could possibly defeat this
 563 |   defense?} All attacks that might work must be shown to be ineffective.
 564 | %
 565 | An evaluation that
 566 | does not attempt to do this is fundamentally flawed.
 567 | 
 568 | Just applying existing adversarial attacks with default hyperparameters
 569 | is not sufficient, even if these attacks
 570 | are state-of-the-art: all existing attacks and hyperparameters
 571 | have been adapted to and tested only against 
 572 | \emph{existing} defenses, and there is a good chance these attacks
 573 | will work sub-optimally or even fail against a new defense.
 574 | %
 575 | A typical example is gradient masking~\citep{tramer2017ensemble},
 576 | in which defenses manipulate the model's gradients and thus prevent
 577 | gradient-based attacks from succeeding.
 578 | %
 579 | However, an adversary aware of
 580 | the defense may recover these gradients through a black-box input-label queries, as
 581 | shown by~\citet{papernot2017practical}, or through a different loss
 582 | function, as demonstrated by~\cite{athalye2018obfuscated}.
 583 | %
 584 | In other words, gradient masking may make optimization-based attacks
 585 | fail but that does not mean that the space of adversarial perturbations decreased.
 586 | 
 587 | Defending against non-adaptive attacks is necessary but not sufficient.
 588 | %
 589 | It is our firm belief that \textbf{an evaluation against non-adaptive
 590 |   attacks is of very limited utility}.
 591 | 
 592 | Along the same lines, there is no justification to study a ``zero-knowledge''
 593 | \citep{biggio2013evasion} threat model where the attacker
 594 | is not aware of the defense.
 595 | %
 596 | ``Defending'' against such an adversary is
 597 | an absolute bare-minimum that in no way suggests a defense will be effective
 598 | to further attacks. \cite{carlini2017adversarial} considered this scenario only to demonstrate
 599 | that some defenses were completely ineffective even against this very weak threat model.
 600 | %
 601 | The authors of that work now regret not making this explicit and
 602 | discourage future work from citing this paper in support of the zero-knowledge threat model.
 603 | 
 604 | 
 605 | It is crucial to actively attempt to defeat the specific defense being proposed.
 606 | %
 607 | On the most fundamental level
 608 | this should include a range of sufficiently different attacks with carefully tuned
 609 | hyperparameters.
 610 | %
 611 | But the analysis should go deeper than that:
 612 | ask why the defense might prevent existing attacks
 613 | from working optimally and how to customize existing
 614 | attacks or how to design completely new adversarial attacks to
 615 | perform as well as possible.
 616 | %
 617 | That is, applying the same mindset that a
 618 | future adversary would apply is the only way to show
 619 | that a defense might be able to withstand the test of time.
 620 | 
 621 | These arguments apply independent of the specific motivation of
 622 | the robustness evaluation: security, worst-case bounds
 623 | or human-machine gap all need a sense of the maximum vulnerability
 624 | of a given defense.
 625 | %
 626 | In all scenarios we should assume
 627 | the existence of an ``infinitely thorough'' adversary who will spend whatever time is
 628 | necessary to develop the optimal attack.
 629 | 
 630 | 
 631 | \subsection{Reproducible Research: Code \& Pre-trained Models}
 632 | \label{sec:releasecode}
 633 | 
 634 | Even the most carefully-performed robustness evaluations can have
 635 | subtle but fundamental flaws.
 636 | %
 637 | We strongly believe that releasing full source code and pre-trained models is one of the most
 638 | useful methods for ensuring the eventual correctness of an evaluation.
 639 | %
 640 | Releasing source code makes it much more likely that others will be able to
 641 | perform their own analysis of the defense.\footnote{In their analysis
 642 |   of the ICLR 2018 defenses \citep{athalye2018obfuscated}, the
 643 |   authors spent five times longer re-implementing the defenses than
 644 |   performing the security evaluation of the re-implementations.}
 645 | %
 646 | Furthermore, completely specifying all defense details in a paper can be
 647 | difficult, especially in the typical 8-page limit of many
 648 | conference papers.
 649 | %
 650 | The source code for a defense can be seen as the definitive
 651 | reference for the algorithm.
 652 | 
 653 | It is equally important to release pre-trained models, especially when
 654 | the resources that would be required to train a model would be prohibitive
 655 | to some researchers with limited compute resources.
 656 | %
 657 | The code and model that is released should be the model that was used
 658 | to perform the evaluation in the paper to the extent permitted by
 659 | underlying frameworks for accelerating numerical computations
 660 | performed in machine learning.
 661 | %
 662 | Releasing a \emph{different}
 663 | model than was used in the paper makes it significantly less useful,
 664 | as any comparisons against the paper may not be identical.
 665 | 
 666 | Finally, it is helpful if the released code contains a simple one-line
 667 | script which will run the full defense end-to-end on the given input.
 668 | %
 669 | Note that this is often different than what the defense developers want,
 670 | who often care most about performing the evaluation as efficiently as
 671 | possible.
 672 | %
 673 | In contrast, when getting started with evaluating a defense (or to
 674 | confirm any results), it is
 675 | often most useful to have a simple and correct method for running the
 676 | full defense over an input.
 677 | 
 678 | There are several frameworks such as CleverHans~\citep{papernot2018cleverhans} or Foolbox~\citep{rauber2017foolbox}
 679 | as well as websites\footnote{\url{https://robust-ml.org}}\textsuperscript{,}%
 680 | \footnote{\url{https://robust.vision/benchmark/leaderboard/}}\textsuperscript{,}%
 681 | \footnote{\url{https://foolbox.readthedocs.io/en/latest/modules/zoo.html}}
 682 | which have been developed to assist in this process.
 683 | 
 684 | 
 685 | \section{Specific Recommendations: Evaluation Checklist}
 686 | \label{sec:dont_do_bad_science}
 687 | \label{sec:pleaseactuallythink}
 688 | While the above overview is general-purpose advice we believe will stand the
 689 | test of time, it can be difficult to extract specific, actionable items
 690 | from it.
 691 | %
 692 | To help researchers \emph{today} perform more thorough evaluations,
 693 | we now develop a checklist that lists common evaluation
 694 | pitfalls when evaluating adversarial robustness. Items in this list are
 695 | sorted (roughly) into three categories.
 696 | 
 697 | The items contained below are \textbf{neither necessary nor sufficient}
 698 | for performing a complete adversarial example evaluation, and are
 699 | intended to list common evaluation flaws.
 700 | %
 701 | There likely exist completely ineffective defenses which satisfy all of
 702 | the below recommendations; conversely, some of the strongest defenses
 703 | known today do \emph{not} check off all the boxes below (e.g.\ \citet{madry2017towards}).
 704 | 
 705 | We encourage readers to be extremely careful and \textbf{not directly follow
 706 |   this list} to perform an evaluation or decide if an evaluation that has been
 707 | performed is sufficient.
 708 | %
 709 | Rather, this list contains common flaws that are worth checking for to
 710 | identify potential evaluation flaws.
 711 | %
 712 | Blindly following the checklist without careful thought will likely be counterproductive:
 713 | each item in the list must be taken into consideration within the context
 714 | of the specific defense being evaluated. 
 715 | %
 716 | Each item on the list below is present because we are aware of several defense
 717 | evaluations which were broken and following that specific recommendation would have
 718 | revealed the flaw.
 719 | %
 720 | We hope this list will be taken as a collection of recommendations that may
 721 | or may not apply to a particular defense, but have been useful in the past.
 722 | 
 723 | This checklist is a living document that lists the most common evaluation
 724 | flaws as of \today. We expect the evaluation flaws that are
 725 | common today will \emph{not} be the most common flaws in the future.
 726 | %
 727 | We intend to keep this checklist up-to-date with the latest recommendations
 728 | for evaluating defenses by periodically updating its contents.
 729 | Readers should check the following URL  for the most recent
 730 | revision of the checklist:
 731 | \url{https://github.com/evaluating-adversarial-robustness/adv-eval-paper}.
 732 | 
 733 | 
 734 | 
 735 | \subsection{Common Severe Flaws}
 736 | There are several common severe evaluation flaws which have the
 737 | potential to completely invalidate any robustness claims.
 738 | %
 739 | Any evaluation which contains errors on any of the following
 740 | items is likely to have fundamental and irredeemable flaws.
 741 | %
 742 | Evaluations which intentionally deviate from the advice here may wish to
 743 | justify the decision to do so.
 744 | 
 745 | \begin{itemize}[leftmargin=*]
 746 |   \item \S\ref{sec:pleaseactuallythink} \textbf{Do not mindlessly follow this list}; make sure to still think about the evaluation.
 747 | \item \S\ref{sec:threatmodel} \textbf{State a precise threat model} that the defense is supposed to be effective under.
 748 |   \begin{itemize}[leftmargin=*]
 749 |   \item The threat model assumes the attacker knows how the defense works.
 750 |   \item The threat model states attacker's goals, knowledge and capabilities.
 751 |   \item For security-justified defenses, the threat model realistically models some adversary.
 752 |   \item For worst-case randomized defenses, the threat model captures the perturbation space.
 753 |   \item Think carefully and justify any $\ell_p$ bounds placed on the adversary.
 754 |   \end{itemize}
 755 | \item \S\ref{sec:adaptive} Perform \textbf{adaptive attacks} to give an upper bound of robustness.
 756 |   \begin{itemize}[leftmargin=*]
 757 |   \item The attacks are given access to the full defense, end-to-end.
 758 |   \item The loss function is changed as appropriate to cause misclassification.
 759 |   \item \S\ref{sec:whichattack} \textbf{Focus on the strongest attacks} for the threat model and defense considered.
 760 |   \end{itemize}
 761 | \item \S\ref{sec:releasecode} Release \textbf{pre-trained models and source code}.
 762 |   \begin{itemize}[leftmargin=*]
 763 |   \item Include a clear installation guide, including all dependencies.
 764 |   \item There is a one-line script which will classify an input example with the defense.
 765 |   \end{itemize}
 766 | \item \S\ref{sec:cleanaccuracy} Report \textbf{clean model accuracy} when not under attack.
 767 |   \begin{itemize}[leftmargin=*]
 768 |   \item For defenses that abstain or reject inputs, generate a ROC curve.
 769 |   \end{itemize}
 770 | \item \S\ref{sec:sanitycheck} Perform \textbf{basic sanity tests} on attack success rates.
 771 |   \begin{itemize}[leftmargin=*]
 772 |     \item Verify iterative attacks perform better than single-step attacks.
 773 |     \item Verify that iterative-attacks use sufficient iterations to converge. 
 774 |     \item Verify that attacks use sufficient random restarts to avoid sub-optimal local minima.
 775 |     \item Verify increasing the perturbation budget strictly increases attack success rate.
 776 |     \item With ``high'' distortion, model accuracy should reach levels of random guessing.% or even drop to zero.
 777 |   \end{itemize}
 778 | \item \S\ref{sec:100success} Generate an \textbf{attack success rate vs. perturbation budget} curve.
 779 |   \begin{itemize}[leftmargin=*]
 780 |   \item Verify the x-axis extends so that attacks eventually reach 100\% success.
 781 |   \item For unbounded attacks, report distortion and not success rate.
 782 |   \end{itemize}
 783 | \item \S\ref{sec:whitebox} Verify \textbf{adaptive attacks} perform better than any other.
 784 |   \begin{itemize}[leftmargin=*]
 785 |   \item Compare success rate on a per-example basis, rather than averaged across the dataset.
 786 |   \item Evaluate against some combination of black-box, transfer, and random-noise attacks.
 787 |   \end{itemize}
 788 | \item \S\ref{sec:describeattacks} Describe the \textbf{attacks applied}, including all hyperparameters.
 789 | 
 790 | \end{itemize}
 791 | 
 792 | \subsection{Common Pitfalls}
 793 | There are other common pitfalls that may prevent the detection of ineffective defenses.
 794 | %
 795 | This list contains some potential pitfalls which do not apply to
 796 | large categories of defenses.
 797 | %
 798 | However, if applicable, the items below are still important to carefully
 799 | check they have been applied correctly.
 800 | %
 801 | 
 802 | 
 803 | \begin{itemize}[leftmargin=*]
 804 | \item \S\ref{sec:whichattack} Apply a \textbf{diverse set of attacks} (especially when training on one attack approach).
 805 |   \begin{itemize}[leftmargin=*]
 806 |   \item Do not blindly apply multiple (nearly-identical) attack approaches.
 807 |   \end{itemize}
 808 | \item \S\ref{sec:gradientfree} Try at least one \textbf{gradient-free attack} and one \textbf{hard-label attack}.
 809 |   \begin{itemize}[leftmargin=*]
 810 |   \item Try \cite{chen2017zoo,uesato2018adversarial,ilyas2018black,brendel2017decision}.
 811 |   \item Check that the gradient-free attacks succeed less often than gradient-based attacks.
 812 |   \item Carefully investigate attack hyperparameters that affect success rate.
 813 |   \end{itemize}
 814 | \item \S\ref{sec:transfer} Perform a \textbf{transferability attack} using a similar substitute model.
 815 |   \begin{itemize}[leftmargin=*]
 816 |     \item Select a substitute model as similar to the defended model as possible.
 817 |     \item Generate adversarial examples that are initially assigned high confidence.
 818 |     \item Check that the transfer attack succeeds less often than white-box attacks.
 819 |   \end{itemize}
 820 | \item \S\ref{sec:eot} For randomized defenses, properly \textbf{ensemble over randomness}.
 821 |   \begin{itemize}[leftmargin=*]
 822 |   \item Verify that attacks succeed if randomness is assigned to one fixed value.
 823 |   \item State any assumptions about adversary knowledge of randomness in the threat model.
 824 |   \end{itemize}
 825 | \item \S\ref{sec:bpda} For non-differentiable components, \textbf{apply differentiable techniques}.
 826 |   \begin{itemize}[leftmargin=*]
 827 |   \item Discuss why non-differentiable components were necessary.
 828 |   \item Verify attacks succeed on undefended model with those non-differentiable components.
 829 |   \item Consider applying BPDA~\citep{athalye2018obfuscated} if applicable.
 830 |   \end{itemize}
 831 | \item \S\ref{sec:converge} Verify that the \textbf{attacks have converged} under the selected hyperparameters.
 832 |   \begin{itemize}[leftmargin=*]
 833 |   \item Verify that doubling the number of iterations does not increase attack success rate nor significantly change the adversarial loss.
 834 |   \item Plot attack effectiveness versus the number of iterations.
 835 |   \item Run attacks with multiple random starting points and retain the best one.
 836 |   \item Explore different choices of the step size or other attack hyperparameters.
 837 |   \end{itemize}
 838 | \item \S\ref{sec:hyperparams} Carefully \textbf{investigate attack hyperparameters} and report those selected.
 839 |   \begin{itemize}[leftmargin=*]
 840 |   \item Start search for adversarial examples at a random offset. Try multiple random starting points for each input.
 841 |   \item As for the number of attack iterations, verify that increasing the number of random restarts does not affect the attack's success rate or the adversarial loss.
 842 |   \item Investigate if attack results are sensitive to any other hyperparameters.
 843 |   \end{itemize}
 844 |   \item \S\ref{sec:priorwork} \textbf{Compare against prior work} and explain important differences.
 845 |   \begin{itemize}[leftmargin=*]
 846 |   \item When contradicting prior work, clearly explain why differences occur.
 847 |   \item Attempt attacks that are similar to those that defeated previous similar defenses.
 848 |   \item When comparing against prior work, ensure it has not been broken.
 849 |   \end{itemize}
 850 | \item \S\ref{sec:generalrobustness} Test \textbf{broader threat models} when proposing general defenses. For images:
 851 |   \begin{itemize}[leftmargin=*]
 852 |   \item Apply rotations and translations \citep{engstrom2017rotation}.
 853 |   \item Apply common corruptions and perturbations \citep{hendrycks2018benchmarking}.
 854 |   \item Add Gaussian noise of increasingly large standard deviation \citep{ford2019adversarial}.
 855 |   \end{itemize}
 856 |   
 857 | 
 858 | \end{itemize}
 859 | 
 860 | 
 861 | \subsection{Special-Case Pitfalls}
 862 | The following items apply to a smaller fraction of evaluations.
 863 | %
 864 | Items presented here are included because while
 865 | they may diagnose flaws in
 866 | some defense evaluations, they are not necessary for many others.
 867 | %
 868 | In other cases, the tests presented here help provide additional evidence that the
 869 | evaluation was performed correctly.
 870 | 
 871 | \begin{itemize}[leftmargin=*]
 872 | 
 873 | \item \S\ref{sec:provable} Investigate if it is possible to use \textbf{provable approaches}.
 874 |   \begin{itemize}[leftmargin=*]
 875 |     \item Examine if the model is amenable to provable robustness lower-bounds.
 876 |   \end{itemize}
 877 | \item \S\ref{sec:randomnoise} \textbf{Attack with random noise} of the correct norm.
 878 |   \begin{itemize}[leftmargin=*]
 879 |     \item For each example, try 10,000+ different choices of random noise.
 880 |     \item Check that the random attacks succeed less-often than white-box attacks.
 881 |   \end{itemize}
 882 | \item \S\ref{sec:targeted} Use both \textbf{targeted and untargeted attacks} during evaluation.
 883 |   \begin{itemize}[leftmargin=*]
 884 |     \item State explicitly which attack type is being used.
 885 |   \end{itemize}
 886 | \item \S\ref{sec:attacksimilar} \textbf{Perform ablation studies} with combinations of defense components removed.
 887 |   \begin{itemize}[leftmargin=*]
 888 |   \item Attack a similar-but-undefended model and verify attacks succeed.
 889 |   \item If combining multiple defense techniques, argue why they combine usefully.
 890 |   \end{itemize}
 891 | \item \S\ref{sec:benchmarkattack} \textbf{Validate any new attacks} by attacking other defenses.
 892 |   \begin{itemize}[leftmargin=*]
 893 |   \item Attack other defenses known to be broken and verify the attack succeeds.
 894 |   \item Construct synthetic intentionally-broken models and verify the attack succeeds.
 895 |   \item Release source code for any new attacks implemented.
 896 |   \end{itemize}
 897 | \item \S\ref{sec:notimages} Investigate applying the defense to \textbf{domains other than images}.
 898 |   \begin{itemize}[leftmargin=*]
 899 |   \item State explicitly if the defense applies only to images (or another domain).
 900 |   \end{itemize}
 901 |   \item \mbox{\S\ref{sec:reportmeanmin} Report \textbf{per-example attack success rate}:
 902 | $\mathop{\text{mean}}\limits_{x \in \mathcal{X}} \min\limits_{a \in \mathcal{A}} f(a(x))$, not
 903 | $\mathop{\text{min}}\limits_{a \in \mathcal{A}} \mathop{\text{mean}}\limits_{x \in \mathcal{X}} f(a(x))$.}
 904 | \end{itemize}
 905 | 
 906 | \section{Evaluation Recommendations}
 907 | 
 908 | We now expand on the above checklist and provide the rationale for each item.
 909 | 
 910 | 
 911 | \subsection{Investigate Provable Approaches}
 912 | \label{sec:provable}
 913 | 
 914 | With the exception of this subsection, all other advice in this
 915 | paper focuses on performing heuristic robustness evaluations.
 916 | %
 917 | Provable robustness approaches are preferable to only heuristic
 918 | ones.
 919 | %
 920 | Current provable approaches often can only be applied when the neural
 921 | network is explicitly designed with the objective of making these specific
 922 | provable techniques applicable \citep{kolter2017provable,raghunathan2018certified,WengZCSHBDD18}.
 923 | %
 924 | While this approach of designing-for-provability has seen excellent
 925 | progress---the best approaches today can certify some (small) robustness
 926 | even on ImageNet classifiers \citep{lecuyer2018certified}---often the best
 927 | heuristic defenses offer orders of magnitude better (estimated) robustness.
 928 | 
 929 | Proving a lower bound of defense
 930 | robustness guarantees that the robustness will never fall
 931 | below that level (if the proof is correct).
 932 | %
 933 | We believe an important direction of future research is developing approaches
 934 | that can generally prove arbitrary neural networks correct.
 935 | %
 936 | While work
 937 | in this space does exist \citep{katz2017reluplex,TjengXT19,XiaoTSM19,GowalDSBQUAMK18}, it is often computationally
 938 | intractable to verify even modestly sized neural networks.
 939 | 
 940 | One key limitation of provable techniques is that the proofs they
 941 | offer are generally only of the form ``for some \emph{specific} set of
 942 | examples $\mathcal{X}$, no adversarial example with distortion less than
 943 | $\varepsilon$ exists''.
 944 | %
 945 | While this is definitely a useful statement, it
 946 | gives no proof about any \emph{other} example $x' \not\in \mathcal{X}$;
 947 | and because this is the property that we actually care about, provable
 948 | techniques are still not provably correct in the same way that provably
 949 | correct cryptographic algorithms are provably correct.
 950 | 
 951 | 
 952 | \subsection{Report Clean Model Accuracy}
 953 | \label{sec:cleanaccuracy}
 954 | 
 955 | A defense that significantly degrades the model's accuracy on the original
 956 | task (the \emph{clean} or \emph{natural} data) may not be useful
 957 | in many situations.
 958 | %
 959 | If the probability of an actual attack is very low and the cost of an 
 960 | error on adversarial inputs is not
 961 | high, then it may be unacceptable to incur \emph{any} decrease in clean
 962 | accuracy.
 963 | %
 964 | Often there can be a difference in the impact of an error on a random
 965 | input and an error on an adversarially chosen input. To what extent this
 966 | is the case depends on the domain the system is being used in.
 967 | 
 968 | For the class of defenses that refuse to classify inputs by abstaining when
 969 | detecting that inputs are adversarial, or otherwise refuse to classify
 970 | some inputs, it is important to evaluate how this impacts accuracy on
 971 | the clean data.
 972 | %
 973 | Further, in some settings it may be acceptable to refuse to classify inputs
 974 | that have significant amount of noise. In others, while it may be acceptable
 975 | to refuse to classify adversarial examples, simple noisy inputs must still
 976 | be classified correctly.
 977 | %
 978 | It can be helpful to generate a Receiver Operating Characteristic (ROC) curve
 979 | to show how the choice of threshold for rejecting inputs causes the clean
 980 | accuracy to decrease.
 981 | 
 982 | 
 983 | \subsection{Focus on the Strongest Attacks Possible}
 984 | \label{sec:whichattack}
 985 | 
 986 | \paragraph{Use optimization-based attacks.}
 987 | Of the many different attack algorithms, optimization-based attacks
 988 | are by far the most powerful.
 989 | %
 990 | After all, they extract a significant amount of information from the model
 991 | by utilizing the gradients of some loss function and not just the predicted output.
 992 | %
 993 | In a white-box setting, there are many different attacks
 994 | that have been created, and picking almost any of them will be useful.
 995 | %
 996 | However, it is important to \emph{not} just choose an attack and apply it out-of-the-box
 997 | without modification.
 998 | %
 999 | Rather, these attacks should serve as a starting point
1000 | to which defense-specific knowledge can be applied.
1001 | 
1002 | We have found the following three attacks useful starting points for constructing
1003 | adversarial examples under different distance metrics:
1004 | \begin{itemize}
1005 |   \item For $\ell_1$ distortion, start with \cite{chen2017ead}.
1006 |   \item For $\ell_2$ distortion, start with \cite{carlini2017towards}.
1007 |   \item For $\ell_\infty$ distortion, start with \cite{madry2017towards}.
1008 | \end{itemize}
1009 | 
1010 | Note, however, that these attacks were designed to be effective on standard
1011 | neural networks; any defense which modifies the architecture, training
1012 | process, or any other aspect of the machine learning algorithm is likely to affect their performance.
1013 | %
1014 | In order to ensure that they perform reliably on a particular defense,
1015 | a certain amount of critical thinking and hyper-parameter optimization is necessary.
1016 | %
1017 | Also, remember that there is
1018 | no universally \emph{strongest} attack: each attack makes specific assumptions
1019 | about the model and so an attack that is best on one model might perform
1020 | much worse on another \citep{schott2018}.
1021 | %
1022 | It is worth considering many different attacks.
1023 | 
1024 | \paragraph{Do not use Fast Gradient Sign (exclusively).}
1025 | The Fast Gradient Sign
1026 | (FGS) attack is a simple approach to generating adversarial examples
1027 | that was proposed to demonstrate the linearity of neural networks \citep{goodfellow2014explaining}.
1028 | %
1029 | It was never intended to be a strong attack for evaluating the robustness
1030 | of a neural network, and should not be used as one.
1031 | 
1032 | Even if it were intended to be a strong attack worth defending
1033 | against, there are many defenses which achieve near-perfect success at
1034 | defending against this attack.
1035 | %
1036 | For example, a weak version of
1037 | adversarial training can defend against this attack by causing
1038 | gradient masking  \citep{tramer2017ensemble},
1039 | where locally the gradient around a given image may
1040 | point in a direction that is not useful for generating an adversarial
1041 | example.
1042 | 
1043 | While evaluating against FGS can be a component of an attack
1044 | evaluation, it should never be used as the only attack in an
1045 | evaluation.
1046 | 
1047 | At the same time, even relatively simple and efficient attacks
1048 | (DeepFool and JSMA) \citep{moosavi2016deepfool,papernot2016limitations}
1049 | can still be useful for evaluating adversarial
1050 | robustness if applied correctly \citep{jetley2018friends,carlini2016defensive}.
1051 | %
1052 | However doing so often requires more care and is more error-prone;
1053 | applying gradient-based attacks for many iterations is often simpler
1054 | despite the attacks themselves being slower.
1055 | 
1056 | 
1057 | \paragraph{Do not \emph{only} use attacks during testing that were used during training.}
1058 | One pitfall that can arise with adversarial training is that the defense
1059 | can overfit against one particular attack
1060 | used during training (e.g. by masking the gradients)  \citep{tramer2017ensemble}.
1061 | %
1062 | Using the same attack during both training and testing is dangerous
1063 | and is likely to overestimate the robustness of the defense.
1064 | %
1065 | This is still true even if the attack is perceived as the
1066 | strongest attack for the given
1067 | threat model: this is probably not true any more given that the defense
1068 | has been specifically tuned against it.
1069 | 
1070 | \paragraph{Applying many nearly-identical attacks is not useful.}
1071 | When applying a diverse set of attacks, it is critical that the attacks
1072 | are actually diverse.
1073 | %
1074 | For example, the Basic Iterative Method (BIM) (also
1075 | called i-FGSM, iterated fast gradient sign) \citep{kurakin2016adversarial}
1076 | is nearly identical to Projected Gradient Descent \citep{madry2017towards}
1077 | modulo the initial random step.
1078 | %
1079 | Applying both of these attacks is less useful than applying one of these
1080 | attacks and another (different) attack approach.
1081 | 
1082 | \subsection{Apply Gradient-Free Attacks}
1083 | \label{sec:gradientfree}
1084 | 
1085 | To ensure that the model is not causing
1086 | various forms of gradient masking, it is worth attempting
1087 | gradient-free attacks.
1088 | %
1089 | There are several such proposals. The following
1090 | three require access to model confidence values, and therefore
1091 | some forms of gradient masking (e.g., by thresholding the model
1092 | outputs) will prevent these attacks from working effectively.
1093 | %
1094 | Nevertheless, these attacks are effective under many situations
1095 | where standard gradient-based attacks fail:
1096 | \begin{itemize}
1097 |   \item ZOO \citep{chen2017zoo} (and also \citep{LiuCLS17,bhagoji2018practical,TuTCLZYHC18}) numerically estimates gradients and then
1098 | performs gradient descent, making it powerful but potentially
1099 | ineffective when the loss surface is difficult to optimize over.
1100 |   \item SPSA \citep{uesato2018adversarial} was proposed
1101 | specifically to evaluate adversarial example defenses, and has broken
1102 | many. It can operate even when the loss surface is difficult to
1103 | optimize over, but has many hyperparameters that can be difficult to tune.
1104 |   \item NES \citep{ilyas2018black} is effective at generating adversarial examples with a
1105 | limited number of queries, and so generates adversarial examples of
1106 | higher distortion. One variant of NES can handle scenarios in which only the confidence
1107 | values of the top-$k$ classes or only the label corresponding to the most confident output (see below)
1108 | are observed.
1109 | This family of approaches can be further strengthened by including and transferring over the existing priors \citep{IlyasEM18}.
1110 | \end{itemize}
1111 | 
1112 | Hard label (or ``decision-based'') attacks differ from gradient-free confidence-based attacks
1113 | in that they only require access to the \texttt{arg max} output of the
1114 | model (i.e., the label corresponding to the most confident output).
1115 | This makes these attacks much slower, as they make many more queries,
1116 | but defenses can do less to accidentally prevent these attacks.
1117 | 
1118 | \begin{itemize}
1119 |   \item The Boundary Attack \citep{brendel2017decision} is general-purpose
1120 |   hard-label attack and performs a descent along the decision boundary using a rejection sampling approach.
1121 |   In terms of the minimum adversarial distance the attack often rivals the best white-box attacks 
1122 |   but at the cost of  many more queries.
1123 | \end{itemize}
1124 | 
1125 | The most important reason to apply gradient-free attacks is as a test
1126 | for gradient masking \citep{tramer2017ensemble,athalye2018obfuscated,uesato2018adversarial}.
1127 | %
1128 | Gradient-free attacks should almost always do worse than gradient-based
1129 | attacks (see \S\ref{sec:sanitycheck}).
1130 | %
1131 | Not only
1132 | should this be true when averaged across the entire dataset, but
1133 | also on a per-example basis (see \S\ref{sec:reportmeanmin}) there
1134 | should be very few instances where gradient-based attacks fail but
1135 | gradient-free attacks succeed.
1136 | %
1137 | Thus, white-box attacks performing
1138 | worse than gradient-free attacks is strong evidence for gradient masking.
1139 | 
1140 | Just as gradient masking can fool gradient-based attack algorithms,
1141 | gradient-free attack algorithms can be similarly fooled by different
1142 | styles of defenses.
1143 | %
1144 | Today, the most common class of models that causes existing
1145 | gradient-free attacks to fail is \emph{randomized models}.
1146 | %
1147 | This may change in the future as gradient-free attacks become stronger,
1148 | but the state-of-the-art gradient-free attacks currently do not do well in this setting.
1149 | %
1150 | (Future work on this challenging problem would be worthwhile.)
1151 | 
1152 | \subsection{Perform a Transferability Analysis}
1153 | \label{sec:transfer}
1154 | 
1155 | Adversarial examples often
1156 | transfer between models \citep{papernot2016transferability}.
1157 | %
1158 | That is, an adversarial example generated
1159 | against one model with a specific architecture will
1160 | often also be adversarial on another model, even if the latter is trained on a
1161 | different training set with a different architecture.
1162 | 
1163 | By utilizing this phenomenon, \emph{transfer attacks} can be performed by considering
1164 | an alternative, substitute model and generating adversarial examples for that are strongly
1165 | classified as a wrong class by that model.
1166 | %
1167 | These examples can then be used to attack the target model.
1168 | %
1169 | Such transfer attacks are particularly successful at circumventing
1170 | defenses that rely on gradient masking.
1171 | %
1172 | Since the adversarial example is generated on an independent model, there is no
1173 | need to directly optimize over the (often poorly behaved) landscape of the defended model.
1174 | %
1175 | An adversarial example defense evaluation should thus attempt a
1176 | transferability analysis to validate that the proposed
1177 | defense breaks transferability.
1178 | %
1179 | If it does not, then it is likely to
1180 | not actually be a strong defense, but just appears like one.
1181 | 
1182 | 
1183 | What should be the source model for a transferability analysis? A good
1184 | starting point is a model as similar to the defended model as
1185 | possible, trained on the same training data.
1186 | %
1187 | If the defended model
1188 | adds some new layers to a baseline model, it would be good to try the
1189 | baseline.
1190 | %
1191 | Using the undefended baseline allows optimization-based attacks to
1192 | reliably construct high-confidence adversarial examples that are likely to
1193 | transfer to the defended model.
1194 | %
1195 | We also recommend trying a strong adversarially
1196 | trained model \citep{madry2017towards}, which has been shown to be a strong source model for
1197 | adversarial examples.
1198 | %
1199 | It is also more effective to generate
1200 | adversarial examples that fool an ensemble of source models, and then
1201 | use those to transfer to a target model \citep{liu2016delving}.
1202 | 
1203 | \subsection{Properly Ensemble over Randomness}
1204 | \label{sec:eot}
1205 | 
1206 | For defenses that randomize aspects of neural network inference,
1207 | it is important to properly generate adversarial examples by
1208 | ensembling over the randomness of the defense.
1209 | %
1210 | The randomness introduced might make it difficult to apply standard attacks
1211 | since the output of the classifier as well the loss gradients used for
1212 | optimization-based attack now become stochastic.
1213 | %
1214 | It is often necessary to repeat each step of a standard attack multiple times
1215 | to obtain reliable estimates.
1216 | %
1217 | Thus, by considering multiple different choices of randomness, one can generate
1218 | adversarial examples that are still for a new choice of randomness.
1219 | 
1220 | Defenses should be careful that relying on an exponentially large
1221 | randomness spaces may not actually make attacks exponentially
1222 | more difficult to attack.
1223 | %
1224 | It is often the case that one can construct adversarial examples that are resistant
1225 | to such randomness by simply constructing examples that are consistently
1226 | adversarial over a moderate number of randomness choices.
1227 | 
1228 | \paragraph{Verify that attacks succeed if randomness is fixed.}
1229 | %
1230 | If randomness is believed to be an important reason why the defense
1231 | is effective, it can be useful to sample one value of randomness and
1232 | use only that one random choice.
1233 | %
1234 | If the attacks fail even when the randomness is disabled in this
1235 | way, it is likely that the attack is not working correctly and
1236 | should be fixed.
1237 | %
1238 | Once the attack succeeds with randomness
1239 | completely disabled, it can be helpful to slowly re-enable the
1240 | randomness and verify that attack success rate begins to decrease.
1241 | 
1242 | \subsection{Approximate Non-Differentiable Layers}
1243 | \label{sec:bpda}
1244 | 
1245 | Some defenses include non-differentiable layers as pre-processing
1246 | layers, or later modify internal layers to make them non-differentiable
1247 | (e.g., by performing quantization or adding extra randomness).
1248 | %
1249 | Doing this makes the defense much harder to evaluate completely:
1250 | gradient masking becomes much more likely when some layers are
1251 | non-differentiable, or have difficult-to-compute gradients.
1252 | 
1253 | In many cases, it can happen that one accidentally introduces non-differentiable
1254 | layers \citep{athalye2018obfuscated} (e.g., by introducing components
1255 | that, while differentiable, are not usefully-differentiable) or layers that
1256 | cause gradients to vanish to zero (e.g., by saturating
1257 | activation functions \citep{brendel2017comment,carlini2016defensive}).
1258 | 
1259 | In these cases it can be useful to create a
1260 | differentiable implementation of the layer (if possible) to
1261 | obtain gradient information.
1262 | %
1263 | In other case, applying BPDA \citep{athalye2018obfuscated} can help
1264 | with non-differentiable layers.
1265 | %
1266 | For example, if a defense implements a pre-processing layer that
1267 | denoises the input and has the property that $\text{denoise}(x) \approx x$,
1268 | it can be effective to approximate its gradient as the identity
1269 | function on the backward pass, but compute the function exactly
1270 | on the forward pass.
1271 | 
1272 | 
1273 | \subsection{Verify Attack Convergence}
1274 | \label{sec:converge}
1275 | 
1276 | On at least a few inputs, confirm
1277 | that the number of iterations of gradient descent being performed is
1278 | sufficient by plotting the attack success and adversarial loss versus the number of
1279 | iterations.
1280 | %
1281 | These plots should eventually plateau; if not, the attack has not converged and more 
1282 | iterations can be used until it does. Plotting the adversarial loss is often more useful 
1283 | than plotting the attack success rate, as the former has a finer granularity and may reveal 
1284 | that the attack is still making progress even though the success rate seems to have plateaued.
1285 | The number of iterations necessary is generally inversely proportional to step size and
1286 | proportional to distortion allowed.
1287 | %
1288 | Dataset complexity is often related.
1289 | 
1290 | For example, on
1291 | CIFAR-10 or ImageNet with a maximum $\ell_\infty$ distortion of 8/255,
1292 | white-box optimization attacks generally converge in under 100 or 1000
1293 | iterations with a step size of 1.
1294 | %
1295 | However, black-box attacks often take orders of magnitude more queries,
1296 | with attacks requiring over 100,000 queries not being abnormal.
1297 | %
1298 | For a different dataset, or a different
1299 | distortion metric, attacks will require a different number of iterations.
1300 | %
1301 | While it is
1302 | possible to perform an attack correctly with very few iterations
1303 | of gradient descent, it requires much more thought and care \citep{engstrom2018evaluating}.
1304 | %
1305 | For example, in some cases even a million iterations of white-box gradient descent
1306 | have been found necessary \citep{qian2018l2}.
1307 | 
1308 | There are few reasonable threat models under which an attacker can
1309 | compute 100 iterations of gradient descent, but not 1000. Similarly, as stated 
1310 | in~\cite{athalye2018obfuscated}, increasing the time it takes an adversary to find an adversarial 
1311 | example from one second to ten seconds does typically not constitute an increase in robustness. 
1312 | Of course, if a future defense manages to provide strong (and preferably provable) lower bounds 
1313 | on the  computational cost of an attack, this might still be interesting if these lower bounds are 
1314 | prohibitively large (e.g., the attack requires at least $2^{30}$ gradient evaluations).
1315 | %
1316 | In general, because the threat model must not constrain the approach an
1317 | attacker takes, it is worth evaluating against a strong attack
1318 | with many iterations of gradient descent, as well as many different random restarts 
1319 | (to avoid converging to sub-optimal local minima~\citep{madry2017towards,mosbach2018logit}).
1320 | 
1321 | \paragraph{Ensure doubling attack iterations does not increase attack success rate or adversarial loss.}
1322 | %
1323 | Because choosing any fixed bounded number of iterations can be difficult to
1324 | generalize, one simple and useful method for determining if ``enough'' iterations
1325 | have been performed is to try doubling the number of iterations and check if this
1326 | improves on the adversarial examples generated.
1327 | %
1328 | This single test is not the only method to select the number of iterations,
1329 | but once some value has been selected, doubling the number of iterations
1330 | can be a useful test that the number selected is probably sufficient.
1331 | 
1332 | 
1333 | 
1334 | \subsection{Carefully Investigate Attack Hyperparmeters}
1335 | \label{sec:hyperparams}
1336 | 
1337 | 
1338 | \paragraph{Start search from random offsets.}
1339 | %
1340 | When performing iterative attacks,
1341 | it is useful to begin the search for adversarial examples at random
1342 | offsets away from the initial, source input.
1343 | %
1344 | This can reduce the
1345 | distortion required to generate adversarial examples by 10\%
1346 | \citep{carlini2017towards}, and help avoid causing gradient masking
1347 | \citep{tramer2017ensemble,madry2017towards}.
1348 | 
1349 | Moreover, even if the attack uses sufficient iterations to converge, repeating the attack with 
1350 | multiple randomly chosen starting points decreases the likelihood of ending up in a 
1351 | sub-optimal local minima~\citep{madry2017towards,mosbach2018logit}.
1352 | %
1353 | Doing this
1354 | every time is not necessary, but doing it at least once to verify it
1355 | does not significantly change results is worthwhile.
1356 | %
1357 | Prior work has found that using at least 10 random starting points is effective.
1358 | 
1359 | As for verifying attack convergence, a good test would be to check that doubling the
1360 | number of random restarts (e.g., from $10$ to $20$) does not affect the attack's 
1361 | success rate and adversarial loss.
1362 | 
1363 | \paragraph{Carefully select attack hyperparameters.}
1364 | %
1365 | Many attacks support a wide range of hyperparameters.
1366 | %
1367 | Different settings of these parameters
1368 | can make order-of-magnitude differences in attack success rates.
1369 | %
1370 | Fortunately, most hyperparameters have the property that moving them in
1371 | one direction strictly increases attack power (e.g., more iterations of
1372 | gradient descent is typically better).
1373 | %
1374 | When in doubt, chose a stronger set of hyperparameters.
1375 | 
1376 | All hyperparameters and the way in which they were determined should be reported.
1377 | %
1378 | This allows others to reproduce the results and helps to
1379 | understand how much the attacks
1380 | have been adapted to the defense.
1381 | 
1382 | \subsection{Test General Robustness for General-Purpose Defenses}
1383 | \label{sec:generalrobustness}
1384 | Some defenses are designed not to be robust against any one specific
1385 | type of attack, but to generally \emph{be robust}.
1386 | %
1387 | This is in contrast to other defenses, for example adversarial training
1388 | \citep{goodfellow2014explaining,madry2017towards}, which explicitly set out
1389 | to be robust against one specific threat model (e.g., $l_\infty$ adversarial
1390 | robustness).
1391 | 
1392 | When a defense is arguing that it generally improves robustness, it can
1393 | help to also verify this fact with easy-to-apply attacks.
1394 | %
1395 | Work in this space exists mainly on images; below we give three
1396 | good places to start, although they are by no means the only
1397 | that are worth applying.
1398 | %
1399 | \begin{itemize}
1400 | \item Transform the image with a random rotation and translation
1401 |   \citep{engstrom2017rotation}. This attack can be performed
1402 |   brute-force and is thus not susceptible to gradient masking.
1403 | \item Apply common corruptions and perturbations
1404 |   \citep{hendrycks2018benchmarking} that mimic changes that may actually
1405 |   occur in practice.
1406 | \item Add Gaussian noise with increasingly large standard deviation
1407 |   \citep{ford2019adversarial}. Adversarially robust models tend to be more resistant to random noise compared to their standard counterparts
1408 |   \citep{fawzi2016robustness}.
1409 | \end{itemize}
1410 | In all cases, if these tests are used as a method for evaluating general-purpose
1411 | robustness it is important to \emph{not train on them directly}: doing so
1412 | would counter their intended purpose.
1413 | 
1414 | \subsection{Try Brute-Force (Random) Search Attacks}
1415 | \label{sec:randomnoise}
1416 | 
1417 | A very simple sanity check to ensure that the attacks have not
1418 | been fooled by the defense is trying random search to
1419 | generate adversarial examples within the threat model.
1420 | %
1421 | If brute-force random sampling
1422 | identifies adversarial examples that other methods haven't found, this
1423 | indicates that other attacks can be improved.
1424 | 
1425 | We recommend starting by sampling random points at a large distance from the original input.
1426 | %
1427 | Every time
1428 | an adversarial example is found, limit the search to adversarial
1429 | examples of strictly smaller distortion.
1430 | %
1431 | We recommend verifying a
1432 | hundred instances or so with a few hundred thousand random samples.
1433 | 
1434 | \subsection{Targeted and Untargeted Attacks}
1435 | \label{sec:targeted}
1436 | In theory, an untargeted attack is strictly easier than a targeted attack.
1437 | %
1438 | However, in practice, there can be cases where targeting any of the $N-1$ classes
1439 | will give superior results to performing one untargeted attack.
1440 | 
1441 | At the implementation level, many untargeted attacks work by \emph{reducing}
1442 | the confidence in the correct prediction, while targeted attacks work by
1443 | \emph{increasing} the confidence in some other prediction.
1444 | %
1445 | Because these two formulations are not directly inverses of each other,
1446 | trying both can be helpful.
1447 | 
1448 | \subsection{Attack Similar-but-Undefended Models}
1449 | \label{sec:attacksimilar}
1450 | 
1451 | Defenses typically are implemented by making a series of changes
1452 | to a base model.
1453 | %
1454 | Some changes are introduced in order to increase
1455 | the robustness, but typically other changes are also introduced to
1456 | counteract some unintended consequence of adding the defense components.
1457 | %
1458 | In these cases, it can be useful to remove all of the defense
1459 | components from the defended model and attack a model with only the
1460 | added non-defense components remaining.
1461 | %
1462 | If the model still appears robust with these components not intended
1463 | to provide security, it is likely the attack is being artificially
1464 | broken by those non-defense components.
1465 | 
1466 | Similarly, if the defense has some tunable constants where changing
1467 | (e.g., by increasing)
1468 | the constant is believed to make generating adversarial examples
1469 | harder, it is important to show that when the constant is not
1470 | correctly set (e.g., by decreasing it) the model is vulnerable to
1471 | attack.
1472 | 
1473 | \subsection{Validate any new attack algorithms introduced}
1474 | \label{sec:benchmarkattack}
1475 | 
1476 | Often times it can be necessary to introduce a new attack algorithm
1477 | tailored to some specific defense.
1478 | %
1479 | When doing so, it is important
1480 | to carefully evaluate the effectiveness of this new attack algorithm.
1481 | %
1482 | It is unfortunately
1483 | easy to design an attack that will never be effective, regardless
1484 | of the model it is attacking.
1485 | 
1486 | Therefore, when designing a new attack, it is important to validate
1487 | that it is indeed effective.
1488 | %
1489 | This can be done by selecting alternate
1490 | models that are either known to be insecure or are intentionally
1491 | designed to be insecure, and verify that the new attack algorithm
1492 | can effectively break these defenses.
1493 | 
1494 | 
1495 | \section{Analysis Recommendations}
1496 | \label{sec:analysis}
1497 | 
1498 | After having performed an evaluation (consisting of at least 
1499 | some of the above
1500 | steps), there are several simple checks that will help
1501 | identify potential flaws in the attacks that should be corrected.
1502 | 
1503 | \subsection{Compare Against Prior Work}
1504 | \label{sec:priorwork}
1505 | 
1506 | Given the extremely large quantity of defense work in the space of
1507 | adversarial examples, it is highly unlikely that any idea is completely
1508 | new and unrelated to any prior defense.
1509 | %
1510 | Although it can be time-consuming, it is important to review prior
1511 | work and look for approaches that are similar to the new defense
1512 | being proposed.
1513 | 
1514 | This is especially important in security because any attacks which
1515 | were effective on prior similar defenses are likely to still
1516 | be effective.
1517 | %
1518 | It is therefore even more important to review not
1519 | only the prior defense work, but also the prior attack work to ensure
1520 | that all known attack approaches have been considered.
1521 | %
1522 | An unfortunately large number of defenses have been defeated by
1523 | applying existing attacks unmodified.
1524 | 
1525 | \paragraph{Compare against true results.}
1526 | %
1527 | When comparing against prior
1528 | work, it is important to report the accuracy of prior defenses under
1529 | the strongest attacks on these models.%
1530 | If a defense claimed that its
1531 | accuracy was $99\%$ but followup work reduced its accuracy to $80\%$,
1532 | future work should report the accuracy at $80\%$, and \textbf{not} $99\%$.
1533 | %
1534 | The original result is wrong.
1535 | 
1536 | There are many examples of defenses building on prior work which
1537 | has since been shown to be completely broken; performing a literature
1538 | review can help avoid situations of this nature. Websites such
1539 | as \texttt{RobustML}%
1540 | \footnote{\url{https://www.robust-ml.org/}},
1541 | are explicitly designed to help track defense progress.
1542 | 
1543 | \subsection{Perform Basic Sanity Tests}
1544 | \label{sec:sanitycheck}
1545 | 
1546 | \paragraph{Verify iterative attacks perform better than single-step attacks.}
1547 | Iterative attacks are strictly more powerful than single-step
1548 | attacks, and so their results should be strictly superior.
1549 | %
1550 | If an
1551 | iterative attack performs worse than a single-step attack, this often
1552 | indicates that the iterative attack is not correct.
1553 | 
1554 | If this is the case, one useful diagnostic test is to plot attack
1555 | success rate versus number of attack iterations averaged over many
1556 | attempts, to try and identify if there is a pattern.
1557 | %
1558 | Another diagnostic is to plot the model loss versus the number of attack
1559 | iterations for a single input.
1560 | %
1561 | The model loss should be (mostly) decreasing at each iteration.
1562 | %
1563 | If this is not the case, a smaller step size may be helpful.
1564 | 
1565 | Another issue could be that the iterative attack is not converging properly.  
1566 | A useful test is to verify that increasing the number of iterations or the 
1567 | number of random restarts have only marginal effect on the attack success 
1568 | rate and model loss.
1569 | 
1570 | \paragraph{Verify increasing the perturbation budget strictly
1571 | increases attack success rate.}
1572 | \label{sec:monodistortion}
1573 | Attacks that allow more distortion are strictly stronger
1574 | than attacks that allow less distortion.
1575 | %
1576 | If the attack success rate
1577 | ever decreases as the amount of distortion allowed increases, the
1578 | attack is likely flawed.
1579 | 
1580 | \paragraph{With ``high'' distortion, model accuracy should reach
1581 |   levels of random guessing.}
1582 | %
1583 | Exactly what ``high'' means will depend on the dataset.
1584 | %
1585 | On some datasets (like MNIST) even with noise bounded by an
1586 | $\ell_\infty$ norm of $0.3$, humans can often determine what
1587 | the digit was.
1588 | %
1589 | However, on CIFAR-10, noise with an $\ell_\infty$ norm of $0.2$ makes most
1590 | images completely unrecognizable.
1591 | 
1592 | Regardless of the dataset, there are some accuracy-vs-distortion
1593 | numbers that are theoretically impossible.
1594 | %
1595 | For example, it is not
1596 | possible to do better than random guessing with a $\ell_\infty$
1597 | distortion of $0.5$: any image can be converted into a solid gray
1598 | picture.
1599 | %
1600 | Or, for MNIST, the median $L_2$ distortion between a
1601 | digit and the most similar other image with a different label
1602 | is less than $9$, so claiming higher than this is not feasible.
1603 | 
1604 | \subsection{Generate An Accuracy versus Perturbation Curve}
1605 | \label{sec:100success}
1606 | 
1607 | One of the most useful diagnostic curves to generate is an accuracy versus perturbation budget
1608 | curve, along with an attack success rate versus perturbation budget curve.
1609 | %
1610 | These curves can help perform many of the sanity tests discussed in \S\ref{sec:sanitycheck}.
1611 | 
1612 | For some attack which produces minimally-distorted adversarial
1613 | examples~\citep{carlini2017towards} (as opposed to maximizing loss
1614 | under some norm constraint \citep{madry2017towards})
1615 | generating these curves is computationally efficient
1616 | (adversarial examples can be sorted by distance and then generating these curve
1617 | can be accomplished by sweeping a threshold constant).
1618 | %
1619 | For attacks that maximize loss given a fixed budget \citep{madry2017towards},
1620 | generating this curve naively requires calling the attack once for each value
1621 | of the perturbation budget, but this can be made more efficient by performing
1622 | binary search on the budget on a per-example basis.
1623 | 
1624 | 
1625 | \paragraph{Perform an unbounded attack.}
1626 | With unbounded
1627 | distortion, any attack should eventually reach 100\% success, even if only
1628 | by switching the input to actually be an instance of the other class.
1629 | %
1630 | If unbounded attacks do not succeed, this indicates that the attack is
1631 | being applied incorrectly.
1632 | %
1633 | The curve generated should have an x-axis with sufficient perturbation
1634 | so that it reaches 100\% attack success (0\% model accuracy).
1635 | 
1636 | \paragraph{For unbounded attacks, measure distortion, not success rate.}
1637 | %
1638 | The correct metric for evaluating unbounded attacks is the distortion
1639 | required to generate an adversarial example, not the success rate
1640 | (which should always be 100\%).
1641 | %
1642 | The most useful plot is to show success rate (or model accuracy) vs. distortion,
1643 | and the most useful single number is
1644 | either the mean or median distance to adversarial examples, when using
1645 | unbounded attacks.
1646 | %
1647 | To make measuring success rate meaningful, another
1648 | option is to artificially bound the attack and report any adversarial
1649 | example of distance greater than some threshold a failure (as long as
1650 | this threshold is stated).
1651 | 
1652 | \subsection{Verify white-box attacks perform better than black-box attacks}
1653 | \label{sec:whitebox}
1654 | Because white-box attacks are a strict super-set of black-box
1655 | attacks, they should perform strictly better.
1656 | %
1657 | In particular, this implies that gradient-based attacks should, in principle, outperform gradient-free attacks.
1658 | %
1659 | Gradient-free attacks doing
1660 | better often indicates that the defense is somehow masking gradient
1661 | information and that the gradient-based attack could be
1662 | improved to succeed more often.
1663 | 
1664 | When this happens, look at instances for which adversarial
1665 | examples can be found with black-box but not white-box attacks.
1666 | %
1667 | Check if these instances are at all related, and investigate why white-box
1668 | attacks are not finding these adversarial examples.
1669 | 
1670 | \subsection{Investigate Domains Other Than Images}
1671 | \label{sec:notimages}
1672 | 
1673 | Adversarial examples are not just a problem for image classification,
1674 | they are also a problem on sentiment analysis, translation,
1675 | generative models, reinforcement learning, audio classification,
1676 | and segmentation analysis (among others).
1677 | %
1678 | If a defense is limited to one domain, it should state this
1679 | fact explicitly.
1680 | %
1681 | If not, then it would be useful to briefly consider at least one
1682 | non-image domain to investigate if the technique could apply
1683 | to other domains as well.
1684 | %
1685 | Audio has properties most similar to images (high dimensional
1686 | mostly-continuous input space) and is therefore an easy point
1687 | of comparison.
1688 | %
1689 | Language processing is much different (due to the
1690 | inherently discrete input space) and therefore defenses are
1691 | often harder to apply to this alternate domain.
1692 | 
1693 | \subsection{Report the Per-Example Attack Success Rate}
1694 | \label{sec:reportmeanmin}
1695 | 
1696 | When evaluating attacks, it is important to report attack success rate
1697 | on a per-example
1698 | basis instead of averaged over the entire attack.
1699 | %
1700 | That is, report
1701 | $\mathop{\text{mean}}\limits_{x \in \mathcal{X}} \min\limits_{a \in \mathcal{A}} f(a(x))$, not
1702 | $\mathop{\text{min}}\limits_{a \in \mathcal{A}} \mathop{\text{mean}}\limits_{x \in \mathcal{X}} f(a(x))$.
1703 | 
1704 | This per-example reporting is strictly more useful than a per-attack reporting,
1705 | and is therefore preferable.
1706 | %
1707 | This is true despite the fact that in practice the results for the \emph{worst} attack
1708 | is often very close to the true per-example worst-case bounds.
1709 | 
1710 | 
1711 | This reporting also avoids another common pitfall in which two defenses A and B are compared
1712 | on different attacks X and Y leading to statements such as ``A is more robust than B against attack X
1713 | while B is more robust than A on attack Y''.
1714 | %
1715 | Such results are not useful, even more so if one attack
1716 | is strong and the other is weak.
1717 | %
1718 | Instead, by comparing defenses on
1719 | a per-example based (i.e. the optima per example over all attacks),
1720 | one defense can be selected as being stronger.
1721 | %
1722 | (Note this still requires the defenses are evaluated \emph{under the same threat model}.
1723 | %
1724 | It is not meaningful to compare a $\ell_\infty$-robust defense against a $\ell_0$-robust
1725 | defense.)
1726 | 
1727 | \subsection{Report Attacks Applied}
1728 | \label{sec:describeattacks}
1729 | After having performed a complete evaluation, as discussed above, defense
1730 | evaluations should report all attacks and
1731 | include the relevant hyperparameters.
1732 | %
1733 | It is especially important to report the number of iterations for optimization-based
1734 | attacks.
1735 | 
1736 | Note that it is not necessary to report the results from every
1737 | single attack in all details, especially if 
1738 | an attack is highly similar to another and yields similar results.
1739 | %
1740 | In this case the space can
1741 | rather be used to describe how the attacks were applied.
1742 | %
1743 | A good guideline is as follows: do
1744 | not show a massive table with many similar attacks that
1745 | have not been adapted to the defense.
1746 | %
1747 | Instead, rather show a shorter table with a more
1748 | diverse set of attacks that have each been
1749 | carefully adapted and tuned.
1750 | 
1751 | 
1752 | \section{Conclusion}
1753 | 
1754 | Evaluating adversarial example defenses requires extreme caution
1755 | and skepticism of all results obtained.
1756 | %
1757 | Researchers must be very careful
1758 | to not deceive themselves unintentionally when performing
1759 | evaluations.
1760 | 
1761 | This paper lays out the motivation for how to perform defense evaluations
1762 | and why we believe this is the case.
1763 | %
1764 | We develop a collection of recommendations
1765 | that we have identified as common flaws in adversarial
1766 | example defense evaluations.
1767 | %
1768 | We hope that this checklist will be
1769 | useful both for researchers developing novel defense approaches as
1770 | well as for readers and reviewers to understand if an evaluation is thorough
1771 | and follows currently accepted best practices.
1772 | %
1773 | 
1774 | We do not intend for this paper to be the definitive
1775 | answer to the question ``\emph{what
1776 | experiments should an evaluation contain?}''. Such a comprehensive list would be
1777 | impossible to develop.
1778 | %
1779 | Rather, we hope that
1780 | the recommendations we offer here will help inspire researchers
1781 | with ideas for how to perform their own evaluations.
1782 | %
1783 | \footnote{We believe that even if all defenses followed all of the
1784 | tests recommended in this paper, it would still be valuable
1785 | for researchers to perform re-evaluations of proposed defenses
1786 | (for example, by following the advice given in this paper and investigating
1787 | new, adaptive attacks).
1788 | %
1789 | For every ten defense papers on arXiv, there is just one paper which sets
1790 | out to re-evaluate previously proposed defenses.
1791 | %
1792 | Given the difficulty of performing
1793 | evaluations, it is always useful for future work to perform additional
1794 | experiments to validate or refute the claims made in existing papers.}
1795 | 
1796 | In total, we firmly believe that developing robust machine learning models is
1797 | of great significance, and hope that this document will in some way help
1798 | the broader community reach this important goal.
1799 | 
1800 | \section*{Acknowledgements}
1801 | We would like to thank Catherine Olsson and {\'U}lfar Erlingsson
1802 | for feedback on an early draft of this paper, and David Wagner for
1803 | helpful discussions around content.
1804 | 
1805 | \bibliography{paper}
1806 | \bibliographystyle{iclr2019_conference}
1807 | 
1808 | 
1809 | \end{document}
1810 | 


--------------------------------------------------------------------------------
/version.sh:
--------------------------------------------------------------------------------
 1 | exec 2>/dev/null
 2 | 
 3 | version=$(git rev-parse --short HEAD)
 4 | 
 5 | if ! [ $? -eq 0 ]; then
 6 |     version="unknown"
 7 | fi
 8 | 
 9 | version="\\newcommand*{\\version}{$version}"
10 | 
11 | version_old=$(cat $1)
12 | 
13 | if [ ! "$version" = "$version_old" ]; then
14 |     printf '%s' "$version" > $1
15 | fi
16 | 


--------------------------------------------------------------------------------