├── .gitignore ├── README.md ├── reference_coding_standards.adoc ├── license_example.cpp ├── license_example_DRAFT.py ├── CONTRIBUTING.md ├── TERMS OF USE.md ├── LICENSE.md ├── archive └── inference_rules_pre_move └── submission_rules.adoc /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | This repo contains MLPerf policies, e.g., rules for submitting benchmark results for training and inference (see inference/training policies for the specific rules around inference/training). 2 | -------------------------------------------------------------------------------- /reference_coding_standards.adoc: -------------------------------------------------------------------------------- 1 | :toc: 2 | :toclevels: 4 3 | 4 | :sectnums: 5 | 6 | # General MLPerf Reference Coding Standards v0.1 7 | 8 | :TOC: 9 | 10 | ## Basics 11 | 12 | The official Python version is 3.x. 13 | -------------------------------------------------------------------------------- /license_example.cpp: -------------------------------------------------------------------------------- 1 | /* Copyright 2018 The MLPerf Authors. All Rights Reserved. 2 | Licensed under the Apache License, Version 2.0 (the "License"); 3 | you may not use this file except in compliance with the License. 4 | You may obtain a copy of the License at 5 | http://www.apache.org/licenses/LICENSE-2.0 6 | Unless required by applicable law or agreed to in writing, software 7 | distributed under the License is distributed on an "AS IS" BASIS, 8 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 9 | See the License for the specific language governing permissions and 10 | limitations under the License. 11 | ==============================================================================*/ 12 | -------------------------------------------------------------------------------- /license_example_DRAFT.py: -------------------------------------------------------------------------------- 1 | # Copyright 2018 The MLPerf Authors. All Rights Reserved. 2 | # 3 | # Licensed under the Apache License, Version 2.0 (the "License"); 4 | # you may not use this file except in compliance with the License. 5 | # You may obtain a copy of the License at 6 | # 7 | # http://www.apache.org/licenses/LICENSE-2.0 8 | # 9 | # Unless required by applicable law or agreed to in writing, software 10 | # distributed under the License is distributed on an "AS IS" BASIS, 11 | # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | # See the License for the specific language governing permissions and 13 | # limitations under the License. 14 | # ============================================================================= 15 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing guidelines 2 | 3 | ## Pull Request Checklist 4 | 5 | Before sending your pull requests, make sure you followed this list. 6 | 7 | - Read [contributing guidelines](CONTRIBUTING.md) 8 | - Ensure you have signed the [Contributor License Agreement (CLA)](https://cla.developers.google.com/). 9 | - (Note: additional technical details TBD by community.) 
10 | 11 | ## How to become a contributor and submit your own code 12 | 13 | ### Contributor License Agreements 14 | 15 | We'd love to accept your patches! Before we can take them, we have to jump a couple of legal hurdles. 16 | 17 | Please fill out either the individual or corporate Contributor License Agreement (CLA). 18 | 19 | * If you are an individual writing original source code and you're sure you own the intellectual property, then you'll need to sign an [individual CLA](https://code.google.com/legal/individual-cla-v1.0.html). 20 | * If you work for a company that wants to allow you to contribute your work, then you'll need to sign a [corporate CLA](https://code.google.com/legal/corporate-cla-v1.0.html). 21 | 22 | Follow either of the two links above to access the appropriate CLA and instructions for how to sign and return it. Once we receive it, we'll be able to accept your pull requests. 23 | 24 | ***NOTE***: Only original source code from you and other people that have signed the CLA can be accepted into the main repository. (Note: we need to modify this to allow third party code under Apache2 or MIT license with additional review.) 25 | 26 | ### Contributing code 27 | 28 | If you have improvements to MLPerf, send us your pull requests! For those 29 | just getting started, Github has a [howto](https://help.github.com/articles/using-pull-requests/). 30 | 31 | ### Contribution guidelines and standards 32 | 33 | (Note: Technical details TBD by community.) 34 | 35 | #### General guidelines and philosophy for contribution 36 | 37 | (Note: Technical details TBD by community.) 38 | 39 | #### License 40 | 41 | Include a license at the top of new files. 42 | 43 | * [C/C++ license example](https://github.com/mlperf/policies/blob/master/license_example_DRAFT.cpp) 44 | * [Python license example](https://github.com/mlperf/policies/blob/master/license_example_DRAFT.py) 45 | 46 | #### C++ coding style 47 | 48 | (Note: Technical details TBD by community.) 49 | 50 | #### Python coding style 51 | 52 | (Note: Technical details TBD by community.) 53 | 54 | #### Running sanity check 55 | 56 | (Note: Technical details TBD by community.) 57 | 58 | #### Running unit tests 59 | 60 | (Note: Technical details TBD by community.) 61 | -------------------------------------------------------------------------------- /TERMS OF USE.md: -------------------------------------------------------------------------------- 1 | # MLPerf Terms of Use 2 | 3 | These are the Terms of Use for MLPerf, including benchmark results and the MLPerf.org website. 4 | 5 | ## MLPerf name and logo are trademarks; use must conform to this document 6 | 7 | The MLPerf name and logo are trademarks. Any use of the MLPerf trademark must conform to these terms of use. The MLPerf organization reserves the right to solely determine if a use of its name or logo is acceptable. If you use a benchmark result in a manner referencing an MLPerf trademark, you must conform to the following guidelines. 8 | 9 | ## MLPerf results must clearly identify basic details in the main text, table, or figure 10 | 11 | Any use of results must clearly identify the following in the main text, table, or figure: submitting organization, benchmark name, and system under test. For example: 12 | 13 | _SmartAI Corp achieved a score of 0.6 on the MLPerf Image Classification benchmark using a SmartCluster._ 14 | 15 | For Closed Division benchmarks, the model name may be used instead of the benchmark name, e.g. "SSD" instead of "Object Detection (light-weight)". 
16 | 17 | ## MLPerf results must include a detailed footnote 18 | 19 | Any use of results must include the following in a footnote: benchmark suite, version, and division; benchmark name and scenario if applicable; date and source of retrieval; MLPerf result ID (major-version.minor-version.entry.benchmark); and a clear reference to the MLPerf trademark. For example: 20 | 21 | _[1] MLPerf v0.5 Inference Closed ResNet-v1.5 offline. Retrieved from www.mlperf.org 21 December 2018, entry 0.5-12. MLPerf name and logo are trademarks. See www.mlperf.org for more information._ 22 | 23 | ## MLPerf results may only be compared against similar MLPerf results 24 | 25 | Whether comparing official results or unverified results, comparisons must be made between results of the same benchmark and scenario from compatible versions of an MLPerf benchmark. Compatible versions are determined by the MLPerf organization. MLPerf results may not be compared against non-MLPerf results. 26 | 27 | MLPerf Training v0.5 and v0.6 are not directly compatible and should not be compared between submitters. A given system’s v0.5 and v0.6 submissions may be compared with each other provided that the base hardware is the same and the comparisons are done with sufficient analysis to remove the influence of benchmark changes such as overheads and quality targets. 28 | 29 | ## When comparing MLPerf results, you must identify any benchmark differences 30 | 31 | When comparing results, the main text, table, or figure must clearly identify any difference in version, division, category, official or unverified status, or chip count. When comparing Open and Closed Division results, any ways in which the Open result would not qualify as a Closed result must be identified. 32 | 33 | _SmartAI Corp achieved a score of 0.6 on the MLPerf Image Classification benchmark using a SmartCluster with 8 chips in the Open Division with an accuracy that is only 0.01% lower than the Closed requirement, which is faster than the result of 7.2 achieved by LessSmartAI Corp with 16 chips in the Closed Division._ 34 | 35 | ## Official results must be clearly distinguished from unofficial results 36 | 37 | You may cite either official results obtained from the MLPerf results page or unofficial results measured independently. If you cite an unofficial result, you must clearly specify that the result is “Unverified” in the text and clearly state “Result not verified by MLPerf” in a footnote. The result must comply with the letter and spirit of the relevant MLPerf rules. For example: 38 | 39 | _SmartAI Corp announced an unverified score of 0.3 on the MLPerf Image Classification benchmark using a SmartCluster running MLFramework v4.1 [1]._ 40 | 41 | _[1] MLPerf v0.5 Training ResNet-v1.5; Result not verified by MLPerf. MLPerf name and logo are trademarks. See www.mlperf.org for more information._ 42 | 43 | ## MLPerf allows but does not endorse combining results of benchmarks 44 | 45 | Users may see fit to combine or aggregate results from multiple MLPerf benchmark tests and/or other third-party results. If publicly disclosed, these composite results must cite MLPerf as required above and clearly describe the method of combination. However, the composite result is not sanctioned by MLPerf and may not be represented as an official MLPerf result or score. 46 | 47 | ## Comparisons based on secondary or derived metrics must be explicit 48 | 49 | Each MLPerf benchmark has a primary metric, for instance time-to-train for Training Image Classification.
Any comparison based on different or derived metric such as power, cost, model size/architecture, accuracy, etc. must make the basis for comparison clear in the text and in a footnote. Secondary and derived metrics must not be presented as official MLPerf metrics. 50 | 51 | _Prestigious Research University has created a new neural network model called MagicEightBall that is 52 | 100% accurate for Top-1 image classification on the MLPerf v0.5 Training Open Division Image Classification benchmark using a cluster of 10 SmartChips running MLFramework v4.1 [1]._ 53 | 54 | _[1] Accuracy is not the primary metric of MLPerf. MLPerf name and logo are registered trademarks. See www.mlperf.org for more information._ 55 | 56 | ## The MLPerf website is hosted by Google; the following terms apply 57 | 58 | In addition to the [Google Terms of Service](https://policies.google.com/terms?hl=en) and [Privacy Policy](https://policies.google.com/privacy?hl=en), the following terms will apply to your use of the MLPerf website (the “Website”), service (the “Service), and software (the “Software”). If you do not accept these terms and conditions in full, you must terminate the use of the Website, Service, and Software immediately. 59 | THE SERVICE AND SOFTWARE ARE PROVIDED TO YOU "AS IS" WITHOUT ANY WARRANTY OR IMPLIED WARRANTY. NEITHER MLPERF NOR ITS PARENT, SUBSIDIARIES, OR AFFILIATES (“MLPERF”) TAKE ANY RESPONSIBILITY FOR ANY DIFFICULTIES YOU MAY ENCOUNTER WITH THE SERVICE OR SOFTWARE. MLPERF DOES NOT WARRANT THAT THE FUNCTIONS CONTAINED IN THE SOFTWARE OR OTHER PRODUCTS WILL MEET YOUR REQUIREMENTS, OR THAT THE OPERATION OF THE SOFTWARE WILL BE UNINTERRUPTED OR ERROR-FREE, OR THAT DEFECTS IN THE SOFTWARE OR OTHER PRODUCTS WILL BE CORRECTED. NO ORAL OR WRITTEN INFORMATION, BENCHMARKS, BENCHMARK RESULTS, OR ADVICE GIVEN BY MLPERF (THE “MLPERF CONTENT”) SHALL CREATE ANY WARRANTY. 60 | 61 | MLPERF IS NOT RESPONSIBLE FOR ANY PRODUCTS YOU MAY CHOOSE TO PURCHASE OR CHOOSE NOT TO PURCHASE AS A RESULT OF THE MLPERF CONTENT. MLPERF DISCLAIMS ANY AND ALL LOSS OR LIABILITY RELATED TO YOUR USE OF THE WEBSITE, THE MLPERF CONTENT GIVEN TO YOU ON THE WEBSITE OR SOFTWARE FROM THE WEBSITE, OR THIS AGREEMENT. 62 | 63 | MLPerf is not liable for, among other things, any loss of data, hardware or software, or any liability resulting from: access delays; access interruptions; viruses; hackers; crackers; data non-delivery or mis-delivery; negligent acts, grossly negligent acts, or omissions by MLPerf; errors in any information, goods, or documents obtained due to the MLPerf Content; or force majeure, or any damages whatsoever resulting from loss of use, data or profits, whether in an action of contract, negligence or other action, arising out of or in connection with the use or performance of the Website, the MLPerf Content, the Service or the Software. 64 | 65 | MLPerf makes no representations whatsoever about any other website that you may access through the Website. MLPerf has no control over the content or claims of websites outside the MLPerf domain, and does not endorse or accept any responsibility for the content of such websites. 66 | 67 | MLPERF, ITS REPRESENTATIVES, LICENSORS, AND PARTNERS WILL NOT BE LIABLE FOR ANY DAMAGES (INCLUDING DIRECT, INDIRECT, INCIDENTAL, CONSEQUENTIAL, SPECIAL OR PUNITIVE). 
68 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 
179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright 2018 The MLPerf Authors 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /archive/inference_rules_pre_move: -------------------------------------------------------------------------------- 1 | :toc: 2 | :toclevels: 4 3 | 4 | :sectnums: 5 | 6 | = MLPerf Inference Rules 7 | 8 | Version 0.5 April 16th, 2019 9 | 10 | Points of contact: David Kanter (dkanter@gmail.com), Vijay Janapa Reddi 11 | (vjreddi@g.harvard.edu) 12 | 13 | == Overview 14 | 15 | This document describes how to implement one or more benchmarks in the MLPerf 16 | Inference Suite and how to use those implementations to measure the performance 17 | of an ML system performing inference. 18 | 19 | The MLPerf name and logo are trademarks. In order to refer to a result using the 20 | MLPerf name, the result must conform to the letter and spirit of the rules 21 | specified in this document. The MLPerf organization reserves the right to solely 22 | determine if a use of its name or logo is acceptable. 23 | 24 | === Definitions (read this section carefully) 25 | 26 | The following definitions are used throughout this document: 27 | 28 | A _sample_ is the unit on which inference is run. E.g., an image, or a sentence. 29 | 30 | A _query_ is a set of N samples that are issued to an inference system 31 | together. N is a positive integer. For example, a single query contains 8 32 | images. 33 | 34 | _Quality_ always refers to a model’s ability to produce “correct” outputs. 35 | 36 | A _system under test_ consists of a defined set of hardware and software 37 | resources that will be measured for performance. The hardware resources may 38 | include processors, accelerators, memories, disks, and interconnect. The 39 | software resources may include an operating system, compilers, libraries, and 40 | drivers that significantly influences the running time of a benchmark. 41 | 42 | A _reference implementation_ is a specific implementation of a benchmark 43 | provided by the MLPerf organization. The reference implementation is the 44 | canonical implementation of a benchmark. All valid submissions of a benchmark 45 | must be *equivalent* to the reference implementation. 
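To make the _sample_ and _query_ definitions above concrete, here is a minimal illustrative sketch (not part of the rules; the class and field names are hypothetical) of how a query bundles N samples that are issued to the inference system together:

[source,python]
----
from dataclasses import dataclass
from typing import List

@dataclass
class Sample:
    """The unit on which inference is run, e.g. one image or one sentence."""
    data: bytes

@dataclass
class Query:
    """A set of N samples issued to the inference system together."""
    samples: List[Sample]

# A single query containing 8 images, as in the example above.
query = Query(samples=[Sample(data=b"<image bytes>") for _ in range(8)])
assert len(query.samples) == 8
----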
46 | 47 | A _run_ is a complete execution of a benchmark implementation on a system under 48 | the control of the load generator that consists of completing a set of inference 49 | queries, including data pre- and post-processing, meeting a latency requirement 50 | and a quality requirement in accordance with a scenario. 51 | 52 | A _run result_ consists of the scenario-specific metric. 53 | 54 | == General rules 55 | 56 | The following rules apply to all benchmark implementations. 57 | 58 | === Strive to be fair 59 | 60 | Benchmarking should be conducted to measure the framework and system performance 61 | as fairly as possible. Ethics and reputation matter. 62 | 63 | === System and framework must be consistent 64 | 65 | The same system and framework must be used for a suite result or set of 66 | benchmark results reported in a single context. 67 | 68 | === System and framework must be available 69 | 70 | If you are measuring the performance of a publicly available and widely-used 71 | system or framework, you must use publicly available and widely-used versions of 72 | the system or framework. 73 | 74 | If you are measuring the performance of an experimental framework or system, you 75 | must make the system and framework you use available upon demand for 76 | replication. 77 | 78 | === Benchmark implementations must be shared 79 | 80 | Source code used for the benchmark implementations must be open-sourced under a 81 | license that permits a commercial entity to freely use the implementation for 82 | benchmarking. The code must be available as long as the results are actively 83 | used. 84 | 85 | === Non-determinism is restricted 86 | 87 | The only forms of acceptable non-determinism are: 88 | 89 | * Floating point operation order 90 | 91 | * Random traversal of the inputs 92 | 93 | * Rounding 94 | 95 | All random numbers must be based on a fixed random seed and deterministic random 96 | number generator. The fixed random number generator is the Mersenne Twister 97 | 19937 generator (std::mt19937). The fixed seed will be announced a week before 98 | the benchmark submission deadline. 99 | 100 | === Benchmark detection is not allowed 101 | 102 | The framework and system should not detect and behave differently for 103 | benchmarks. 104 | 105 | === Input-based optimization is not allowed 106 | 107 | The implementation should not encode any information about the content of the 108 | input dataset in any form. 109 | 110 | === Replicability is mandatory 111 | 112 | Results that cannot be replicated are not valid results. 113 | 114 | == Scenarios 115 | 116 | In order to enable representative testing of a wide variety of inference 117 | platforms and use cases, MLPerf has defined four different scenarios as 118 | described in the table below. 
119 | 120 | |=== 121 | |Scenario |Query Generation |Duration |Samples/query |Latency Constraint |Tail Latency | Performance Metric 122 | |Single stream |LoadGen sends next query as soon as SUT completes the previous query | 1024 queries and 60 seconds |1 |None |90% | 90%-ile measured latency 123 | |Multiple stream |LoadGen sends a new query every _latency constraint_ if the SUT has completed the prior query, otherwise the new query is dropped and is counted as one overtime query | 24,576 queries and 60 seconds |Variable, see metric |Benchmark specific |90% | Maximum number of inferences per query supported 124 | |Server |LoadGen sends new queries to the SUT according to a Poisson distribution, overtime queries must not exceed 2X the latency bound |24,576 queries and 60 seconds |1 |Benchmark specific |90% | Maximum Poisson throughput parameter supported 125 | |Offline |LoadGen sends all queries to the SUT at start | 24,576 queries and 60 seconds |All |None |N/A | Measured throughput 126 | |=== 127 | 128 | The number of queries is selected to ensure sufficient statistical confidence in 129 | the reported metric. Specifically, the top line in the following table. Lower 130 | lines are being evaluated for future versions of MLPerf Inference (e.g., 95% 131 | tail latency for v0.6 and 99% tail latency for v0.7). 132 | 133 | |=== 134 | |Tail Latency Percentile |Confidence Interval |Margin-of-Error |Inferences |Rounded Inferences 135 | |90%|99%|0.50%|23,886|3*2^13 = 24,576 136 | |95%|99%|0.25%|50,425|7*2^13 = 57,344 137 | |99%|99%|0.05%|262,742|33*2^13 = 270,336 138 | |=== 139 | 140 | A submission may comprise any combination of benchmark and scenario results. 141 | 142 | == Benchmarks 143 | 144 | The MLPerf organization provides a reference implementation of each benchmark, 145 | which includes the following elements: Code that implements the model in a 146 | framework. A plain text “README.md” file that describes: 147 | 148 | * Problem 149 | 150 | ** Dataset/Environment 151 | 152 | ** Publication/Attribution 153 | 154 | ** Data pre- and post-processing 155 | 156 | ** Performance, accuracy, and calibration data sets 157 | 158 | ** Test data traversal order (CHECK) 159 | 160 | * Model 161 | 162 | ** Publication/Attribution 163 | 164 | ** List of layers 165 | 166 | ** Weights and biases 167 | 168 | * Quality and latency 169 | 170 | ** Quality target 171 | 172 | ** Latency target(s) 173 | 174 | * Directions 175 | 176 | ** Steps to configure machine 177 | 178 | ** Steps to download and verify data 179 | 180 | ** Steps to run and time 181 | 182 | A “download_dataset” script that downloads the accuracy, speed, and calibration 183 | datasets. 184 | 185 | A “verify_dataset” script that verifies the dataset against the checksum. 186 | 187 | A “run_and_time” script that executes the benchmark and reports the wall-clock 188 | time. 189 | 190 | === Benchmarks 191 | 192 | The benchmark suite consists of the benchmarks shown in the following 193 | table. Quality and latency targets are still being finalized. 194 | 195 | |=== 196 | |Area |Task |Model |Dataset |Quality |Latency constraint 197 | |Vision |Image classification |Resnet50-v1.5 |ImageNet (224x224) | ?? | ?? 198 | |Vision |Image classification |MobileNets-v1 224 |ImageNet (224x224) | ?? | ?? 199 | |Vision |Object detection |SSD-ResNet34 |COCO (1200x1200) | ?? | ?? 200 | |Vision |Object detection |SSD-MobileNets-v1 |COCO (300x300) | ?? | ?? 201 | |Language |Machine translation |GMNT |WMT16 | ?? | ?? 
202 | |=== 203 | 204 | == Load Generator 205 | 206 | === LoadGen Operation 207 | 208 | The LoadGen is provided in C++ with Python bindings and must be used by all 209 | submissions. The LoadGen is responsible for: 210 | 211 | * Generating the queries according to one of the scenarios. 212 | 213 | * Tracking the latency of queries. 214 | 215 | * Validating the accuracy of the results. 216 | 217 | * Computing final metrics. 218 | 219 | Latency is defined as the time from LoadGen passing a query to the SUT, to the 220 | time it receives a reply. 221 | 222 | * Single-stream: LoadGen measures average latency using a single test run. For 223 | the test run, LoadGen sends an initial query then continually sends the next 224 | query as soon as the previous query is processed. 225 | 226 | * Multi-stream: LoadGen determines the maximum supported number of streams using 227 | multiple test runs. Each test run evaluates a specific integer number of 228 | streams. For a specific number of streams, queries are generated with a number 229 | of samples per query equal to the number of streams tested. All samples in a 230 | query will be allocated contiguously in memory. LoadGen will use a binary search 231 | to find a candidate value. It will then verify stability by testing the value 5 232 | times. If one run fails, it will reduce the number of streams by one and then 233 | try again. 234 | 235 | * Server: LoadGen determines the system throughput using multiple test 236 | runs. Each test run evaluates a specific throughput value in queries-per-second 237 | (QPS). For a specific throughput value, queries are generated at that QPS using 238 | a Poisson distribution. LoadGen will use a binary search to find a candidate 239 | value. It will then verify stability by testing the value 5 times. If one run 240 | fails, it will reduce the value by a small delta then try again. 241 | 242 | * Offline: LoadGen measures throughput using a single test run. For the test 243 | run, LoadGen sends all queries at once. 244 | 245 | The run procedure is as follows: 246 | 247 | 1. LoadGen signals system under test (SUT). 248 | 249 | 2. SUT starts up and signals readiness. 250 | 251 | 3. LoadGen starts clock and begins generating queries. 252 | 253 | 4. LoadGen stops generating queries as soon as the benchmark-specific minimum 254 | number of queries have been generated and the benchmark specific minimum time 255 | has elapsed. 256 | 257 | 5. LoadGen waits for all queries to complete, and errors if all queries fail to 258 | complete. 259 | 260 | 6. LoadGen computes metrics for the run. 261 | 262 | The execution of LoadGen is restricted as follows: 263 | 264 | * LoadGen must run on the processor that most faithfully simulates queries 265 | arriving from the most logical source, which is usually the network or an I/O 266 | device such as a camera. For example, if the most logical source is the 267 | network and the system is characterized as host - accelerator, then LoadGen 268 | should run on the host unless the accelerator incorporates a NIC. 269 | 270 | * The trace generated by LoadGen must be stored in the non-HBM DRAM that most 271 | faithfully simulates queries arriving from the most logical source, which is 272 | usually the network or an I/O device such as a camera. It may be 273 | pinned. Submitters need prior approval for anything that is not DRAM. 274 | 275 | * Caching of any queries, any query parameters, or any intermediate results is 276 | prohibited. 277 | 278 | LoadGen generates queries based on trace. 
The trace is constructed by uniformly 279 | sampling (with replacement) from a library based on a fixed random seed and 280 | deterministic generator. The trace is usually pre-generated, but may optionally 281 | be incrementally generated if it does not fit in memory. LoadGen validates 282 | accuracy via a separate test run that use each sample in the test library 283 | exactly once but is otherwise identical to the above normal metric run. 284 | 285 | == Divisions 286 | 287 | There are two divisions of the benchmark suite, the Closed division and the Open 288 | division. 289 | 290 | === Closed Division 291 | 292 | The Closed division requires using pre-processing, post-processing, and model 293 | that is equivalent to the reference or alternative implementation. The closed 294 | division allows calibration for quantization and does not allow any retraining. 295 | 296 | The unqualified name “MLPerf” must be used when referring to a Closed Division 297 | suite result, e.g. “a MLPerf result of 4.5.” 298 | 299 | === Open Division 300 | 301 | The Open division allows using arbitrary pre- or post-processing and model, 302 | including retraining. The qualified name “MLPerf Open” must be used when 303 | referring to an Open Division suite result, e.g. “a MLPerf Open result of 7.2.” 304 | 305 | == Data Sets 306 | 307 | For each benchmark, MLPerf will provide pointers to: 308 | 309 | * An accuracy data set, to be used to determine whether a submission meets the 310 | quality target, and used as a validation set 311 | 312 | * A speed/performance data set that is a subset of the accuracy data set to be 313 | used to measure performance 314 | 315 | For each benchmark except GNMT, MLPerf will provide pointers to: 316 | 317 | * A calibration data set, to be used for quantization (see quantization 318 | section), that is a small subset of the training data set used to generate the 319 | weights 320 | 321 | Each reference implementation shall include a script to verify the datasets 322 | using a checksum. The dataset must be unchanged at the start of each run. 323 | 324 | === Pre- and post-processing 325 | 326 | All imaging benchmarks take uncropped uncompressed bitmap as inputs, NMT takes 327 | text. 328 | 329 | Sample-independent pre-processing that matches the reference model is 330 | untimed. However, it must be pre-approved and added to the following list: 331 | 332 | * May resize to processed size (e.g. SSD-large) 333 | 334 | * May reorder channels / do arbitrary transpositions 335 | 336 | * May pad to arbitrary size (don’t be creative) 337 | 338 | * May do a single, consistent crop 339 | 340 | * Mean subtraction and normalization provided reference model expect those to be 341 | done 342 | 343 | * May quantize image data from fp32 to int8 and between signed and unsigned 344 | 345 | Any other pre- and post-processing time is included in the wall-clock time for a 346 | run result. 347 | 348 | === Test Data Traversal Order 349 | 350 | Test data is determined by the LoadGen. For scenarios where processing multiple 351 | samples can occur (i.e., server, multi-stream, and offline), any ordering is 352 | allowed subject to latency requirements. 353 | 354 | == Model 355 | 356 | CLOSED: For v0.5, MLPerf provides a reference implementation in a first 357 | framework and an alternative implementation in a second framework in accordance 358 | with the table below. 
The benchmark implementation must use a model that is 359 | equivalent to the reference implementation or the alternative implementation, as 360 | defined by the remainder of this section. 361 | 362 | |=== 363 | |Area |Task |Model |Reference implementation |Alternative implementation 364 | |Vision |Image classification |Resnet50-v1.5 |TensorFlow |PyTorch/ONNX 365 | |Vision |Image classification |MobileNets-v1 224 |TensorFlow/TensorFlow Lite |PyTorch/ONNX 366 | |Vision |Object detection |SSD-ResNet34 |PyTorch/ONNX |TensorFlow/TensorFlow Lite 367 | |Vision |Object detection |SSD-MobileNets-v1 |TensorFlow |PyTorch/ONNX 368 | |Language |Machine translation |GMNT |TensorFlow |PyTorch/ONNX 369 | |=== 370 | 371 | OPEN: The benchmark implementation may use a different model to perform the same 372 | task. Retraining is allowed. 373 | 374 | === Weight Definition and Quantization 375 | 376 | CLOSED: MLPerf will provide trained weights and biases in fp32 format for both 377 | the reference and alternative implementations. 378 | 379 | MLPerf will provide a calibration data set for all models except 380 | GNMT. Submitters may do arbitrary purely mathematical, reproducible quantization 381 | using only the calibration data and weight and bias tensors from the benchmark 382 | owner provided model to any combination of permissive whitelist numerical format 383 | that achieves the desired quality. The quantization method must be publicly 384 | described at a level where it could be reproduced. The whitelist currently 385 | includes: 386 | 387 | * INT8 388 | * INT16 389 | * UINT8 390 | * UINT16 391 | * FP11 (1-bit sign, 5-bit exponent, 5-bit mantissa) 392 | * FP16 393 | * bfloat16 394 | * FP32 395 | 396 | To be considered principled, the description of the quantization method must be 397 | much much smaller than the non-zero weights it produces. 398 | 399 | Calibration is allowed and must only use the calibration data set provided by 400 | the benchmark owner. 401 | 402 | Additionally, for image classification using MobileNets-v1 224 and object 403 | detection using SSD-MobileNets-v1, MLPerf will provide a retrained INT8 404 | (asymmetric for TFLite and symmetric for pyTorch/ONNX) model. Model weights and 405 | input activations are scaled per tensor, and must preserve the same shape modulo 406 | padding. Convolution layers are allowed to be in either NCHW or NHWC format. No 407 | other retraining is allowed. 408 | 409 | OPEN: Weights and biases must be initialized to the same values for each run, 410 | any quantization scheme is allowed that achieves the desired quality. 411 | 412 | === Model Equivalence 413 | 414 | All implementations are allowed as long as the latency and accuracy bounds are 415 | met and the reference weights are used. Reference weights may be modified 416 | according to the quantization rules. 417 | 418 | Examples of legal variance in implementations include, but are not limited to: 419 | 420 | * Arbitrary frameworks and runtimes: TensorFlow, TensorFlow-lite, ONNX, PyTorch, 421 | etc, provided they conform to the rest of the rules 422 | 423 | * Running any given control flow or operations on or off an accelerator 424 | 425 | * Arbitrary data arrangement 426 | 427 | * Different input and in-memory representations of weights 428 | 429 | * Variation in matrix-multiplication or convolution algorithm provided the 430 | algorithm produces asymptotically accurate results when evaluated with 431 | asymptotic precision 432 | 433 | * Mathematically equivalent transformations (e.g. 
Tanh versus Logistic, Relu6 434 | versus Relu8) or approximations and including but not limited to 435 | transcendental functions (or equivalent transformations) 436 | 437 | * Processing queries out-of-order within discretion provided by scenario 438 | 439 | * Replacing dense operations with mathematically equivalent sparse operations 440 | 441 | * Hand picking different numerical precisions for different operations 442 | 443 | * Fusing or unfusing operations 444 | 445 | * Dynamically switching between one or more batch sizes 446 | 447 | * Different implementations based on dynamically determined batch size 448 | 449 | * Mixture of experts combining differently quantized weights 450 | 451 | * Stochastic quantization algorithms with seeds for reproducibility. 452 | 453 | * Reducing ImageNet classifiers with 1001 classes to 1000 classes 454 | 455 | For anything else you want on this list contact submitters five weeks prior to 456 | the submission deadline. 457 | 458 | Examples of illegal variance in implementations include, but are not limited to: 459 | 460 | * Wholesale weight replacement or supplements 461 | 462 | * Discarding non-zero weight elements 463 | 464 | * Caching queries or responses 465 | 466 | * Coalescing identical queries 467 | 468 | * Modifying weights during the timed portion of an inference run (no online 469 | learning or related techniques) 470 | 471 | * “Soft dropping” queries by scheduling them for execution in the indefinite 472 | future. The latency bound enforces worst-case behavior, it is not a backdoor 473 | for dropping 10% of queries. 474 | 475 | * Weight quantization algorithms that are similar in size to the non-zero 476 | weights they produce 477 | 478 | * Hard coding the total number of queries 479 | 480 | * Techniques that boost performance for fixed length experiments but are 481 | inapplicable to long-running services except in the offline scenario 482 | 483 | * Using knowledge of the LoadGen implementation to predict upcoming lulls or 484 | spikes in the server scenario 485 | 486 | == FAQ 487 | 488 | Q: Do I have to use the reference implementation framework? 489 | 490 | A: No, you can use another framework provided that it matches the reference in 491 | the required areas. 492 | 493 | Q: Do I have to use the reference implementation scripts? 494 | 495 | A: No, you don’t have to use the reference scripts. The reference is there to 496 | settle conformance questions - with a few exceptions, a submission to the closed 497 | division must match what the reference is doing. 498 | 499 | Q: Why does a run require so many individual inference queries? 500 | 501 | A: The numbers were selected to be sufficiently large to statistically verify 502 | that the system meets the latency requirements. 503 | 504 | Q: For my submission, I am going to use a different model format (e.g., ONNX vs 505 | TensorFlow Lite). Should the conversion routine/script be included in the 506 | submission? Or is it sufficient to submit the converted model? 507 | 508 | A: The goal is reproducibility, so you should include the conversion 509 | routine/scripts. 510 | 511 | ===SSD 512 | 513 | Q: Is non-maximal suppression (NMS) timed? 514 | 515 | A: Yes. NMS is a per image operation. NMS is used to make sure that in object 516 | detection, a particular object is identified only once. Production systems need 517 | NMS to ensure high-quality inference. 518 | 519 | Q: Is COCO eval timed? 520 | 521 | A: No. 
COCO eval compares the proposed boxes and classes in all the images 522 | against the ground truth in the COCO dataset. COCO eval is not possible in production. 523 | 524 | == Appendix Number of Queries 525 | 526 | In order to be statistically valid, a certain number of queries are necessary to 527 | verify a given latency-bound performance result. How many queries are necessary? 528 | Every query either meets the latency bound or exceeds the latency bound. The 529 | math for determining the appropriate sample size for a latency-bound throughput 530 | experiment is exactly the same as determining the appropriate sample size for an 531 | electoral poll given an infinite electorate. Three variables determine the 532 | sample size: the tail latency percentage, confidence, and margin. Confidence is 533 | the probability that a latency bound is within a particular margin of the 534 | reported result. 535 | 536 | A 99% confidence bound was somewhat arbitrarily selected. For systems with noisy 537 | latencies, it is possible to obtain better MLPerf results by cherry picking the 538 | best runs. Approximately 1 in 100 runs will be marginally better. Please don’t 539 | do this. It is very naughty and will make the MLPerf community feel sad. 540 | 541 | The margin should be set to a value much less than the difference between the 542 | tail latency percentage and one. Conceptually, the margin ought to be small 543 | compared to the distance between the tail latency percentile and 100%. A margin 544 | of 0.5% was selected. This margin is one twentieth of the difference between the 545 | tail latency percentage and one. In the future, when the tail latency percentage 546 | rises, the margin should fall by a proportional amount. The full equations are: 547 | 548 | Margin = (1 - TailLatency) / 20 549 | 550 | NumQueries = NormsInv((1 - Confidence) / 2)^2 * (TailLatency)(1 - TailLatency) / 551 | Margin^2 552 | 553 | Concretely: 554 | 555 | NumQueries = NormsInv((1 - 0.99) / 2)^2 * (0.9)(1 - 0.9) / 0.005^2 = 556 | NormsInv(0.005)^2 * 3600 = (-2.576)^2 * 3,600 ≈ 23,886 557 | 558 | To keep the numbers nice, the sample sizes are rounded up. Here is a table 559 | showing proposed sample sizes for subsequent rounds of MLPerf: 560 | 561 | |=== 562 | |Tail Latency Percentile |Confidence Interval |Margin-of-Error |Inferences |Rounded Inferences 563 | |90%|99%|0.50%|23,886|3*2^13 = 24,576 564 | |95%|99%|0.25%|50,425|7*2^13 = 57,344 565 | |99%|99%|0.05%|262,742|33*2^13 = 270,336 566 | |=== 567 | 568 | These are mostly for the Server scenario, which has tight bounds for tail 569 | latency. The other scenarios may continue to use smaller sample sizes. 570 | -------------------------------------------------------------------------------- /submission_rules.adoc: -------------------------------------------------------------------------------- 1 | :toc: 2 | :toclevels: 4 3 | 4 | :sectnums: 5 | 6 | # General MLPerf Submission Rules v0.2 7 | 8 | :TOC: 9 | 10 | 11 | 12 | ## Basics 13 | 14 | These rules describe the submission, review, and publication process for all MLPerf benchmark suites. There are separate rules that govern what to submit for each MLPerf benchmark suite, including: 15 | 16 | * https://github.com/mlperf/training_policies/blob/master/training_rules.adoc[Training Rules] 17 | 18 | * https://github.com/mlperf/inference_policies/blob/master/inference_rules.adoc[Inference Rules] 19 | 20 | Unless otherwise stated, all rules apply to both the Closed and Open divisions.
21 | 22 | ## Review committee 23 | 24 | The MLPerf submission, review, and publication process is overseen by a review committee. 25 | 26 | 27 | ### Structure 28 | 29 | The review committee consists of the following people: 30 | 31 | 32 | 33 | * The chair or chairs of the following working groups: 34 | ** [ Inference or training, whichever is being submitted ] results 35 | ** [ Inference or training, whichever is being submitted ] submitters 36 | ** [ Inference or training, whichever is being submitted ] special topics 37 | ** Power 38 | * Until such time as a permanent host organization exists: 39 | ** The general chair 40 | * When a permanent host organization exists: 41 | ** The executive director 42 | ** The chairperson of the board 43 | * For v0.5 only, and only for the review process: One representative of each accelerator vendor (as opposed to an OEM or system vendor) who submitted and is not otherwise represented above, provided that, in the opinion of the majority of the above, said representative is a significant contributor to MLPerf through active participation in one or more working groups. 44 | 45 | If representatives of a single organization, its parents, subsidiaries, affiliates, or contractors hold multiple positions on this list, only one such representative is eligible to serve on the review committee. 46 | 47 | The review committee is chaired by the relevant results working group chair, e.g. the training results working group chair during a training submission review. For this reason, the results working groups may never have multiple chairs. 48 | 49 | 50 | ### Meetings 51 | 52 | The review committee makes decisions during properly scheduled meetings. The base meeting schedule is dictated by the review process. The review committee chair may schedule additional meetings during the review process in the course of any meeting or by emailing all committee members (e.g. the mlperf-chairs mailing list) at least 24 hours in advance. 53 | 54 | The review committee has a quorum if it includes an eligible chair and at least four other eligible members. 55 | 56 | Review committee meetings are typically held at a specific location determined by the review committee chair, and in-person attendance is encouraged, but virtual attendance is to be enabled and allowed. 57 | 58 | Review committee meetings are open only to the review committee and all submitters in the current submission round. 59 | 60 | 61 | ### Agenda and decisions 62 | 63 | The review committee agenda is set, and decisions are tracked, using issues filed against the Github submission repo. In general, issues must be filed as dictated by the review process schedule. Exceptions are discouraged but may be allowed during a meeting by a vote of the review committee. In general, issues are decided in the order filed, but the chair may choose a different order if circumstances warrant. 64 | 65 | The review committee should attempt to decide issues through discussion and grudging consensus whenever possible. However, if the review committee and all submitters are unable to reach a grudging consensus, the review committee will vote to decide the issue. 66 | 67 | Review committee votes are determined by a majority of eligible review committee members attending a meeting. In the event of a tie, the committee chair has the casting vote. Votes are initiated by the chair, and are cast openly and may be cast verbally or using a shared spreadsheet or other voting software.
68 | 69 | The review committee operates on a balance of interests rather than by avoiding conflicts of interest. Members may cast votes on all matters, including those directly affecting benchmark submissions made by their organization, as a practical response to the fact that competitors are also on the review committee. 70 | 71 | 72 | ### Confidential and not precedent setting 73 | 74 | Because the submission round is confidential to the submitters, the review committee agenda, deliberations, and specific decisions are confidential and shared only with committee members and submitters for that round. The general nature of decisions may be shared outside the review process because such decisions may expose the need for rules changes. 75 | 76 | The private submission repo will be deleted when the next MLPerf submission repo is created, or after 90 days. 77 | 78 | Review committee decisions do not create precedents. Instead, the decisions should be explicitly incorporated into the rules through the normal process. 79 | 80 | 81 | ## Operating principles 82 | 83 | MLPerf’s purpose is to produce fair and useful benchmark results. 84 | 85 | The MLPerf review committee reserves the right to amend these rules and/or exclude submissions that conflict with this purpose with a two-thirds (rounded up) vote. For instance, if the schedule is discovered to be untenable in practice, it may be amended. If a submission is judged to be deceptive or not of interest to the community, it may be excluded. 86 | 87 | The role of the review process is to ensure fairness of submissions, not to litigate details in an effort to disqualify competitors. For example: 88 | 89 | 90 | 91 | * Reviewing submitters should discuss issues with owning submitters after filing objections, and attempt to resolve the issue if possible. 92 | * If an objection is supported by the review committee, the objecting submitter should communicate with the owning submitter to ensure a satisfactory fix. 93 | * Issues in a submission that are agreed to require correction, but that do not meaningfully impact performance (less than 2% cumulative performance difference) or competitive ordering, may be waived by the review committee, subject to its discretion, and with the understanding that the submitter will correct the issue in future submissions. 94 | 95 | 96 | ## Schedule 97 | 98 | MLPerf has several submission rounds each year. Each submission round follows a detailed schedule. 99 | 100 | 101 | ### Schedule of submission rounds 102 | 103 | The submission schedule is to be set yearly, and must be approved by both the inference and training submitters meetings. The following is the remaining 2019 submission schedule. 104 | 105 | 106 | |=== 107 | | Submission round | Submission date 108 | | Inference v0.5 | October 11th 109 | |=== 110 | 111 | 112 | The following is the draft 2020 submission schedule: 113 | 114 | 115 | |=== 116 | | Submission round | Submission date 117 | | Training v0.7 | June 26 118 | | Inference v0.7 | September 4 119 | | Training v0.8 | November 13 [tentative] 120 | |=== 121 | 122 | 123 | ### Single submission round schedule 124 | 125 | Each submission round has the following detailed schedule, which has three major phases: 126 | 127 | 128 | 129 | . Submission 130 | . Review 131 | .. Objection filing 132 | .. Objection review 133 | .. Objection revision 134 | . Publication 135 | 136 | Each of these phases is described in more detail later in this document.
|===
| Day | Meeting or deadline (all deadlines are 11:59pm San Jose unless otherwise specified)
| *Week -2* | *Presubmission*
| Monday |
| Tuesday |
| Wednesday | Submitters must sign the CLA and provide primary and secondary POCs with GitHub handles and email addresses
| Thursday |
| Friday | Submitters WG chair creates the submission repo, gives all submitters access, and sends submitter POCs a test email requesting they make a test submission to confirm access
| *Week -1* | *Presubmission*
| Monday |
| Tuesday |
| Wednesday |
| Thursday |
| Friday | All “due in advance” writeups due (e.g. for inference calibration / weight transformation)
| | Submitters WG chair distributes random seed(s) for load generation (inference only)
| *Week 0* | *Submission*
| Monday |
| Tuesday |
| Wednesday |
| Thursday | Last opportunity to notify the chair that you will not submit
| Friday | 1:00pm San Jose: Submit all required artifacts to the GitHub repo
| | 1:30pm San Jose: Results summary distributed by the Submitters working group chair
| *Week 1* | *Review: objection filing*
| Monday | Begin drafting neutral press release [general chair until a host org exists, then executive director]
| Tuesday | Review committee meeting, discuss objections
| Wednesday |
| Thursday | Review committee meeting, discuss objections
| Friday | Objections due in GitHub; audit results due in GitHub for open and closed
| *Week 2* | *Review: objection review*
| Monday | Submitter responses to objections due
| Tuesday | Review committee meeting, makes easy decisions and requests information about difficult ones
| Wednesday | Requested information due
| | Distribute neutral press release for comment [general chair until a host org exists, then executive director]
| Thursday | Review committee meeting, makes any remaining decisions
| Friday |
| *Week 3* | *Review: objection revision*
| Monday | Must declare all intended hyperparameter borrowing (training only)
| Tuesday | Review committee meeting, discusses any fixes and borrowing
| Wednesday | Final code due
| Thursday | Review committee meeting, decides to approve/reject fixes if required
| | Approve final draft of press release
| Friday | 1:00pm San Jose: Final results in human-readable form due
| | 1:00pm San Jose: Final opportunity to withdraw some or all results
| | 1:30pm San Jose: Results summary distributed by Submitters WG chair
| *Week 4* | *Publication*
| Monday | Press and analyst pre-briefings allowed under embargo; all briefings must include the neutral press release
| | 1:00pm San Jose: Draft of results page available for comment
| Tuesday | 1:00pm San Jose: Corrections to results page due
| | 5:00pm San Jose: Results page and press release live on staging site
| Wednesday | 10:00am San Jose: Results and press release public, press embargo ends
|===

## Submission

The submission process defines how to submit code and results for review and eventual publication.


### Registration

Submitters must register with the submitters working group and begin attending meetings at least **eight weeks** before the deadline.
In order to register, a submitter or their org must sign the relevant CLA and provide primary and secondary GitHub handles and primary and secondary POC email addresses.


### GitHub repo

MLPerf will provide a private GitHub repository for submissions. Each submitter will submit one or more pull requests containing their submission to the appropriate GitHub repo before the submission deadline. Pull requests may be amended up until the deadline.


### Licensing

All submissions of code must be made under the MLPerf CLA, which is temporarily the Google open source CLA. Per the CLA, all submissions of code will be Apache 2 compatible. Third party libraries need not be Apache 2 licensed.

### Verification

A submission must pass the package checker script, and the result summarizer script must be able to extract the correct results. Specifically, the following commands must not generate errors:

#### Training
----
python3 -m mlperf_logging.package_checker <submission_directory> training 0.7.0
python3 -m mlperf_logging.result_summarizer <submission_directory> training 0.7.0
----

#### Inference
----
# from the top of the mlperf inference repository
python3 tools/submission/submission-checker.py --input <submission_directory> --submitter <submitting_organization>
----

### Submission content

A submission must contain the following:

* Metadata for the systems under test
* Code that implements the benchmarks
* Metadata that describes each system-implementation combination tested
* Scripts that set up and execute each system-implementation tested
* Result logs for each system-implementation tested


### Directory structure

A submission is for one code base for the benchmarks submitted. An org may make multiple submissions. A submission should take the form of a directory with the following structure. The structure must be followed regardless of the actual location of the code, e.g. in the MLPerf repo or on an external code hosting site.


#### Training

* <submitting_organization>/
** systems/
*** <system_desc_id>.json
** benchmarks/
*** <benchmark_name>/ [TODO: rename the reference directories]
**** implementations/
***** <implementation_id>/
****** <arbitrary implementation code>
***** <system_desc_id>/
****** <system_desc_id>_<implementation_id>.json
****** README.md
****** setup.sh (one-time configuration script)
****** init_datasets.sh (one-time dataset init script)
****** run_and_time.sh (run the benchmark and produce a result)
** results/
*** <system_desc_id>/
**** <benchmark_name>/
***** result_<run_number>.txt # log file

System names and implementation names may be arbitrary.

Training benchmark directory names must be one of {**resnet, ssd, maskrcnn, transformer, gnmt, ncf, minigo**}.
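To make the layout above concrete, here is a small, unofficial sketch; the helper name `check_training_submission` and the assumption that each per-system directory is identified by its `.json` description are ours, not part of the rules. The authoritative checks remain the `mlperf_logging` commands shown in the Verification section.

----
import sys
from pathlib import Path

# Files required in each per-system directory under implementations/.
REQUIRED_FILES = {"README.md", "setup.sh", "init_datasets.sh", "run_and_time.sh"}


def check_training_submission(org_dir):
    """Return a list of problems found in a training submission tree (illustrative only)."""
    problems = []
    root = Path(org_dir)
    if not (root / "systems").is_dir():
        problems.append(f"{root}: missing systems/ directory")
    for d in root.glob("benchmarks/*/implementations/*"):
        if not d.is_dir():
            continue
        files = {p.name for p in d.iterdir() if p.is_file()}
        # Implementation code directories hold arbitrary files; only the
        # per-system directories (identified here by a .json description)
        # are checked for the required scripts.
        if not any(name.endswith(".json") for name in files):
            continue
        for missing in sorted(REQUIRED_FILES - files):
            problems.append(f"{d}: missing {missing}")
    return problems


if __name__ == "__main__":
    for problem in check_training_submission(sys.argv[1]):
        print(problem)
----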
#### Inference

* <submitting_organization>/
** systems/
*** <system_desc_id>.json # combines hardware and software stack information
** code/
*** <benchmark_name>/
**** <implementation_id>/
***** <arbitrary implementation code>
** measurements/
*** <system_desc_id>/
**** <benchmark_name>/
***** <scenario>/
****** <system_desc_id>_<implementation_id>_<scenario>.json
****** README.md
****** user.conf
****** mlperf.conf
****** calibration_process.adoc
** results/
*** <system_desc_id>/
**** <benchmark_name>/
***** <scenario>/
****** performance/
******* run_x/ # 5 runs for the Server scenario, 1 run for the other scenarios
******** mlperf_log_summary.txt
******** mlperf_log_detail.txt
****** accuracy/
******* mlperf_log_summary.txt
******* mlperf_log_detail.txt
******* mlperf_log_accuracy.json # truncated by truncate_accuracy_log.py if too large
******* accuracy.txt # stdout of reference accuracy scripts
*** compliance_checker_log.txt
** compliance/
*** <system_desc_id>/
**** <benchmark_name>/
***** <scenario>/
****** <test_id>/
******* performance/
******** run_1/ # 1 run for every scenario
********* mlperf_log_summary.txt
********* mlperf_log_detail.txt
******* accuracy/
******** accuracy.txt # for TEST01 only, generated from truncate_accuracy_log.py
******** mlperf_log_accuracy.json # only necessary for TEST01
******** baseline_accuracy.txt # only for TEST01 if the accuracy check fails
******** compliance_accuracy.txt # only for TEST01 if the accuracy check fails
******* verify_performance.txt
******* verify_accuracy.txt # for TEST01 only


System names and implementation names may be arbitrary.

<benchmark_name> must be one of {**resnet50, ssd-mobilenet, ssd-resnet34, rnnt, bert-99, bert-99.9, dlrm-99, dlrm-99.9, 3d-unet-99, 3d-unet-99.9**}. The postfixes '-99' and '-99.9' indicate that the accuracy must be >= 99% or >= 99.9% of the target accuracy, respectively.

<scenario> must be one of {**Offline, Server, SingleStream, MultiStream**}.

<test_id> must be one of {**TEST01, TEST04-A, TEST04-B, TEST05**}.

Here is the list of mandatory files for all submissions in any division/category. However, your submission should still include all software information and related information needed to replicate the results.

* mlperf_log_summary.txt
* mlperf_log_detail.txt
* mlperf_log_accuracy.json
* user.conf
* calibration or weight transformation related code if the original MLPerf models are not used
* actual models if the models are not deterministically generated
* READMEs to enable users to replicate performance results
* code which interfaces with the loadgen
* <system_desc_id>_<implementation_id>_<scenario>.json
* <system_desc_id>.json
* compliance_checker_log.txt

For some models, mlperf_log_accuracy.json can get very large. Because of this, mlperf_log_accuracy.json is truncated in submissions using a tool.
A submitter must run the tool before submitting to MLPerf and ***keep*** the original mlperf_log_accuracy.json files inside their organization.
The original files might be requested by MLPerf during submission review, so you need to store them.
Run the tool as follows, assuming <local_submission_tree> is your local submission tree and <github_submission_repo> is the location of the GitHub submission repo:

```
# from the top of the inference source tree
python3 tools/submission/truncate_accuracy_log.py --input <local_submission_tree> --output <github_submission_repo>
```

### <system_desc_id>.json metadata

The file <system_desc_id>.json should contain the following metadata describing the system:

|===
| Field | Meaningful response required | Cloud example | On-premise example 1 | On-premise example 2
| submitter | Yes | Google | David Kanter | David Kanter
| division | Yes | closed | Closed | Open
| system_type | Yes | datacenter | datacenter | edge
| status | Yes | available | available | available
| | | | |
| system_name | Yes | tpu-v3 | 8ball | 8ball
| number_of_nodes | Yes | 1 | 1 | 1
| host_processors_per_node | Yes | 1 | 2 | 2
| host_processor_model_name | Yes | Intel Skylake | Intel Xeon Platinum 8164 | Intel Xeon Platinum 8164
| host_processor_core_count | Yes, or vcpu | | 26 | 26
| host_processor_vcpu_count | Yes, or core | 96 | |
| host_processor_frequency | | | 2000MHz | 2000MHz
| host_processor_caches | | | L1: 32KB I + 32KB D per core, L2: 1MB I+D per core, L3: 37.75MB I+D per chip | L1: 32KB I + 32KB D per core, L2: 1MB I+D per core, L3: 37.75MB I+D per chip
| host_processor_interconnect | | | 3x 10.6GT/s UPI | 3x 10.6GT/s UPI
| host_memory_capacity | Yes | 128GB | 384GB | 384GB
| host_storage_type | Yes | SSD | SSD | SSD
| host_storage_capacity | Yes | 1x 200GB + 1x 50GB | 800GB | 800GB
| host_networking | | | N/A | N/A
| host_networking_topology | | | N/A | N/A
| host_memory_configuration | | | 12 x 32GB 2Rx4 PC4-2666V-R | 12 x 32GB 2Rx4 PC4-2666V-R
| accelerators_per_node | Yes | 16 | 4 | 4
| accelerator_model_name | Yes | tpu-v3 | Nvidia Tesla V100 | Nvidia Tesla V100
| accelerator_host_interconnect | | | PCIe 3.0 x16 | PCIe 3.0 x16
| accelerator_frequency | | | 1230MHz | 1230MHz
| accelerator_on-chip_memories | | | L1: 80x 128KB, L2: 6MB per chip | L1: 80x 128KB, L2: 6MB per chip
| accelerator_memory_configuration | Yes | HBM | HBM2 | HBM2
| accelerator_memory_capacity | Yes | 32 GB | 32GB | 32GB
| accelerator_interconnect | | | 6x 25GT/s NVLink | 6x 25GT/s NVLink
| accelerator_interconnect_topology | | | Direct | Mesh
| cooling | | | Liquid | Air-cooled
| hw_notes | | | I overclocked it! | Miscellaneous notes
| | | | |
| framework | Yes | TensorFlow 1.14 commit hash = faf9db515c4bf550daacc1c3a22fedf3ff5dde63 | PyTorch, NGC19.05 | PyTorch, NGC19.05
| other_software_stack | Yes | TPU stack 1.14.1.dev20190518, python 3.6, sacrebleu 1.2.11 | cuda 10.2.0.163, cudnn 7.6.0.64, cublas 10.2.0.163, gcc 5.4.0 | cuda 10.2.0.163, cudnn 7.6.0.64, cublas 10.2.0.163, gcc 5.4.0
| operating_system | Yes | Ubuntu 16.04 | Ubuntu 18.04.1 LTS | Ubuntu 18.04.1 LTS
| sw_notes | | | extra notes here | extra notes here
|===


### <system_desc_id>_<implementation_id>_<scenario>.json metadata

The file <system_desc_id>_<implementation_id>_<scenario>.json should contain metadata describing the use of the specified implementation on the specified system.

|===
| Field | Meaningful response required | DK_Example_1 | DK_Example_2
| Starting weights filename? | Yes | https://zenodo.org/record/2269307/files/mobilenet_v1_1.0_224.tgz | https://zenodo.org/record/2269307/files/mobilenet_v1_1.0_224.tgz
| Weight transformations? | Yes | No | Yes (URL_to_calibration_writeup)
| Weight data type(s) | Yes | fp32 | bf16
| Input data type(s) | Yes | fp32 | bf16
| Retraining | Yes | No | Yes (URL_to_writeup)
|===


### Logging requirements

For Training, the results logs must be verified and stamped by the training log verification script [TODO: link]. The easiest way to produce such a log is to use the MLPerf logging library (`mlperf_logging`).

For Inference, the results logs must have been produced by the [standard load generator](https://github.com/mlperf/inference/tree/master/loadgen). Power information may be appended using the standard power information appending script [TODO: link or remove].


### Source code requirements for replication

The following section applies to all submissions in all divisions.

The source code must be sufficient to reproduce the results of the submission, given all source components specified. Any software component that would be required to substantially reproduce the submission must be uniquely identified using one of the following methods:

|===
| Software component | Possible methods for replication | Considered “Available” for category purposes (see later section)
| Source code or binary included in the submission repo | --- | Yes
| Depends only on a public GitHub repo | Commit hash or tag | Yes
| Depends only on a public GitHub repo plus one or more PRs | Commit hash or tag, and PR number(s) | Yes
| Depends only on an available binary (could be free to download, or for purchase / customers only) | Name and version, or URL | Yes, if the binary is a Beta or Production release
| Depends on private source code from an internal source control system | Unique source identifier (e.g. GitLab hash, p4 CL) | No
| Private binary | Checksum | No
|===


### Source code requirements for inference inspection

The following section applies to all submissions in the Closed division. We encourage Open division submissions to be as transparent as possible. We will re-examine this in v0.6.

For inference, the source code, pseudo-code, or prose description must be sufficient to determine:

* The connection to the loadgen
* Preprocessing
* The architecture of the model, and the operations performed
* Weights (please notify the results chair if > 2 GB combined)
* Weight transformations
** If weight transformations are non-deterministic, then any randomness seeds used must be included in the submission (a short illustrative sketch follows the training inspection requirements below).

For the inference Server scenario, the source code, pseudo-code, or prose must be sufficient to determine:

* Online batching, meaning how the server batches queries for processing


### Source code requirements for training inspection

For training, the source code must be sufficient to verify all aspects of a Closed submission, including but not limited to:

* Data preprocessing
* Data traversal order
* Model
* Model initialization
* Optimizer used
* Hyperparameters used
* Evaluation frequency
* Evaluation method

This requirement applies even to Open submissions, though the aspects do not need to match the reference.
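As a purely illustrative sketch of the seed requirement above (the file name `calibration_seed.json` and the toy `calibrate_weights` function are assumptions of this example, not something the rules prescribe), a non-deterministic weight transformation might record its seed like this:

----
import json
import random

import numpy as np


def calibrate_weights(weights, calibration_samples, seed):
    """Toy stand-in for a non-deterministic weight transformation (e.g. sampled calibration)."""
    random.seed(seed)
    np.random.seed(seed)
    chosen = np.random.choice(len(calibration_samples),
                              size=min(8, len(calibration_samples)),
                              replace=False)
    # ... the real calibration / quantization would happen here ...
    return weights, chosen.tolist()


if __name__ == "__main__":
    SEED = 2020  # chosen once, then shipped with the submission
    _, used = calibrate_weights(np.zeros(10), list(range(100)), SEED)
    # Record the seed (and anything else needed to replicate the transformation),
    # e.g. alongside calibration_process.adoc in the measurements directory.
    with open("calibration_seed.json", "w") as f:
        json.dump({"seed": SEED, "calibration_sample_indices": used}, f)
----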
### Compliance Testing

Submitters must run the compliance tests to verify that their submission achieves a basic level of compliance with a subset of the MLPerf rules. If compliance testing identifies a potential issue with the submission, the onus is on the submitter to provide an adequate explanation to the working group.

#### Training

This section is in progress [TODO].

#### Inference

Refer to the documentation found under https://github.com/mlperf/inference/tree/master/compliance/nvidia


## Review


### Visibility of results and code during review

During the review process, only certain groups are allowed to inspect results and code.

|===
| Group | Can Inspect
| Review committee | All results, all code
| Submitters | All results, all code
| Public | No results, no code
|===

### Required reviews

Each submitter is required to review at least one other submission. Required reviews are assigned as follows (a short illustrative sketch appears after the “Fixing objections” subsection below):

. Stack rank the submissions by number of results.
. Assign reviewers in pairs, walking down the stack rank.
. If there is an odd number of reviewers, the bottom 3 in the stack rank review each other.

### Auditing

TBD


### Filing objections

Submitters must officially file objections to other submitters’ code by creating a GitHub issue prior to the “Filing objections” deadline that cites the offending lines, the rules section violated, and, if pertinent, the corresponding lines of the reference implementation that are not equivalent.

Each submitter must file objections with a “by <org>” tag and an “against <org>” tag. Multiple organizations may append their “by <org>” tag to an existing objection if desired. If an objector comes to believe the objection is in error, they may remove their “by <org>” tag. All objections with no “by <org>” tags at the end of the filing deadline will be closed.

Submitters should file an objection, then discuss with the owning submitter to verify whether the objection is correct. Following the filing of an issue but before resolution, both the objecting submitter and the owning submitter may add comments to help the review committee understand the problem.

If the owning submitter acknowledges the problem, they may append the “fix_required” tag and begin to fix the issue.


### Resolving objections

The review committee will review each objection and either establish consensus or vote. If the committee votes to support an objection, it will provide some basic guidance on an acceptable fix and append the “fix_required” tag. If the committee votes against an objection, it will close the issue.


### Fixing objections

Code should be updated via a pull request prior to the “fixing objections” deadline. Following submission of all fixes, the owning submitter should confirm with the objector(s) that the objection has been addressed and ask them to remove their “by <org>” tags.

If the objector is not satisfied by the fix, then the review committee will decide the issue at its final review meeting. The review committee may vote to accept a fix and close the issue, or reject a fix and request that the submission be moved to open or withdrawn.
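The reviewer pairing described under “Required reviews” above can be sketched as follows; this is unofficial, and the function name and the handling of ties in result counts are assumptions of this example:

----
def assign_required_reviews(result_counts):
    """Group submitters for required reviews (illustrative only).

    Submissions are stack-ranked by number of results, reviewers are assigned
    in pairs walking down the rank, and if the count is odd the bottom three
    review each other.
    """
    ranked = sorted(result_counts, key=result_counts.get, reverse=True)
    groups = []
    if len(ranked) % 2 == 1 and len(ranked) >= 3:
        groups.append(tuple(ranked[-3:]))  # bottom three review each other
        ranked = ranked[:-3]
    groups.extend((ranked[i], ranked[i + 1]) for i in range(0, len(ranked), 2))
    return groups


# Example: five submitters with differing result counts.
print(assign_required_reviews({"OrgA": 12, "OrgB": 9, "OrgC": 7, "OrgD": 3, "OrgE": 1}))
# -> [('OrgC', 'OrgD', 'OrgE'), ('OrgA', 'OrgB')]
----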
### Hyperparameter borrowing (training only)

Hyperparameters may be updated in accordance with the training rules prior to the final code due date.


### Withdrawing results or changing division

Any time up until the final human-readable deadline, an entry may be withdrawn by amending the pull request. Alternatively, an entry may be voluntarily moved from the closed division to the open division.


## Publication

MLPerf will publish all results simultaneously via an update to the results page. After publication, code and results are public and free for use under the MLPerf Terms of Use.


### Results tables

There will be two results tables published, one for Closed and one for Open.


### Results table content

Each results table will contain the following information:

|===
| Field | Description
| TBD | TBD
|===


### Results categories

Results will be divided into categories based on the availability of the hardware and software components.

|===
| Category | Hardware | Software
| Available in cloud | Available for rent in the cloud | Available
| Available on premise | Available for purchase | Available
| Preview | Must be available for rent or purchase in time for the next submission, or within 180 days, whichever is longer | Available, except for software required to support substantially new hardware
| Research, Development, or Internal | Does not meet the above requirements | Does not meet the above requirements
|===


#### Available Systems

_Available_ cloud systems must (1) have available pricing (either publicly advertised or available by request), (2) have been rented by at least one third party, (3) have public evidence of availability (web page saying the product is available, statement by the company, etc.), and (4) be “reasonably available” for rent by additional third parties by the submission date.

An on-premise system is _Available_ if all of its components that substantially determine ML performance are _Available_ either individually or in aggregate (development boards that meet the substantially-determine clause are allowed). An _Available_ component or system must (1) have available pricing (either publicly advertised or available by request), (2) have been shipped to at least one third party, (3) have public evidence of availability (web page saying the product is available, statement by the company, etc.), and (4) be “reasonably available” for purchase by additional third parties by the submission date. In addition, submissions for on-premise systems must describe the system and its components in sufficient detail to enable third parties to build a similar system.

In both cases, “reasonably available” means:

1. Supply and lead times are appropriate for the system scale, i.e. on-demand and in quantity for the smallest systems, and a few months and with limited supply for the largest systems.
2. Access to rent or purchase may be subject to conditions that are common to generally available products (such as financial qualifications, size of customer, support burden, export restrictions, etc.) but is not otherwise restricted (i.e. no “early access” approval requirements).
However, it is allowed for the qualifying pre-submission rentals/purchases to have been made with restrictions such as “early access” approval.

_Available_ systems must use an _Available_ software stack. A software stack consists of the set of software components that substantially determine ML performance but are not in the uploaded source code. For instance, for training this includes at a minimum any required ML framework (e.g. TensorFlow, PyTorch) and ML accelerator library (e.g. cuDNN, MKL). An _Available_ software stack consists of only _Available_ software components.

An _Available_ software component must be well supported for general use. For open source software, the software must be based on a commit in an “official” repo, plus a PR to support a particular architecture if needed. For binaries, the binary must be made available as a release, or as a “beta” release with the requirement that optimizations will be included in a future “official” release. The beta must be made available to customers as a clear part of the release sequence. The software must be available at the time of submission.


#### Preview Systems

A _Preview_ system is a system which will meet the requirements of an _Available_ system within 180 days of the submission date, or by the next MLPerf submission date, whichever is later, and which the submitter commits to submitting as an _Available_ system by that time (a small illustrative sketch of this deadline arithmetic appears at the end of the “After publication” section). If it is not submitted in that submission round with equal or better performance (allowing for noise), the _Preview_ submission will be marked as invalid. Systems are exempt from this requirement if the submitted benchmarks are retired or changed to such a degree as to no longer be reasonably runnable on that system.

If a _Preview_ system contains a newly developed hardware component (e.g. a new ML accelerator) that is a substantial contributor to the determination of ML performance, then for that submission only, the “Available software stack” requirement is waived for software that is necessary to support that component. Otherwise, _Preview_ systems must meet the same _Available_ software stack requirements as an _Available_ system. For example, the first shipping version of a new accelerator need not meet the _Available_ software stack requirements, but subsequent SKUs of that accelerator are not considered newly developed and must meet the _Available_ software stack requirements.


#### Research, Development, or Internal Systems

A research, development, or internal (RDI) component does not meet the requirements for an Available or Preview component. An RDI system is a system containing one or more RDI components. RDI components may not be submitted as _Available_ components until the submission cycle after next, or until 181 days have passed, whichever is longer.


## After publication


### Terms of use

Any use of published results in connection with the MLPerf trademark must follow the [terms of use](https://github.com/mlperf/policies/blob/master/TERMS%20OF%20USE.md).


### Issues discovered after publication

If a substantial issue (>5% cumulative change) with a closed division result is discovered after publication and confirmed by the review committee, the result may be fixed if possible in a two-week timeframe, otherwise moved to the open division if possible, or marked non-compliant if necessary.
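The deadline arithmetic for _Preview_ and RDI components described above can be illustrated with a short sketch; the function names, the use of the draft 2020 dates, and the pairing of rounds chosen for the example are assumptions, and the authoritative schedule is the one approved by the submitters meetings:

----
from datetime import date, timedelta


def preview_available_deadline(submission_date, next_submission_date):
    # A Preview system must be submitted as Available within 180 days of the
    # submission date or by the next MLPerf submission date, whichever is later.
    return max(submission_date + timedelta(days=180), next_submission_date)


def rdi_earliest_available(submission_date, submission_after_next):
    # An RDI component may not be submitted as Available until the submission
    # cycle after next, or until 181 days have passed, whichever is longer.
    return max(submission_date + timedelta(days=181), submission_after_next)


# Example using the draft 2020 dates listed earlier in this document.
training_v07 = date(2020, 6, 26)
inference_v07 = date(2020, 9, 4)
training_v08 = date(2020, 11, 13)
print(preview_available_deadline(training_v07, inference_v07))  # 2020-12-23 (180 days is later)
print(rdi_earliest_available(training_v07, training_v08))       # 2020-12-24 (181 days is later)
----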
## Appendices

The appendices contain additional information.


### Committee non-disclosure

This section is in progress [TODO].


### Submitter non-disclosure

This section is in progress [TODO].


### Submission checklist

This section is in progress [TODO].


### Power

This section is in progress [TODO].


### Review chair checklist

This section is in progress [TODO].

--------------------------------------------------------------------------------