├── README.md
├── analysis
│   └── analysis_goes_here.MD
└── efficiency_sota.csv

/README.md:
--------------------------------------------------------------------------------

# Algorithmic Efficiency SOTA Submissions

We found that in 2019 it took [44x less compute](https://openai.com/blog/ai-and-efficiency/) to train a neural net to AlexNet-level performance than it did in 2012.
(Moore's Law would only have yielded an 11x change in cost over this period.)

Going forward, we're going to use this git repository to help publicly track state-of-the-art (SOTA) algorithmic efficiency.
We're beginning by tracking training-efficiency SOTAs in image recognition and translation, each at two performance levels.

#### AlexNet-level performance
*79.1% top-5 accuracy on ImageNet*

| Publication | Compute (tfs-days) | Reduction Factor | Analysis | Date |
| ----------- | ------------------ | ---------------- | -------- | ---- |
| [AlexNet](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf) | 3.1 | 1 | [AI and Efficiency](https://openai.com/blog/ai-and-efficiency/) | 6/1/2012 |
| [GoogLeNet](https://arxiv.org/abs/1409.4842) | 0.71 | 4.3 | [AI and Efficiency](https://openai.com/blog/ai-and-efficiency/) | 9/17/2014 |
| [MobileNet](https://arxiv.org/abs/1704.04861) | 0.28 | 11 | [AI and Efficiency](https://openai.com/blog/ai-and-efficiency/) | 4/17/2017 |
| [ShuffleNet](https://arxiv.org/abs/1707.01083) | 0.15 | 21 | [AI and Efficiency](https://openai.com/blog/ai-and-efficiency/) | 7/3/2017 |
| [ShuffleNet v2](https://arxiv.org/abs/1807.11164) | 0.12 | 25 | [AI and Efficiency](https://openai.com/blog/ai-and-efficiency/) | 6/30/2018 |
| [EfficientNet](https://arxiv.org/abs/1905.11946) | 0.069 | 44 | [EfficientNet](https://arxiv.org/abs/1905.11946) | 5/28/2019 |

#### ResNet-50-level performance
*92.9% top-5 accuracy on ImageNet*

| Publication | Compute (tfs-days) | Reduction Factor | Analysis | Date |
| ----------- | ------------------ | ---------------- | -------- | ---- |
| [ResNet-50](https://arxiv.org/abs/1512.03385) | 17 | 1 | [AI and Efficiency](https://openai.com/blog/ai-and-efficiency/) | 1/10/2015 |
| [EfficientNet](https://arxiv.org/abs/1905.11946) | 0.75 | 10 | [EfficientNet](https://arxiv.org/abs/1905.11946) | 5/28/2019 |

#### Seq2Seq-level performance
*34.8 BLEU on WMT-14 EN-FR*

| Publication | Compute (tfs-days) | Reduction Factor | Analysis | Date |
| ----------- | ------------------ | ---------------- | -------- | ---- |
| [Seq2Seq (Ensemble)](https://arxiv.org/abs/1409.3215) | 465 | 1 | [AI and Compute](https://openai.com/blog/ai-and-compute/) | 1/10/2014 |
| [Transformer (Base)](https://arxiv.org/abs/1706.03762) | 8 | 61 | [Attention Is All You Need](https://arxiv.org/abs/1706.03762) | 1/12/2017 |

#### GNMT-level performance
*39.92 BLEU on WMT-14 EN-FR*

| Publication | Compute (tfs-days) | Reduction Factor | Analysis | Date |
| ----------- | ------------------ | ---------------- | -------- | ---- |
| [GNMT](https://arxiv.org/abs/1609.08144) | 1620 | 1 | [Attention Is All You Need](https://arxiv.org/abs/1706.03762) | 1/26/2016 |
| [Transformer (Big)](https://arxiv.org/abs/1706.03762) | 181 | 9 | [Attention Is All You Need](https://arxiv.org/abs/1706.03762) | 1/12/2017 |

## Making an entry

To make an entry, please submit a pull request in which you:

1. Make the appropriate update to `efficiency_sota.csv`.
2. Make the appropriate update to the tables in this file, `README.md`.
3. Add the relevant calculations/supporting information to the `analysis` folder. For example calculations, see [AI and Compute](https://openai.com/blog/ai-and-compute/#appendixmethods) and Appendices A and B of [Measuring the Algorithmic Efficiency of Neural Networks](https://arxiv.org/abs/2005.04305).

## FAQ

1. We're interested in tracking progress on additional benchmarks that have seen sustained interest over many years. Please send thoughts or analysis on such benchmarks to *danny@openai.com*.
2. ImageNet is the only training data source allowed for the vision benchmarks. No human captioning, additional images, or other outside data is allowed. Automated augmentation is ok.
3. We currently place no restrictions on training data used for translation, but may split results by appropriate categories in the future.
4. A teraflop/s-day (tfs-day) is the amount of compute performed by running at one teraflop per second for one full day.
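For intuition on the unit, one teraflop/s-day is 10^12 FLOP/s × 86,400 s ≈ 8.64×10^16 floating-point operations. The short Python sketch below is an illustration only (not part of any submission tooling); it uses the rounded compute figures from the AlexNet-level table to convert teraflop/s-days into raw operation counts and to recompute the corresponding reduction factor.

```python
# Illustration only: convert the rounded compute figures from the AlexNet-level
# table into raw FLOP counts and recompute the reduction factor.

TFS_DAY_IN_FLOPS = 1e12 * 86_400  # one teraflop/s sustained for a day ~= 8.64e16 FLOPs

alexnet_tfs_days = 3.1         # AlexNet (2012), compute to reach 79.1% top-5
efficientnet_tfs_days = 0.069  # EfficientNet-b0 (2019), same target accuracy

print(f"AlexNet:         {alexnet_tfs_days * TFS_DAY_IN_FLOPS:.2e} FLOPs")
print(f"EfficientNet-b0: {efficientnet_tfs_days * TFS_DAY_IN_FLOPS:.2e} FLOPs")

# Reduction factor = baseline compute / new compute
# (tabulated as 44; the rounded inputs used here give ~45).
print(f"Reduction factor: {alexnet_tfs_days / efficientnet_tfs_days:.1f}x")
```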
To cite this work, please use the following BibTeX entry:

```
@misc{hernandez2020efficiency,
    title={Measuring the Algorithmic Efficiency of Neural Networks},
    author={Danny Hernandez and Tom B. Brown},
    year={2020},
    eprint={2005.04305},
    archivePrefix={arXiv},
    primaryClass={cs.LG},
}
```

--------------------------------------------------------------------------------
/analysis/analysis_goes_here.MD:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/openai/ai-and-efficiency/1f280d11be9d3ef2b21f4f2317786233bda288ab/analysis/analysis_goes_here.MD
--------------------------------------------------------------------------------
/efficiency_sota.csv:
--------------------------------------------------------------------------------
Metric,Publication Date,Publication,Publication link,Compute (teraflop/s-days),Reduction Factor,Analysis,Analysis link
AlexNet,6/1/2012,AlexNet,https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf,3.1,1,AI and Efficiency,
AlexNet,9/17/2014,GoogLeNet,https://arxiv.org/abs/1409.4842,0.71,4.3,AI and Efficiency,
AlexNet,4/17/2017,MobileNet,https://arxiv.org/abs/1704.04861,0.28,11,AI and Efficiency,
AlexNet,7/3/2017,ShuffleNet (1x),https://arxiv.org/abs/1707.01083,0.15,21,AI and Efficiency,
AlexNet,6/30/2018,ShuffleNet v2 (1x),https://arxiv.org/abs/1807.11164,0.12,25,AI and Efficiency,
AlexNet,5/28/2019,EfficientNet (b0),https://arxiv.org/abs/1905.11946,0.069,44,EfficientNet,https://arxiv.org/abs/1905.11946
ResNet-50,1/10/2015,ResNet-50,https://arxiv.org/abs/1512.03385,17,1,AI and Efficiency,
ResNet-50,5/28/2019,EfficientNet (b1),https://arxiv.org/abs/1905.11946,0.75,10,EfficientNet,https://arxiv.org/abs/1905.11946
Seq2Seq,1/10/2014,Seq2Seq (Ensemble),https://arxiv.org/abs/1409.3215,465,1,AI and Compute,https://openai.com/blog/ai-and-compute/
Seq2Seq,1/12/2017,Transformer (Base),https://arxiv.org/abs/1706.03762,8,61,Attention Is All You Need,https://arxiv.org/abs/1706.03762
GNMT,1/26/2016,GNMT,https://arxiv.org/abs/1609.08144,1620,1,Attention Is All You Need,https://arxiv.org/abs/1706.03762
GNMT,1/12/2017,Transformer (Big),https://arxiv.org/abs/1706.03762,181,9,Attention Is All You Need,https://arxiv.org/abs/1706.03762
--------------------------------------------------------------------------------
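The Reduction Factor column in efficiency_sota.csv is simply the baseline compute for a metric divided by the row's compute. As a rough aid for checking a new row before opening a pull request, here is a minimal sketch of such a check; it assumes the CSV header shown above and treats each metric's Reduction Factor = 1 row as its baseline.

```python
# Sketch of a sanity check for efficiency_sota.csv: recompute each row's
# reduction factor as (baseline compute) / (row compute) within its metric
# and print it next to the stated value. Baselines are the rows listed with
# a Reduction Factor of 1.
import csv

with open("efficiency_sota.csv", newline="") as f:
    rows = list(csv.DictReader(f))

baselines = {
    r["Metric"]: float(r["Compute (teraflop/s-days)"])
    for r in rows
    if float(r["Reduction Factor"]) == 1
}

for r in rows:
    metric = r["Metric"]
    stated = float(r["Reduction Factor"])
    recomputed = baselines[metric] / float(r["Compute (teraflop/s-days)"])
    print(f"{metric:10} {r['Publication']:22} stated {stated:>5g}x, recomputed {recomputed:6.1f}x")
```

Small differences between the stated and recomputed values are expected, since the compute column is rounded; large gaps are worth a second look before submitting.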