# Awesome-model-compression-and-acceleration

A collection of papers on model compression and acceleration that I consider worth reading (and intend to read myself). Please open a PR or an issue if you have suggestions for the list. Thank you.

### Survey

1. [A Survey of Model Compression and Acceleration for Deep Neural Networks](https://arxiv.org/abs/1710.09282) [arXiv '17]
2. [Recent Advances in Efficient Computation of Deep Convolutional Neural Networks](https://arxiv.org/abs/1802.00939) [arXiv '18]
3. [Efficient Deep Learning: A Survey on Making Deep Learning Models Smaller, Faster, and Better](https://arxiv.org/abs/2106.08962) [arXiv '21]

### Model and structure

1. [MobileNetV2: Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation](https://arxiv.org/pdf/1801.04381.pdf) [arXiv '18, Google]
2. [NasNet: Learning Transferable Architectures for Scalable Image Recognition](https://arxiv.org/pdf/1707.07012.pdf) [arXiv '17, Google]
3. [DeepRebirth: Accelerating Deep Neural Network Execution on Mobile Devices](https://arxiv.org/abs/1708.04728) [AAAI '18, Samsung]
4. [ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices](https://arxiv.org/abs/1707.01083) [arXiv '17, Megvii]
5. [MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications](https://arxiv.org/abs/1704.04861) [arXiv '17, Google]
6. [CondenseNet: An Efficient DenseNet using Learned Group Convolutions](https://arxiv.org/abs/1711.09224) [arXiv '17]
7. [Fast YOLO: A Fast You Only Look Once System for Real-time Embedded Object Detection in Video](https://arxiv.org/abs/1709.05943) [arXiv '17]
8. [Shift-based Primitives for Efficient Convolutional Neural Networks](https://arxiv.org/pdf/1809.08458) [WACV '18]

### Quantization

1. [The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning](https://arxiv.org/abs/1611.05402) [ICML '17]
2. [Compressing Deep Convolutional Networks using Vector Quantization](https://arxiv.org/abs/1412.6115) [arXiv '14]
3. [Quantized Convolutional Neural Networks for Mobile Devices](https://arxiv.org/abs/1512.06473) [CVPR '16]
4. [Fixed-Point Performance Analysis of Recurrent Neural Networks](https://arxiv.org/abs/1512.01322) [ICASSP '16]
5. [Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations](https://arxiv.org/abs/1609.07061) [arXiv '16]
6. [Loss-aware Binarization of Deep Networks](https://arxiv.org/abs/1611.01600) [ICLR '17]
7. [Towards the Limit of Network Quantization](https://arxiv.org/abs/1612.01543) [ICLR '17]
8. [Deep Learning with Low Precision by Half-wave Gaussian Quantization](https://arxiv.org/abs/1702.00953) [CVPR '17]
9. [ShiftCNN: Generalized Low-Precision Architecture for Inference of Convolutional Neural Networks](https://arxiv.org/abs/1706.02393) [arXiv '17]
10. [Training and Inference with Integers in Deep Neural Networks](https://openreview.net/forum?id=HJGXzmspb) [ICLR '18]
11. [Deep Learning with Limited Numerical Precision](https://arxiv.org/abs/1502.02551) [ICML '15]

### Pruning

1. [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626) [NIPS '15]
2. [Pruning Filters for Efficient ConvNets](https://arxiv.org/abs/1608.08710) [ICLR '17]
3. [Pruning Convolutional Neural Networks for Resource Efficient Inference](https://arxiv.org/abs/1611.06440) [ICLR '17]
4. [Soft Weight-Sharing for Neural Network Compression](https://arxiv.org/abs/1702.04008) [ICLR '17]
5. [Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding](https://arxiv.org/abs/1510.00149) [ICLR '16]
6. [Dynamic Network Surgery for Efficient DNNs](https://arxiv.org/abs/1608.04493) [NIPS '16]
7. [Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning](https://arxiv.org/abs/1611.05128) [CVPR '17]
8. [ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression](https://arxiv.org/abs/1707.06342) [ICCV '17]
9. [To prune, or not to prune: exploring the efficacy of pruning for model compression](https://arxiv.org/abs/1710.01878) [ICLR '18]
10. [Data-Driven Sparse Structure Selection for Deep Neural Networks](https://arxiv.org/pdf/1707.01213.pdf)
11. [Learning Structured Sparsity in Deep Neural Networks](https://arxiv.org/pdf/1608.03665.pdf)
12. [Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism](http://www-personal.umich.edu/~jiecaoyu/papers/jiecaoyu-isca17.pdf) [ISCA '17]
13. [Learning to Prune: Exploring the Frontier of Fast and Accurate Parsing](http://www.cs.jhu.edu/~jason/papers/vieira+eisner.tacl17.pdf) [TACL '17]
14. [Channel Pruning for Accelerating Very Deep Neural Networks](http://openaccess.thecvf.com/content_ICCV_2017/papers/He_Channel_Pruning_for_ICCV_2017_paper.pdf) [ICCV '17]
15. [AMC: AutoML for Model Compression and Acceleration on Mobile Devices](http://openaccess.thecvf.com/content_ECCV_2018/papers/Yihui_He_AMC_Automated_Model_ECCV_2018_paper.pdf) [ECCV '18]
16. [RePr: Improved Training of Convolutional Filters](https://arxiv.org/pdf/1811.07275.pdf) [arXiv '18]

### Binarized neural network

1. [Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1](https://arxiv.org/pdf/1602.02830.pdf)
2. [XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks](https://arxiv.org/pdf/1603.05279.pdf)
3. [Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration](https://arxiv.org/pdf/1707.04693.pdf)

### Low-rank Approximation

1. [Efficient and Accurate Approximations of Nonlinear Convolutional Networks](https://arxiv.org/abs/1411.4229) [CVPR '15]
2. [Accelerating Very Deep Convolutional Networks for Classification and Detection](https://arxiv.org/abs/1505.06798) (extended version of the above)
3. [Convolutional neural networks with low-rank regularization](https://arxiv.org/abs/1511.06067) [arXiv '15]
4. [Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation](https://arxiv.org/abs/1404.0736) [NIPS '14]
5. [Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications](https://arxiv.org/abs/1511.06530) [ICLR '16]
6. [High performance ultra-low-precision convolutions on mobile devices](https://arxiv.org/abs/1712.02427) [NIPS '17]
7. [Speeding up convolutional neural networks with low rank expansions](http://www.robots.ox.ac.uk/~vgg/publications/2014/Jaderberg14b/jaderberg14b.pdf) [BMVC '14]
8. [Tensor Yard: One-Shot Algorithm of Hardware-Friendly Tensor-Train Decomposition for Convolutional Neural Networks](https://arxiv.org/abs/2108.04029) [arXiv '21]

### Distilling

1. [Dark knowledge](http://www.ttic.edu/dl/dark14.pdf)
2. [FitNets: Hints for Thin Deep Nets](https://arxiv.org/pdf/1412.6550.pdf)
3. Net2Net: Accelerating Learning via Knowledge Transfer
4. [Distilling the Knowledge in a Neural Network](https://arxiv.org/abs/1503.02531)
5. [MobileID: Face Model Compression by Distilling Knowledge from Neurons](https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/11977)
6. [DarkRank: Accelerating Deep Metric Learning via Cross Sample Similarities Transfer](https://arxiv.org/pdf/1707.01220.pdf)
7. [Deep Model Compression: Distilling Knowledge from Noisy Teachers](https://arxiv.org/pdf/1610.09650.pdf)
8. [Paying More Attention to Attention: Improving the Performance of Convolutional Neural Networks via Attention Transfer](https://arxiv.org/pdf/1612.03928.pdf)
9. [Sequence-Level Knowledge Distillation](https://arxiv.org/pdf/1606.07947.pdf)
10. [Like What You Like: Knowledge Distill via Neuron Selectivity Transfer](https://arxiv.org/pdf/1707.01219.pdf)
11. [Learning Efficient Object Detection Models with Knowledge Distillation](http://papers.nips.cc/paper/6676-learning-efficient-object-detection-models-with-knowledge-distillation.pdf)
12. [Data-Free Knowledge Distillation For Deep Neural Networks](https://arxiv.org/pdf/1710.07535.pdf)
13. [Learning Loss for Knowledge Distillation with Conditional Adversarial Networks](https://arxiv.org/pdf/1709.00513.pdf)
14. [Knowledge Projection for Effective Design of Thinner and Faster Deep Neural Networks](https://arxiv.org/pdf/1710.09505.pdf)
15. [Moonshine: Distilling with Cheap Convolutions](https://arxiv.org/pdf/1711.02613.pdf)
16. [Model Distillation with Knowledge Transfer from Face Classification to Alignment and Verification](https://arxiv.org/pdf/1709.02929.pdf)

### System

1. [DeepMon: Mobile GPU-based Deep Learning Framework for Continuous Vision Applications](https://www.sigmobile.org/mobisys/2017/accepted.php) [MobiSys '17]
2. [DeepEye: Resource Efficient Local Execution of Multiple Deep Vision Models using Wearable Commodity Hardware](http://fahim-kawsar.net/papers/Mathur.MobiSys2017-Camera.pdf) [MobiSys '17]
3. [MobiRNN: Efficient Recurrent Neural Network Execution on Mobile GPU](https://arxiv.org/abs/1706.00878) [EMDL '17]
4. [DeepSense: A GPU-based deep convolutional neural network framework on commodity mobile devices](http://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=4278&context=sis_research) [WearSys '16]
5. [DeepX: A Software Accelerator for Low-Power Deep Learning Inference on Mobile Devices](http://niclane.org/pubs/deepx_ipsn.pdf) [IPSN '16]
6. [EIE: Efficient Inference Engine on Compressed Deep Neural Network](https://arxiv.org/abs/1602.01528) [ISCA '16]
7. [MCDNN: An Approximation-Based Execution Framework for Deep Stream Processing Under Resource Constraints](http://haneul.github.io/papers/mcdnn.pdf) [MobiSys '16]
8. [DXTK: Enabling Resource-efficient Deep Learning on Mobile and Embedded Devices with the DeepX Toolkit](http://niclane.org/pubs/dxtk_mobicase.pdf) [MobiCASE '16]
9. [Sparsification and Separation of Deep Learning Layers for Constrained Resource Inference on Wearables](http://niclane.org/pubs/sparsesep_sensys.pdf) [SenSys '16]
10. [An Early Resource Characterization of Deep Learning on Wearables, Smartphones and Internet-of-Things Devices](http://niclane.org/pubs/iotapp15_early.pdf) [IoT-App '15]
11. [CNNdroid: GPU-Accelerated Execution of Trained Deep Convolutional Neural Networks on Android](https://arxiv.org/abs/1511.07376) [MM '16]
12. [fpgaConvNet: A Toolflow for Mapping Diverse Convolutional Neural Networks on Embedded FPGAs](https://arxiv.org/abs/1711.08740) [NIPS '17]

### Some optimization techniques

1. Eliminate redundant computation
2. Unroll loops
3. Use SIMD instructions
4. Parallelize with OpenMP
5. Convert floating-point computation to fixed point
6. Avoid non-contiguous memory reads and writes

A small code sketch illustrating several of these techniques follows the references below.

### References

- [Reading List](http://slazebni.cs.illinois.edu/spring17/reading_lists.html)
- [Reading List 2](https://github.com/jiecaoyu/reading_list)
- [Reading List 3](http://slazebni.cs.illinois.edu/spring17/cs598_topics.pdf)
- [Reading List 4](https://github.com/csarron/emdl)
- [Reading List 5](https://github.com/sun254/awesome-model-compression-and-acceleration)
- [An Overview of Lightweight CNNs: SqueezeNet, MobileNet, ShuffleNet, Xception](https://www.jiqizhixin.com/articles/2018-01-08-6) (in Chinese)
- [An Introduction to different Types of Convolutions in Deep Learning](https://towardsdatascience.com/types-of-convolutions-in-deep-learning-717013397f4d)
- [A Roundup of the Many Kinds of Convolutions Used in CNNs](https://zhuanlan.zhihu.com/p/29367273) (in Chinese)
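
### Example: applying a few of the optimization techniques

A minimal, self-contained C sketch of my own (not taken from any of the papers above; names, sizes, and values are arbitrary) that applies several items from the optimization list to a small matrix-vector product: hoisting redundant index computation out of the inner loop, 4x loop unrolling (which also gives the compiler an easy target for SIMD auto-vectorization), OpenMP parallelism across output rows, Q8.8 fixed-point arithmetic, and contiguous row-major access.

```c
/*
 * Sketch of a few optimization techniques from the list above:
 *   - hoist redundant computation (row pointer computed once per row)
 *   - 4x manual loop unrolling (also helps the compiler emit SIMD)
 *   - OpenMP parallelism across output rows
 *   - Q8.8 fixed-point weights and activations
 *   - row-major, contiguous reads in the inner loop
 *
 * Build: gcc -O3 -fopenmp fixed_point_mv.c -o fixed_point_mv
 */
#include <stdint.h>
#include <stdio.h>

#define ROWS 256
#define COLS 256
#define FRAC_BITS 8                          /* Q8.8 fixed-point format */

static int16_t to_fixed(float x)   { return (int16_t)(x * (1 << FRAC_BITS)); }
static float   to_float(int32_t x) { return (float)x / (1 << FRAC_BITS); }

/* y = W * x, with W stored row-major so the inner loop reads contiguously. */
static void matvec_fixed(const int16_t *W, const int16_t *x, float *y)
{
    #pragma omp parallel for                 /* one output row per iteration */
    for (int r = 0; r < ROWS; ++r) {
        const int16_t *row = &W[r * COLS];   /* hoisted: computed once per row */
        int32_t acc = 0;                     /* enough headroom for these values */
        int c = 0;
        for (; c + 4 <= COLS; c += 4) {      /* 4x unrolled inner loop */
            acc += (int32_t)row[c + 0] * x[c + 0];
            acc += (int32_t)row[c + 1] * x[c + 1];
            acc += (int32_t)row[c + 2] * x[c + 2];
            acc += (int32_t)row[c + 3] * x[c + 3];
        }
        for (; c < COLS; ++c)                /* remainder loop */
            acc += (int32_t)row[c] * x[c];
        y[r] = to_float(acc >> FRAC_BITS);   /* Q16.16 product sum back to Q8.8 */
    }
}

int main(void)
{
    static int16_t W[ROWS * COLS];
    static int16_t x[COLS];
    static float   y[ROWS];

    for (int i = 0; i < ROWS * COLS; ++i) W[i] = to_fixed(0.25f);
    for (int i = 0; i < COLS; ++i)        x[i] = to_fixed(0.5f);

    matvec_fixed(W, x, y);
    printf("y[0] = %f (expected %f)\n", y[0], 0.25f * 0.5f * COLS);
    return 0;
}
```

Without `-fopenmp` the pragma is simply ignored and the loop runs single-threaded. For real networks the accumulator width and the quantization scale have to be chosen per layer, which is exactly what several of the quantization papers above address.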