# Model Compression and Acceleration Progress
Repository to track the progress in model compression and acceleration.

## Low-rank approximation
- T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor (CVPR 2019)
  [paper](http://openaccess.thecvf.com/content_CVPR_2019/papers/Kossaifi_T-Net_Parametrizing_Fully_Convolutional_Nets_With_a_Single_High-Order_Tensor_CVPR_2019_paper.pdf)
- MUSCO: Multi-Stage COmpression of neural networks (ICCVW 2019)
  [paper](http://openaccess.thecvf.com/content_ICCVW_2019/papers/LPCV/Gusak_Automated_Multi-Stage_Compression_of_Neural_Networks_ICCVW_2019_paper.pdf) | [code (PyTorch)](https://github.com/juliagusak/musco)
- Efficient Neural Network Compression (CVPR 2019)
  [paper](https://arxiv.org/abs/1811.12781) | [code (Caffe)](https://github.com/Hyeji-Kim/ENC)
- Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling (ICLR 2019)
  [paper](https://openreview.net/pdf?id=B1eHgu-Fim) | [code (PyTorch)](https://github.com/zuenko/ALRF)
- Extreme Network Compression via Filter Group Approximation (ECCV 2018)
  [paper](https://arxiv.org/abs/1807.11254)
- Ultimate tensorization: compressing convolutional and FC layers alike (NIPS 2016 workshop)
  [paper](https://arxiv.org/abs/1611.03214) | [code (TensorFlow)](https://github.com/timgaripov/TensorNet-TF) | [code (MATLAB, Theano + Lasagne)](https://github.com/Bihaqo/TensorNet)
- Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications (ICLR 2016)
  [paper](https://arxiv.org/abs/1511.06530)
- Accelerating Very Deep Convolutional Networks for Classification and Detection (IEEE TPAMI 2016)
  [paper](https://arxiv.org/abs/1505.06798)
- Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition (ICLR 2015)
  [paper](https://arxiv.org/abs/1412.6553) | [code (Caffe)](https://github.com/vadim-v-lebedev/cp-decomposition)
- Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (NIPS 2014)
  [paper](https://arxiv.org/abs/1404.0736)
- Speeding up Convolutional Neural Networks with Low Rank Expansions (BMVC 2014)
  [paper](https://arxiv.org/abs/1405.3866)
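
All of these methods share one idea: replace a large weight tensor with a product of much smaller factors. As a minimal PyTorch sketch of the simplest case, the hypothetical helper below factorizes a fully connected layer with a truncated SVD; the papers above go further, with CP, Tucker, and tensor-train decompositions of convolutions followed by fine-tuning.

```python
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace an (out x in) linear layer by two thin layers with
    rank * (in + out) parameters, using a truncated SVD of the weight."""
    U, S, Vh = torch.linalg.svd(layer.weight.data, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]  # keep top-`rank` components

    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = torch.diag(S) @ Vh          # (rank, in_features)
    second.weight.data = U                          # (out_features, rank)
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)

# A rank-64 factorization of a 512x512 layer cuts its parameters ~4x;
# accuracy is typically recovered by a short fine-tuning stage.
layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)
x = torch.randn(8, 512)
print(torch.dist(layer(x), compressed(x)))          # small approximation error
```
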
## Pruning & Sparsification
#### Papers
- Rethinking the Value of Network Pruning (ICLR 2019, NIPS 2018 workshop)
  [paper](https://arxiv.org/abs/1810.05270) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning)
- Dynamic Channel Pruning: Feature Boosting and Suppression (ICLR 2019)
  [paper](https://arxiv.org/abs/1810.05331) | [code](https://github.com/deep-fry/mayo)
- AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference (2019)
  [paper](https://arxiv.org/abs/1805.08941)
- CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization (CVPR 2018)
  [paper](http://www.sfu.ca/~ftung/papers/clipq_cvpr18.pdf)
- Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks (IJCAI 2018)
  [paper](https://arxiv.org/abs/1808.06866) | [code and models (PyTorch)](https://github.com/he-y/soft-filter-pruning)
- Discrimination-aware Channel Pruning for Deep Neural Networks (NIPS 2018)
  [paper](https://papers.nips.cc/paper/7367-discrimination-aware-channel-pruning-for-deep-neural-networks.pdf) | [code and pretrained models (PyTorch)](https://github.com/SCUT-AILab/DCP)
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV 2018)
  [paper](https://arxiv.org/abs/1802.03494) | [code (PyTorch)](https://github.com/mit-han-lab/amc-release) | [pretrained models (PyTorch, TensorFlow, TensorFlow Lite)](https://github.com/mit-han-lab/amc-compressed-models)
- Channel Gating Neural Networks (2018)
  [paper](https://arxiv.org/abs/1805.12549)
- DSD: Dense-Sparse-Dense Training for Deep Neural Networks (ICLR 2017)
  [paper](https://arxiv.org/abs/1607.04381) | [pretrained models (Caffe)](https://songhan.github.io/DSD/)
- Channel Pruning for Accelerating Very Deep Neural Networks (ICCV 2017)
  [paper](https://arxiv.org/abs/1707.06168) | [code and pretrained models (Caffe)](https://github.com/yihui-he/channel-pruning) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)
- Learning Efficient Convolutional Networks through Network Slimming (ICCV 2017)
  [paper](https://arxiv.org/abs/1708.06519) | [code (Torch, PyTorch)](https://github.com/Eric-mingjie/network-slimming)
- ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression (ICCV 2017)
  [paper](https://arxiv.org/abs/1707.06342) | [pretrained model (Caffe)](https://github.com/Roll920/ThiNet) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)
- Structured Bayesian Pruning via Log-Normal Multiplicative Noise (NIPS 2017)
  [paper](https://papers.nips.cc/paper/7254-structured-bayesian-pruning-via-log-normal-multiplicative-noise.pdf) | [code (TensorFlow, Theano + Lasagne)](https://github.com/necludov/group-sparsity-sbp)
- SphereFace: Deep Hypersphere Embedding for Face Recognition (CVPR 2017)
  [paper](https://arxiv.org/abs/1704.08063) | [code and pretrained models (Caffe)](https://github.com/isthatyoung/Sphereface-prune)
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (ICLR 2016)
  [paper](https://arxiv.org/abs/1510.00149)
- Fast ConvNets Using Group-wise Brain Damage (CVPR 2016)
  [paper](http://openaccess.thecvf.com/content_cvpr_2016/papers/Lebedev_Fast_ConvNets_Using_CVPR_2016_paper.pdf)

#### Repos
- Pruning + quantization: [code and pretrained models (TensorFlow, TensorFlow Lite)](https://github.com/vikranth94/Model-Compression), with examples for CIFAR.
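
Structured-pruning pipelines in the papers above share a common skeleton: score each filter or channel, remove the weakest, re-wire the surrounding layers, and fine-tune. The sketch below uses a plain L1-norm score; `prune_conv_channels` is an illustrative helper, and the listed methods substitute stronger criteria (LASSO channel selection, discrimination-aware losses, or an RL-learned policy as in AMC).

```python
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the output channels whose filters have the largest L1 norms."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # One score per filter: sum of |w| over (in_channels, kH, kW).
    scores = conv.weight.data.abs().sum(dim=(1, 2, 3))
    keep = torch.argsort(scores, descending=True)[:n_keep].sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    # Note: the next layer must also be sliced to accept n_keep input channels.
    return pruned

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
print(prune_conv_channels(conv, keep_ratio=0.25))   # Conv2d(64, 32, ...)
```
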
## Knowledge distillation
#### Papers
- Learning Efficient Detector with Semi-supervised Adaptive Distillation (arXiv 2019)
  [paper](https://arxiv.org/abs/1901.00366) | [code (Caffe)](https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation)
- Model compression via distillation and quantization (ICLR 2018)
  [paper](https://arxiv.org/abs/1802.05668) | [code (PyTorch)](https://github.com/antspy/quantized_distillation)
- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (ICLR 2018 workshop, BMVC 2018)
  [paper](https://arxiv.org/abs/1709.00513)
- Net2Net: Accelerating Learning via Knowledge Transfer (ICLR 2016)
  [paper](https://arxiv.org/abs/1511.05641)
- Distilling the Knowledge in a Neural Network (NIPS 2014 workshop)
  [paper](https://arxiv.org/abs/1503.02531)
- FitNets: Hints for Thin Deep Nets (ICLR 2015)
  [paper](https://arxiv.org/abs/1412.6550) | [code (Theano + Pylearn2)](https://github.com/adri-romsor/FitNets)

#### Repos
- TensorFlow implementation of three distillation papers, with results for CIFAR-10: https://github.com/chengshengchan/model_compression
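
The core objective from "Distilling the Knowledge in a Neural Network" (listed above) fits in a few lines: match the student's temperature-softened predictions to the teacher's, blended with the usual hard-label loss. The values of `T` and `alpha` below are illustrative, not prescriptions from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.9):
    """KL divergence between temperature-softened teacher and student
    distributions, mixed with cross-entropy on the ground-truth labels."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T  # T^2 restores gradient scale
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(32, 10, requires_grad=True)  # student outputs
teacher_logits = torch.randn(32, 10)                      # frozen teacher outputs
labels = torch.randint(0, 10, (32,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```
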
## Quantization
- Bayesian Bits: Unifying Quantization and Pruning (2020)
  [paper](https://arxiv.org/abs/2005.07093)
- Up or Down? Adaptive Rounding for Post-Training Quantization (2020)
  [paper](https://arxiv.org/abs/2004.10568)
- Gradient $\ell_1$ Regularization for Quantization Robustness (ICLR 2020)
  [paper](https://arxiv.org/abs/2002.07520)
- Training Binary Neural Networks with Real-to-Binary Convolutions (ICLR 2020)
  [paper](https://arxiv.org/abs/2003.11535) | [code (coming soon)](https://github.com/brais-martinez/real2binary)
- Data-Free Quantization Through Weight Equalization and Bias Correction (ICCV 2019)
  [paper](https://arxiv.org/abs/1906.04721) | [code (PyTorch)](https://github.com/jakc4103/DFQ)
- XNOR-Net++ (2019)
  [paper](https://arxiv.org/abs/1909.13863)
- Matrix and tensor decompositions for training binary neural networks (2019)
  [paper](https://arxiv.org/pdf/1904.07852.pdf)
- XNOR-Net (ECCV 2016)
  [paper](https://arxiv.org/abs/1603.05279) | [code (PyTorch)](https://github.com/jiecaoyu/XNOR-Net-PyTorch)
- Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks (2019)
  [paper](https://arxiv.org/abs/1903.08066) | [code (TensorFlow)](https://github.com/Xilinx/graffitist)
- Relaxed Quantization for Discretized Neural Networks (ICLR 2019)
  [paper](https://arxiv.org/abs/1810.01875)
- Training and Inference with Integers in Deep Neural Networks (ICLR 2018)
  [paper](https://arxiv.org/abs/1802.04680) | [code (TensorFlow)](https://github.com/boluoweifenda/WAGE)
- Training Quantized Nets: A Deeper Understanding (NIPS 2017)
  [paper](https://arxiv.org/abs/1706.02379)
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (CVPR 2018)
  [paper](https://arxiv.org/abs/1712.05877)
- Deep Learning with Limited Numerical Precision (2015)
  [paper](https://arxiv.org/abs/1502.02551)
- Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013)
  [paper](https://arxiv.org/abs/1308.3432)
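
Most of the fixed-point papers above build on uniform affine quantization: floats are mapped onto an integer grid via a scale and zero-point, and the rounding error is what the training-time methods learn to tolerate. Below is a minimal per-tensor sketch, assuming simple min/max calibration; `quantize_affine` is an illustrative helper, and production code calibrates per channel and runs the arithmetic in integer kernels.

```python
import torch

def quantize_affine(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Quantize to the integer grid [0, 2^b - 1] with a scale and
    zero-point, then dequantize ("fake quantization")."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = (qmin - x.min() / scale).round().clamp(qmin, qmax)
    q = (x / scale + zero_point).round().clamp(qmin, qmax)  # integer values
    return (q - zero_point) * scale                         # back to float

x = torch.randn(1000)
x_q = quantize_affine(x, num_bits=8)
print((x - x_q).abs().max())  # worst-case error is about scale / 2
```
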
## Architecture search
- MobileNets
  - Searching for MobileNetV3 (ICCV 2019)
    [paper](https://arxiv.org/abs/1905.02244)
  - MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)
    [paper](https://arxiv.org/abs/1801.04381) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet)
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)
  [paper](https://arxiv.org/abs/1905.11946) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet)
- MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019)
  [paper](https://arxiv.org/abs/1807.11626) | [code (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet)
- MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure (CVPR 2018)
  [paper](https://arxiv.org/abs/1711.06798) | [code (TensorFlow)](https://github.com/google-research/morph-net)
- ShuffleNets
  - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (ECCV 2018)
    [paper](https://arxiv.org/abs/1807.11164)
  - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (CVPR 2018)
    [paper](https://arxiv.org/abs/1707.01083)
- Multi-Fiber Networks for Video Recognition (ECCV 2018)
  [paper](https://arxiv.org/abs/1807.11195) | [code (PyTorch)](https://github.com/cypw/PyTorch-MFNet)
- IGCVs
  - IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks (BMVC 2018)
    [paper](https://arxiv.org/abs/1806.00178) | [code and pretrained models (MXNet)](https://github.com/homles11/IGCV3)
  - IGCV2: Interleaved Structured Sparse Convolutional Neural Networks (CVPR 2018)
    [paper](https://arxiv.org/abs/1804.06202)
  - Interleaved Group Convolutions for Deep Neural Networks (ICCV 2017)
    [paper](https://arxiv.org/abs/1707.02725)


## PhD theses and overviews
- Quantizing deep convolutional networks for efficient inference: A whitepaper (2018) [paper](https://arxiv.org/abs/1806.08342)
- Algorithms for speeding up convolutional neural networks (2018) [thesis](https://www.skoltech.ru/app/data/uploads/2018/10/Thesis-Final.pdf)
- Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges (2018) [paper](http://cwww.ee.nctu.edu.tw/~cfung/docs/learning/cheng2018DNN_model_compression_accel.pdf)
- Efficient methods and hardware for deep learning (2017) [thesis](https://stacks.stanford.edu/file/druid:qf934gh3708/EFFICIENT%20METHODS%20AND%20HARDWARE%20FOR%20DEEP%20LEARNING-augmented.pdf)


## Frameworks
- [MUSCO](https://github.com/musco-ai) - framework for model compression using tensor decompositions (PyTorch, TensorFlow)
- [AIMET](https://github.com/quic/aimet) - AI Model Efficiency Toolkit (PyTorch, TensorFlow)
- [Distiller](https://github.com/NervanaSystems/distiller) - package for compression using pruning and low-precision arithmetic (PyTorch)
- [MorphNet](https://github.com/google-research/morph-net) - framework for neural network architecture learning (TensorFlow)
- [Mayo](https://github.com/deep-fry/mayo) - deep learning framework with fine- and coarse-grained pruning, network slimming, and quantization methods
- [PocketFlow](https://github.com/Tencent/PocketFlow) - framework for model pruning, sparsification, and quantization (TensorFlow)
- [Keras compressor](https://github.com/DwangoMediaVillage/keras_compressor) - compression using low-rank approximations: SVD for matrices, Tucker decomposition for tensors
- [Caffe compressor](https://github.com/yuanyuanli85/CaffeModelCompression) - K-means-based quantization
- [gemmlowp](https://github.com/google/gemmlowp/blob/master/doc/quantization.md#implementation-of-quantized-matrix-multiplication) - building a quantization paradigm from first principles (C++)
- [NNI](https://github.com/microsoft/nni) - framework for feature engineering, NAS, hyperparameter tuning, and model compression


## Comparison of different approaches

Please see `comparative_results.pdf`.


## Similar repos
- https://github.com/ZhishengWang/Embedded-Neural-Network
- https://github.com/memoiry/Awesome-model-compression-and-acceleration
- https://github.com/sun254/awesome-model-compression-and-acceleration
- https://github.com/guan-yuan/awesome-AutoML-and-Lightweight-Models
- https://github.com/chester256/Model-Compression-Papers
- https://github.com/mapleam/model-compression-and-acceleration-4-DNN
- https://github.com/cedrickchee/awesome-ml-model-compression
- https://github.com/jnjaby/Model-Compression-Acceleration
- https://github.com/he-y/Awesome-Pruning