# Model Compression and Acceleration Progress
Repository to track the progress in model compression and acceleration.

## Low-rank approximation
- T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor (CVPR 2019)
  [paper](http://openaccess.thecvf.com/content_CVPR_2019/papers/Kossaifi_T-Net_Parametrizing_Fully_Convolutional_Nets_With_a_Single_High-Order_Tensor_CVPR_2019_paper.pdf)
- MUSCO: Multi-Stage COmpression of neural networks (ICCVW 2019)
  [paper](http://openaccess.thecvf.com/content_ICCVW_2019/papers/LPCV/Gusak_Automated_Multi-Stage_Compression_of_Neural_Networks_ICCVW_2019_paper.pdf) | [code (PyTorch)](https://github.com/juliagusak/musco)
- Efficient Neural Network Compression (CVPR 2019)
  [paper](https://arxiv.org/abs/1811.12781) | [code (Caffe)](https://github.com/Hyeji-Kim/ENC)
- Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling (ICLR 2019)
  [paper](https://openreview.net/pdf?id=B1eHgu-Fim) | [code (PyTorch)](https://github.com/zuenko/ALRF)
- Extreme Network Compression via Filter Group Approximation (ECCV 2018)
  [paper](https://arxiv.org/abs/1807.11254)
- Ultimate tensorization: compressing convolutional and FC layers alike (NIPS 2016 workshop)
  [paper](https://arxiv.org/abs/1611.03214) | [code (TensorFlow)](https://github.com/timgaripov/TensorNet-TF) | [code (MATLAB, Theano + Lasagne)](https://github.com/Bihaqo/TensorNet)
- Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications (ICLR 2016)
  [paper](https://arxiv.org/abs/1511.06530)
- Accelerating Very Deep Convolutional Networks for Classification and Detection (IEEE TPAMI 2016)
  [paper](https://arxiv.org/abs/1505.06798)
- Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition (ICLR 2015)
  [paper](https://arxiv.org/abs/1412.6553) | [code (Caffe)](https://github.com/vadim-v-lebedev/cp-decomposition)
- Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (NIPS 2014)
  [paper](https://arxiv.org/abs/1404.0736)
- Speeding up Convolutional Neural Networks with Low Rank Expansions (BMVC 2014)
  [paper](https://arxiv.org/abs/1405.3866)
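
All of these methods share one idea: replace a large weight tensor with a product of much smaller factors. As a minimal PyTorch sketch of the simplest case, the hypothetical helper below factorizes a fully connected layer with a truncated SVD; the papers above go further, with CP, Tucker, and tensor-train decompositions of convolutions followed by fine-tuning.

```python
import torch
import torch.nn as nn

def factorize_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    """Replace an (out x in) linear layer by two thin layers with
    rank * (in + out) parameters, using a truncated SVD of the weight."""
    U, S, Vh = torch.linalg.svd(layer.weight.data, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]  # keep top-`rank` components

    first = nn.Linear(layer.in_features, rank, bias=False)
    second = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    first.weight.data = torch.diag(S) @ Vh          # (rank, in_features)
    second.weight.data = U                          # (out_features, rank)
    if layer.bias is not None:
        second.bias.data = layer.bias.data.clone()
    return nn.Sequential(first, second)

# A rank-64 factorization of a 512x512 layer cuts its parameters ~4x;
# accuracy is typically recovered by a short fine-tuning stage.
layer = nn.Linear(512, 512)
compressed = factorize_linear(layer, rank=64)
x = torch.randn(8, 512)
print(torch.dist(layer(x), compressed(x)))          # small approximation error
```
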
## Pruning & Sparsification
#### Papers
- Rethinking the Value of Network Pruning (ICLR 2019, NIPS 2018 workshop)
  [paper](https://arxiv.org/abs/1810.05270) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning)
- Dynamic Channel Pruning: Feature Boosting and Suppression (ICLR 2019)
  [paper](https://arxiv.org/abs/1810.05331) | [code](https://github.com/deep-fry/mayo)
- AutoPruner: An End-to-End Trainable Filter Pruning Method for Efficient Deep Model Inference (2019)
  [paper](https://arxiv.org/abs/1805.08941)
- CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization (CVPR 2018)
  [paper](http://www.sfu.ca/~ftung/papers/clipq_cvpr18.pdf)
- Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks (IJCAI 2018)
  [paper](https://arxiv.org/abs/1808.06866) | [code and models (PyTorch)](https://github.com/he-y/soft-filter-pruning)
- Discrimination-aware Channel Pruning for Deep Neural Networks (NIPS 2018)
  [paper](https://papers.nips.cc/paper/7367-discrimination-aware-channel-pruning-for-deep-neural-networks.pdf) | [code and pretrained models (PyTorch)](https://github.com/SCUT-AILab/DCP)
- AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV 2018)
  [paper](https://arxiv.org/abs/1802.03494) | [code (PyTorch)](https://github.com/mit-han-lab/amc-release) | [pretrained models (PyTorch, TensorFlow, TensorFlow Lite)](https://github.com/mit-han-lab/amc-compressed-models)
- Channel Gating Neural Networks (2018)
  [paper](https://arxiv.org/abs/1805.12549)
- DSD: Dense-Sparse-Dense Training for Deep Neural Networks (ICLR 2017)
  [paper](https://arxiv.org/abs/1607.04381) | [pretrained models (Caffe)](https://songhan.github.io/DSD/)
- Channel Pruning for Accelerating Very Deep Neural Networks (ICCV 2017)
  [paper](https://arxiv.org/abs/1707.06168) | [code and pretrained models (Caffe)](https://github.com/yihui-he/channel-pruning) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)
- Learning Efficient Convolutional Networks through Network Slimming (ICCV 2017)
  [paper](https://arxiv.org/abs/1708.06519) | [code (Torch, PyTorch)](https://github.com/Eric-mingjie/network-slimming)
- ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression (ICCV 2017)
  [paper](https://arxiv.org/abs/1707.06342) | [pretrained model (Caffe)](https://github.com/Roll920/ThiNet) | [code (PyTorch)](https://github.com/Eric-mingjie/rethinking-network-pruning/tree/master/imagenet)
- Structured Bayesian Pruning via Log-Normal Multiplicative Noise (NIPS 2017)
  [paper](https://papers.nips.cc/paper/7254-structured-bayesian-pruning-via-log-normal-multiplicative-noise.pdf) | [code (TensorFlow, Theano + Lasagne)](https://github.com/necludov/group-sparsity-sbp)
- SphereFace: Deep Hypersphere Embedding for Face Recognition (CVPR 2017)
  [paper](https://arxiv.org/abs/1704.08063) | [code and pretrained models (Caffe)](https://github.com/isthatyoung/Sphereface-prune)
- Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (ICLR 2016)
  [paper](https://arxiv.org/abs/1510.00149)
- Fast ConvNets Using Group-wise Brain Damage (CVPR 2016)
  [paper](http://openaccess.thecvf.com/content_cvpr_2016/papers/Lebedev_Fast_ConvNets_Using_CVPR_2016_paper.pdf)

#### Repos
- Pruning + quantization: [code and pretrained models (TensorFlow, TensorFlow Lite)](https://github.com/vikranth94/Model-Compression), with examples for CIFAR.
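
Structured-pruning pipelines in the papers above share a common skeleton: score each filter or channel, remove the weakest, re-wire the surrounding layers, and fine-tune. The sketch below uses a plain L1-norm score; `prune_conv_channels` is an illustrative helper, and the listed methods substitute stronger criteria (LASSO channel selection, discrimination-aware losses, or an RL-learned policy as in AMC).

```python
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the output channels whose filters have the largest L1 norms."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # One score per filter: sum of |w| over (in_channels, kH, kW).
    scores = conv.weight.data.abs().sum(dim=(1, 2, 3))
    keep = torch.argsort(scores, descending=True)[:n_keep].sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    # Note: the next layer must also be sliced to accept n_keep input channels.
    return pruned

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)
print(prune_conv_channels(conv, keep_ratio=0.25))   # Conv2d(64, 32, ...)
```
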
## Knowledge distillation
#### Papers
- Learning Efficient Detector with Semi-supervised Adaptive Distillation (arXiv 2019)
  [paper](https://arxiv.org/abs/1901.00366) | [code (Caffe)](https://github.com/Tangshitao/Semi-supervised-Adaptive-Distillation)
- Model compression via distillation and quantization (ICLR 2018)
  [paper](https://arxiv.org/abs/1802.05668) | [code (PyTorch)](https://github.com/antspy/quantized_distillation)
- Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (ICLR 2018 workshop, BMVC 2018)
  [paper](https://arxiv.org/abs/1709.00513)
- Net2Net: Accelerating Learning via Knowledge Transfer (ICLR 2016)
  [paper](https://arxiv.org/abs/1511.05641)
- Distilling the Knowledge in a Neural Network (NIPS 2014 workshop)
  [paper](https://arxiv.org/abs/1503.02531)
- FitNets: Hints for Thin Deep Nets (ICLR 2015)
  [paper](https://arxiv.org/abs/1412.6550) | [code (Theano + Pylearn2)](https://github.com/adri-romsor/FitNets)

#### Repos
- TensorFlow implementation of three distillation papers, with results for CIFAR-10: https://github.com/chengshengchan/model_compression
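
The core objective from "Distilling the Knowledge in a Neural Network" (listed above) fits in a few lines: match the student's temperature-softened predictions to the teacher's, blended with the usual hard-label loss. The values of `T` and `alpha` below are illustrative, not prescriptions from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      T: float = 4.0, alpha: float = 0.9):
    """KL divergence between temperature-softened teacher and student
    distributions, mixed with cross-entropy on the ground-truth labels."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T  # T^2 restores gradient scale
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student_logits = torch.randn(32, 10, requires_grad=True)  # student outputs
teacher_logits = torch.randn(32, 10)                      # frozen teacher outputs
labels = torch.randint(0, 10, (32,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```
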
## Quantization
- Bayesian Bits: Unifying Quantization and Pruning (2020)
  [paper](https://arxiv.org/abs/2005.07093)
- Up or Down? Adaptive Rounding for Post-Training Quantization (2020)
  [paper](https://arxiv.org/abs/2004.10568)
- Gradient $\ell_1$ Regularization for Quantization Robustness (ICLR 2020)
  [paper](https://arxiv.org/abs/2002.07520)
- Training Binary Neural Networks with Real-to-Binary Convolutions (ICLR 2020)
  [paper](https://arxiv.org/abs/2003.11535) | [code (coming soon)](https://github.com/brais-martinez/real2binary)
- Data-Free Quantization Through Weight Equalization and Bias Correction (ICCV 2019)
  [paper](https://arxiv.org/abs/1906.04721) | [code (PyTorch)](https://github.com/jakc4103/DFQ)
- XNOR-Net++ (2019)
  [paper](https://arxiv.org/abs/1909.13863)
- Matrix and tensor decompositions for training binary neural networks (2019)
  [paper](https://arxiv.org/pdf/1904.07852.pdf)
- XNOR-Net (ECCV 2016)
  [paper](https://arxiv.org/abs/1603.05279) | [code (PyTorch)](https://github.com/jiecaoyu/XNOR-Net-PyTorch)
- Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks (2019)
  [paper](https://arxiv.org/abs/1903.08066) | [code (TensorFlow)](https://github.com/Xilinx/graffitist)
- Relaxed Quantization for Discretized Neural Networks (ICLR 2019)
  [paper](https://arxiv.org/abs/1810.01875)
- Training and Inference with Integers in Deep Neural Networks (ICLR 2018)
  [paper](https://arxiv.org/abs/1802.04680) | [code (TensorFlow)](https://github.com/boluoweifenda/WAGE)
- Training Quantized Nets: A Deeper Understanding (NIPS 2017)
  [paper](https://arxiv.org/abs/1706.02379)
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (CVPR 2018)
  [paper](https://arxiv.org/abs/1712.05877)
- Deep Learning with Limited Numerical Precision (2015)
  [paper](https://arxiv.org/abs/1502.02551)
- Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013)
  [paper](https://arxiv.org/abs/1308.3432)
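
Most of the fixed-point papers above build on uniform affine quantization: floats are mapped onto an integer grid via a scale and zero-point, and the rounding error is what the training-time methods learn to tolerate. Below is a minimal per-tensor sketch, assuming simple min/max calibration; `quantize_affine` is an illustrative helper, and production code calibrates per channel and runs the arithmetic in integer kernels.

```python
import torch

def quantize_affine(x: torch.Tensor, num_bits: int = 8) -> torch.Tensor:
    """Quantize to the integer grid [0, 2^b - 1] with a scale and
    zero-point, then dequantize ("fake quantization")."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / (qmax - qmin)
    zero_point = (qmin - x.min() / scale).round().clamp(qmin, qmax)
    q = (x / scale + zero_point).round().clamp(qmin, qmax)  # integer values
    return (q - zero_point) * scale                         # back to float

x = torch.randn(1000)
x_q = quantize_affine(x, num_bits=8)
print((x - x_q).abs().max())  # worst-case error is about scale / 2
```
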
## Architecture search
- MobileNets
  - Searching for MobileNetV3 (ICCV 2019)
    [paper](https://arxiv.org/abs/1905.02244)
  - MobileNetV2: Inverted Residuals and Linear Bottlenecks (CVPR 2018)
    [paper](https://arxiv.org/abs/1801.04381) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/models/tree/master/research/slim/nets/mobilenet)
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019)
  [paper](https://arxiv.org/abs/1905.11946) | [code and pretrained models (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet)
- MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019)
  [paper](https://arxiv.org/abs/1807.11626) | [code (TensorFlow)](https://github.com/tensorflow/tpu/tree/master/models/official/mnasnet)
- MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure (CVPR 2018)
  [paper](https://arxiv.org/abs/1711.06798) | [code (TensorFlow)](https://github.com/google-research/morph-net)
- ShuffleNets
  - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (ECCV 2018)
    [paper](https://arxiv.org/abs/1807.11164)
  - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (CVPR 2018)
    [paper](https://arxiv.org/abs/1707.01083)
- Multi-Fiber Networks for Video Recognition (ECCV 2018)
  [paper](https://arxiv.org/abs/1807.11195) | [code (PyTorch)](https://github.com/cypw/PyTorch-MFNet)
- IGCVs
  - IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks (BMVC 2018)
    [paper](https://arxiv.org/abs/1806.00178) | [code and pretrained models (MXNet)](https://github.com/homles11/IGCV3)
  - IGCV2: Interleaved Structured Sparse Convolutional Neural Networks (CVPR 2018)
    [paper](https://arxiv.org/abs/1804.06202)
  - Interleaved Group Convolutions for Deep Neural Networks (ICCV 2017)
    [paper](https://arxiv.org/abs/1707.02725)


## PhD theses and overviews
- Quantizing deep convolutional networks for efficient inference: A whitepaper (2018) [paper](https://arxiv.org/abs/1806.08342)
- Algorithms for speeding up convolutional neural networks (2018) [thesis](https://www.skoltech.ru/app/data/uploads/2018/10/Thesis-Final.pdf)
- Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges (2018) [paper](http://cwww.ee.nctu.edu.tw/~cfung/docs/learning/cheng2018DNN_model_compression_accel.pdf)
- Efficient methods and hardware for deep learning (2017) [thesis](https://stacks.stanford.edu/file/druid:qf934gh3708/EFFICIENT%20METHODS%20AND%20HARDWARE%20FOR%20DEEP%20LEARNING-augmented.pdf)


## Frameworks
- [MUSCO](https://github.com/musco-ai) - framework for model compression using tensor decompositions (PyTorch, TensorFlow)
- [AIMET](https://github.com/quic/aimet) - AI Model Efficiency Toolkit (PyTorch, TensorFlow)
- [Distiller](https://github.com/NervanaSystems/distiller) - package for compression using pruning and low-precision arithmetic (PyTorch)
- [MorphNet](https://github.com/google-research/morph-net) - framework for neural network architecture learning (TensorFlow)
- [Mayo](https://github.com/deep-fry/mayo) - deep learning framework with fine- and coarse-grained pruning, network slimming, and quantization methods
- [PocketFlow](https://github.com/Tencent/PocketFlow) - framework for model pruning, sparsification, and quantization (TensorFlow)
- [Keras compressor](https://github.com/DwangoMediaVillage/keras_compressor) - compression using low-rank approximations: SVD for matrices, Tucker decomposition for tensors
- [Caffe compressor](https://github.com/yuanyuanli85/CaffeModelCompression) - K-means-based quantization
- [gemmlowp](https://github.com/google/gemmlowp/blob/master/doc/quantization.md#implementation-of-quantized-matrix-multiplication) - building a quantization paradigm from first principles (C++)
- [NNI](https://github.com/microsoft/nni) - framework for feature engineering, NAS, hyperparameter tuning, and model compression


## Comparison of different approaches

Please see `comparative_results.pdf`.


## Similar repos
- https://github.com/ZhishengWang/Embedded-Neural-Network
- https://github.com/memoiry/Awesome-model-compression-and-acceleration
- https://github.com/sun254/awesome-model-compression-and-acceleration
- https://github.com/guan-yuan/awesome-AutoML-and-Lightweight-Models
- https://github.com/chester256/Model-Compression-Papers
- https://github.com/mapleam/model-compression-and-acceleration-4-DNN
- https://github.com/cedrickchee/awesome-ml-model-compression
- https://github.com/jnjaby/Model-Compression-Acceleration
- https://github.com/he-y/Awesome-Pruning