# Deep-Learning-Hardware-Accelerator
A collection of works on hardware accelerators for deep learning.

## Conference Papers
### 2015
* Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks (FPGA 2015)

### 2016
* DnnWeaver: From High-Level Deep Network Models to FPGA Acceleration (MICRO 2016)
* Fused-Layer CNN Accelerators (MICRO 2016)
* Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks (ICCAD 2016)
* Going Deeper with Embedded FPGA Platform for Convolutional Neural Network (FPGA 2016)
* Automatic Code Generation of Convolutional Neural Networks in FPGA Implementation (FPT 2016)
* Angel-Eye: A Complete Design Flow for Mapping CNN onto Customized Hardware (ISVLSI 2016)
* A High Performance FPGA-Based Accelerator for Large-Scale Convolutional Neural Networks (FPL 2016)
* Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks (ISCA 2016)
* C-Brain: A Deep Learning Accelerator that Tames the Diversity of CNNs through Adaptive Data-Level Parallelization (DAC 2016)
* Stripes: Bit-Serial Deep Neural Network Computing (MICRO 2016)
* Design Space Exploration of FPGA-Based Deep Convolutional Neural Networks (ASP-DAC 2016)

### 2017
* Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks (FPGA 2017)
* Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs (DAC 2017)
* A Pipelined and Scalable Dataflow Implementation of Convolutional Neural Networks on FPGA (IPDPSW 2017)
* A Multistage Dataflow Implementation of a Deep Convolutional Neural Network Based on FPGA for High-Speed Object Recognition (SSIAI 2017)
* Maximizing CNN Accelerator Efficiency Through Resource Partitioning (ISCA 2017)
* Design Space Exploration of FPGA Accelerators for Convolutional Neural Networks (DATE 2017)
* Work-in-Progress: A Power-Efficient and High Performance FPGA Accelerator for Convolutional Neural Networks (CODES+ISSS 2017)
* A Power-Efficient Accelerator for Convolutional Neural Networks (CLUSTER 2017)
* In-Datacenter Performance Analysis of a Tensor Processing Unit (ISCA 2017)
* FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks (HPCA 2017)
* COSY: An Energy-Efficient Hardware Architecture for Deep Convolutional Neural Networks Based on Systolic Array (ICPADS 2017)

### 2019
* An Energy-Aware Bit-Serial Streaming Deep Convolutional Neural Network Accelerator (ICIP 2019)

## Journal Papers
### 2016
* Power-Efficient Accelerator Design for Neural Networks Using Computation Reuse (IEEE Computer Architecture Letters 2016 Jan.-June)

### 2017
* Stripes: Bit-Serial Deep Neural Network Computing (IEEE Computer Architecture Letters 2017 Jan.-June)
* Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks (JSSC 2017 Jan.)
* Embedded Streaming Deep Neural Networks Accelerator With Applications (TNNLS 2017 July)
* Deep Convolutional Neural Network Architecture With Reconfigurable Computation Patterns (TVLSI 2017 Aug.)
* Origami: A 803-GOp/s/W Convolutional Network Accelerator (TCSVT 2017 Nov.)

### 2018
* Angel-Eye: A Complete Design Flow for Mapping CNN Onto Embedded FPGA (TCAD 2018 Jan.)
* A Reconfigurable Streaming Deep Convolutional Neural Network Accelerator for Internet of Things (TCSI 2018 Jan.)
* An Architecture to Accelerate Convolution in Deep Neural Networks (TCSI 2018 April)
* Data and Hardware Efficient Design for Convolutional Neural Network (TCSI 2018 May)
* Efficient Hardware Architectures for Deep Convolutional Neural Network (TCSI 2018 June)
* Optimizing the Convolution Operation to Accelerate Deep Neural Networks on FPGA (TVLSI 2018 Early Access)

## Accelerators with Quantization Techniques Discussed in the Paper
* Going Deeper with Embedded FPGA Platform for Convolutional Neural Network (FPGA 2016)
* Angel-Eye: A Complete Design Flow for Mapping CNN onto Embedded FPGA (ISVLSI 2016) (TCAD 2018 Jan.)

## Papers on Bit Reduction
* An Analytical Method to Determine Minimum Per-Layer Precision of Deep Neural Networks (ICASSP 2018)
* True-Gradient Based Training of Deep Binary Activated Neural Networks via Continuous Binarization (ICASSP 2018)

## Serial-Approach Architectures
* Bit-Pragmatic Deep Neural Network Computing (2016)
* Stripes: Bit-Serial Deep Neural Network Computing (MICRO 2016)
* Stripes: Bit-Serial Deep Neural Network Computing (IEEE Computer Architecture Letters 2017 Jan.-June)
* Dynamic Stripes: Exploiting the Dynamic Precision Requirements of Activation Values in Neural Networks (2017)
* Value-Based Deep-Learning Acceleration (IEEE Micro 2018 Jan./Feb.)
* Exploiting Typical Values to Accelerate Deep Learning (Computer 2018 May)
* Loom: Exploiting Weight and Activation Precisions to Accelerate Convolutional Neural Networks (DAC 2018)

## Zero-Skipping Architectures
* Cnvlutin: Ineffectual-Neuron-Free Deep Neural Network Computing (ISCA 2016)
* Cnvlutin2: Ineffectual-Activation-and-Weight-Free Deep Neural Network Computing (2017)