├── end_to_end_people_detection_in_crowded_scenes.jpg
└── readme.md


/end_to_end_people_detection_in_crowded_scenes.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Smorodov/Deep-learning-object-detection-links./365150b5623e1c1ba8eedb3a4e368c54dab143fc/end_to_end_people_detection_in_crowded_scenes.jpg


--------------------------------------------------------------------------------
/readme.md:
--------------------------------------------------------------------------------
   1 | ## Object Detection
   2 | 
   3 | |Method|VOC2007|VOC2010|VOC2012|ILSVRC 2013|MSCOCO 2015|Speed|
   4 | |--- |--- |--- |--- |--- |--- |--- |
   5 | |OverFeat|-|-|-|24.3%|-|-|
   6 | |R-CNN (AlexNet)|58.5%|53.7%|53.3%|31.4%|-|-|
   7 | |R-CNN (VGG16)|66.0%|-|-|-|-|-|
   8 | |SPP_net(ZF-5)|54.2%(1-model), 60.9%(2-model)|-|-|31.84%(1-model), 35.11%(6-model)|-|-|
   9 | |DeepID-Net|64.1%|-|-|50.3%|-|-|
  10 | |NoC|73.3%|-|68.8%|-|-|-|
  11 | |Fast-RCNN (VGG16)|70.0%|68.8%|68.4%|-|19.7%(@[0.5-0.95]), 35.9%(@0.5)|-|
  12 | |MR-CNN|78.2%|-|73.9%|-|-|-|
  13 | |Faster-RCNN (VGG16)|78.8%|-|75.9%|-|21.9%(@[0.5-0.95]), 42.7%(@0.5)|198ms|
  14 | |Faster-RCNN (ResNet-101)|85.6%|-|83.8%|-|37.4%(@[0.5-0.95]), 59.0%(@0.5)|-|
  15 | |SSD300 (VGG16)|72.1%|-|-|-|-|58 fps|
  16 | |SSD500 (VGG16)|75.1%|-|-|-|-|23 fps|
  17 | |ION|79.2%|-|76.4%|-|-|-|
  18 | |CRAFT|75.7%|-|71.3%|48.5%|-|-|
  19 | |OHEM|78.9%|-|76.3%|-|25.5%(@[0.5-0.95]), 45.9%(@0.5)|-|
  20 | |R-FCN (ResNet-50)|77.4%|-|-|-|-|0.12sec(K40), 0.09sec(TitanX)|
  21 | |R-FCN (ResNet-101)|79.5%|-|-|-|-|0.17sec(K40), 0.12sec(TitanX)|
  22 | |R-FCN (ResNet-101),multi sc train|83.6%|-|82.0%|-|31.5%(@[0.5-0.95]), 53.2%(@0.5)|-|
  23 | |PVANet 9.0|81.8%|-|82.5%|-|-|750ms(CPU), 46ms(TitanX)|
  24 | 
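The MSCOCO column above reports two numbers per method, marked `@0.5` and `@[0.5-0.95]`. Assuming these follow the usual COCO convention, they refer to the intersection-over-union (IoU) threshold at which a predicted box counts as a correct detection: a single threshold of 0.5, or an average over thresholds from 0.5 to 0.95 in steps of 0.05. The sketch below is illustrative only and is not taken from any of the listed papers; the function names and the `(x1, y1, x2, y2)` box layout are assumptions made for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95 (ten values).
COCO_IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matches_at(pred, gt, thresholds):
    """For each IoU threshold, report whether pred counts as a match for gt."""
    overlap = iou(pred, gt)
    return {t: overlap >= t for t in thresholds}


if __name__ == "__main__":
    pred = (0, 0, 10, 10)
    gt = (2, 0, 12, 10)
    print(iou(pred, gt))
    print(matches_at(pred, gt, COCO_IOU_THRESHOLDS))
```

A box that overlaps the ground truth loosely can pass at 0.5 and fail at the stricter thresholds, which is why the `@[0.5-0.95]` figures in the table are consistently lower than the `@0.5` figures.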
  25 | 
  26 | # Leaderboard
  27 | 
  28 | **Detection Results: VOC2012**
  29 | 
  30 | *   intro: Competition “comp4” (train on additional data)
  31 | *   homepage: [http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4)
  32 | 
  33 | # Papers
  34 | 
  35 | **Deep Neural Networks for Object Detection**
  36 | 
  37 | *   paper: [http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf](http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf)
  38 | 
  39 | **OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks**
  40 | 
  41 | *   arxiv: [http://arxiv.org/abs/1312.6229](http://arxiv.org/abs/1312.6229)
  42 | *   github: [https://github.com/sermanet/OverFeat](https://github.com/sermanet/OverFeat)
  43 | *   code: [http://cilvr.nyu.edu/doku.php?id=software:overfeat:start](http://cilvr.nyu.edu/doku.php?id=software:overfeat:start)
  44 | 
  45 | ## R-CNN
  46 | 
  47 | **Rich feature hierarchies for accurate object detection and semantic segmentation**
  48 | 
  49 | *   intro: R-CNN
  50 | *   arxiv: [http://arxiv.org/abs/1311.2524](http://arxiv.org/abs/1311.2524)
  51 | *   supp: [http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf](http://people.eecs.berkeley.edu/~rbg/papers/r-cnn-cvpr-supp.pdf)
  52 | *   slides: [http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf](http://www.image-net.org/challenges/LSVRC/2013/slides/r-cnn-ilsvrc2013-workshop.pdf)
  53 | *   slides: [http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf](http://www.cs.berkeley.edu/~rbg/slides/rcnn-cvpr14-slides.pdf)
  54 | *   github: [https://github.com/rbgirshick/rcnn](https://github.com/rbgirshick/rcnn)
  55 | *   notes: [http://zhangliliang.com/2014/07/23/paper-note-rcnn/](http://zhangliliang.com/2014/07/23/paper-note-rcnn/)
  56 | *   caffe-pr(“Make R-CNN the Caffe detection example”): [https://github.com/BVLC/caffe/pull/482](https://github.com/BVLC/caffe/pull/482)
  57 | 
  58 | ## MultiBox
  59 | 
  60 | **Scalable Object Detection using Deep Neural Networks**
  61 | 
  62 | *   intro: first MultiBox. Trains a CNN to predict regions of interest.
  63 | *   arxiv: [http://arxiv.org/abs/1312.2249](http://arxiv.org/abs/1312.2249)
  64 | *   github: [https://github.com/google/multibox](https://github.com/google/multibox)
  65 | *   blog: [https://research.googleblog.com/2014/12/high-quality-object-detection-at-scale.html](https://research.googleblog.com/2014/12/high-quality-object-detection-at-scale.html)
  66 | 
  67 | **Scalable, High-Quality Object Detection**
  68 | 
  69 | *   intro: second MultiBox
  70 | *   arxiv: [http://arxiv.org/abs/1412.1441](http://arxiv.org/abs/1412.1441)
  71 | *   github: [https://github.com/google/multibox](https://github.com/google/multibox)
  72 | 
  73 | ## SPP-Net
  74 | 
  75 | **Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition**
  76 | 
  77 | *   intro: ECCV 2014 / TPAMI 2015
  78 | *   arxiv: [http://arxiv.org/abs/1406.4729](http://arxiv.org/abs/1406.4729)
  79 | *   github: [https://github.com/ShaoqingRen/SPP_net](https://github.com/ShaoqingRen/SPP_net)
  80 | *   notes: [http://zhangliliang.com/2014/09/13/paper-note-sppnet/](http://zhangliliang.com/2014/09/13/paper-note-sppnet/)
  81 | 
  82 | ## DeepID-Net
  83 | 
  84 | **DeepID-Net: Deformable Deep Convolutional Neural Networks for Object Detection**
  85 | 
  86 | *   intro: PAMI 2016
  87 | *   intro: an extension of R-CNN with box pre-training, cascade on region proposals, deformation layers, and context representations
  88 | *   project page: [http://www.ee.cuhk.edu.hk/~wlouyang/projects/imagenetDeepId/index.html](http://www.ee.cuhk.edu.hk/~wlouyang/projects/imagenetDeepId/index.html)
  89 | *   arxiv: [http://arxiv.org/abs/1412.5661](http://arxiv.org/abs/1412.5661)
  90 | 
  91 | **Object Detectors Emerge in Deep Scene CNNs**
  92 | 
  93 | *   arxiv: [http://arxiv.org/abs/1412.6856](http://arxiv.org/abs/1412.6856)
  94 | *   paper: [https://www.robots.ox.ac.uk/~vgg/rg/papers/zhou_iclr15.pdf](https://www.robots.ox.ac.uk/~vgg/rg/papers/zhou_iclr15.pdf)
  95 | *   paper: [https://people.csail.mit.edu/khosla/papers/iclr2015_zhou.pdf](https://people.csail.mit.edu/khosla/papers/iclr2015_zhou.pdf)
  96 | *   slides: [http://places.csail.mit.edu/slide_iclr2015.pdf](http://places.csail.mit.edu/slide_iclr2015.pdf)
  97 | 
  98 | **segDeepM: Exploiting Segmentation and Context in Deep Neural Networks for Object Detection**
  99 | 
 100 | *   intro: CVPR 2015
 101 | *   project(code+data): [https://www.cs.toronto.edu/~yukun/segdeepm.html](https://www.cs.toronto.edu/~yukun/segdeepm.html)
 102 | *   arxiv: [https://arxiv.org/abs/1502.04275](https://arxiv.org/abs/1502.04275)
 103 | *   github: [https://github.com/YknZhu/segDeepM](https://github.com/YknZhu/segDeepM)
 104 | 
 105 | ## NoC
 106 | 
 107 | **Object Detection Networks on Convolutional Feature Maps**
 108 | 
 109 | *   intro: TPAMI 2015
 110 | *   arxiv: [http://arxiv.org/abs/1504.06066](http://arxiv.org/abs/1504.06066)
 111 | 
 112 | **Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction**
 113 | 
 114 | *   arxiv: [http://arxiv.org/abs/1504.03293](http://arxiv.org/abs/1504.03293)
 115 | *   slides: [http://www.ytzhang.net/files/publications/2015-cvpr-det-slides.pdf](http://www.ytzhang.net/files/publications/2015-cvpr-det-slides.pdf)
 116 | *   github: [https://github.com/YutingZhang/fgs-obj](https://github.com/YutingZhang/fgs-obj)
 117 | 
 118 | ## Fast R-CNN
 119 | 
 120 | **Fast R-CNN**
 121 | 
 122 | *   arxiv: [http://arxiv.org/abs/1504.08083](http://arxiv.org/abs/1504.08083)
 123 | *   slides: [http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf](http://tutorial.caffe.berkeleyvision.org/caffe-cvpr15-detection.pdf)
 124 | *   github: [https://github.com/rbgirshick/fast-rcnn](https://github.com/rbgirshick/fast-rcnn)
 125 | *   webcam demo: [https://github.com/rbgirshick/fast-rcnn/pull/29](https://github.com/rbgirshick/fast-rcnn/pull/29)
 126 | *   notes: [http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/](http://zhangliliang.com/2015/05/17/paper-note-fast-rcnn/)
 127 | *   notes: [http://blog.csdn.net/linj_m/article/details/48930179](http://blog.csdn.net/linj_m/article/details/48930179)
 128 | *   github(“Fast R-CNN in MXNet”): [https://github.com/precedenceguo/mx-rcnn](https://github.com/precedenceguo/mx-rcnn)
 129 | *   github: [https://github.com/mahyarnajibi/fast-rcnn-torch](https://github.com/mahyarnajibi/fast-rcnn-torch)
 130 | *   github: [https://github.com/apple2373/chainer-simple-fast-rnn](https://github.com/apple2373/chainer-simple-fast-rnn)
 131 | *   github(Tensorflow): [https://github.com/zplizzi/tensorflow-fast-rcnn](https://github.com/zplizzi/tensorflow-fast-rcnn)
 132 | 
 133 | ## DeepBox
 134 | 
 135 | **DeepBox: Learning Objectness with Convolutional Networks**
 136 | 
 137 | *   arxiv: [http://arxiv.org/abs/1505.02146](http://arxiv.org/abs/1505.02146)
 138 | *   github: [https://github.com/weichengkuo/DeepBox](https://github.com/weichengkuo/DeepBox)
 139 | 
 140 | ## MR-CNN
 141 | 
 142 | **Object detection via a multi-region & semantic segmentation-aware CNN model**
 143 | 
 144 | *   intro: ICCV 2015\. MR-CNN
 145 | *   arxiv: [http://arxiv.org/abs/1505.01749](http://arxiv.org/abs/1505.01749)
 146 | *   github: [https://github.com/gidariss/mrcnn-object-detection](https://github.com/gidariss/mrcnn-object-detection)
 147 | *   notes: [http://zhangliliang.com/2015/05/17/paper-note-ms-cnn/](http://zhangliliang.com/2015/05/17/paper-note-ms-cnn/)
 148 | *   notes: [http://blog.cvmarcher.com/posts/2015/05/17/multi-region-semantic-segmentation-aware-cnn/](http://blog.cvmarcher.com/posts/2015/05/17/multi-region-semantic-segmentation-aware-cnn/)
 149 | *   my notes: Who can tell me why there are a bunch of duplicated sentences in section 7.2 “Detection error analysis”? :-D
 150 | 
 151 | ## Faster R-CNN
 152 | 
 153 | **Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks**
 154 | 
 155 | *   intro: NIPS 2015
 156 | *   arxiv: [http://arxiv.org/abs/1506.01497](http://arxiv.org/abs/1506.01497)
 157 | *   gitxiv: [http://www.gitxiv.com/posts/8pfpcvefDYn2gSgXk/faster-r-cnn-towards-real-time-object-detection-with-region](http://www.gitxiv.com/posts/8pfpcvefDYn2gSgXk/faster-r-cnn-towards-real-time-object-detection-with-region)
 158 | *   slides: [http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf](http://web.cs.hacettepe.edu.tr/~aykut/classes/spring2016/bil722/slides/w05-FasterR-CNN.pdf)
 159 | *   github: [https://github.com/ShaoqingRen/faster_rcnn](https://github.com/ShaoqingRen/faster_rcnn)
 160 | *   github: [https://github.com/rbgirshick/py-faster-rcnn](https://github.com/rbgirshick/py-faster-rcnn)
 161 | *   github: [https://github.com/mitmul/chainer-faster-rcnn](https://github.com/mitmul/chainer-faster-rcnn)
 162 | *   github(Torch): [https://github.com/andreaskoepf/faster-rcnn.torch](https://github.com/andreaskoepf/faster-rcnn.torch)
 163 | *   github(Torch): [https://github.com/ruotianluo/Faster-RCNN-Densecap-torch](https://github.com/ruotianluo/Faster-RCNN-Densecap-torch)
 164 | *   github(Tensorflow): [https://github.com/smallcorgi/Faster-RCNN_TF](https://github.com/smallcorgi/Faster-RCNN_TF)
 165 | *   github(Tensorflow): [https://github.com/CharlesShang/TFFRCNN](https://github.com/CharlesShang/TFFRCNN)
 166 | 
 167 | **Faster R-CNN in MXNet with distributed implementation and data parallelization**
 168 | 
 169 | *   github: [https://github.com/dmlc/mxnet/tree/master/example/rcnn](https://github.com/dmlc/mxnet/tree/master/example/rcnn)
 170 | 
 171 | **Contextual Priming and Feedback for Faster R-CNN**
 172 | 
 173 | *   intro: ECCV 2016\. Carnegie Mellon University
 174 | *   paper: [http://abhinavsh.info/context_priming_feedback.pdf](http://abhinavsh.info/context_priming_feedback.pdf)
 175 | *   poster: [http://www.eccv2016.org/files/posters/P-1A-20.pdf](http://www.eccv2016.org/files/posters/P-1A-20.pdf)
 176 | 
 177 | **An Implementation of Faster RCNN with Study for Region Sampling**
 178 | 
 179 | *   intro: Technical Report, 3 pages. CMU
 180 | *   arxiv: [https://arxiv.org/abs/1702.02138](https://arxiv.org/abs/1702.02138)
 181 | *   github: [https://github.com/endernewton/tf-faster-rcnn](https://github.com/endernewton/tf-faster-rcnn)
 182 | 
 183 | ## YOLO
 184 | 
 185 | **You Only Look Once: Unified, Real-Time Object Detection**
 186 | 
 187 | ![](https://camo.githubusercontent.com/e69d4118b20a42de4e23b9549f9a6ec6dbbb0814/687474703a2f2f706a7265646469652e636f6d2f6d656469612f66696c65732f6461726b6e65742d626c61636b2d736d616c6c2e706e67)
 188 | 
 189 | *   arxiv: [http://arxiv.org/abs/1506.02640](http://arxiv.org/abs/1506.02640)
 190 | *   code: [http://pjreddie.com/darknet/yolo/](http://pjreddie.com/darknet/yolo/)
 191 | *   github: [https://github.com/pjreddie/darknet](https://github.com/pjreddie/darknet)
 192 | *   reddit: [https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/](https://www.reddit.com/r/MachineLearning/comments/3a3m0o/realtime_object_detection_with_yolo/)
 193 | *   github: [https://github.com/gliese581gg/YOLO_tensorflow](https://github.com/gliese581gg/YOLO_tensorflow)
 194 | *   github: [https://github.com/xingwangsfu/caffe-yolo](https://github.com/xingwangsfu/caffe-yolo)
 195 | *   github: [https://github.com/frankzhangrui/Darknet-Yolo](https://github.com/frankzhangrui/Darknet-Yolo)
 196 | *   github: [https://github.com/BriSkyHekun/py-darknet-yolo](https://github.com/BriSkyHekun/py-darknet-yolo)
 197 | *   github: [https://github.com/tommy-qichang/yolo.torch](https://github.com/tommy-qichang/yolo.torch)
 198 | *   github: [https://github.com/frischzenger/yolo-windows](https://github.com/frischzenger/yolo-windows)
 199 | *   github: [https://github.com/AlexeyAB/yolo-windows](https://github.com/AlexeyAB/yolo-windows)
 200 | 
 201 | **darkflow - translate darknet to tensorflow. Load trained weights, retrain/fine-tune them using tensorflow, export constant graph def to C++**
 202 | 
 203 | *   blog: [https://thtrieu.github.io/notes/yolo-tensorflow-graph-buffer-cpp](https://thtrieu.github.io/notes/yolo-tensorflow-graph-buffer-cpp)
 204 | *   github: [https://github.com/thtrieu/darkflow](https://github.com/thtrieu/darkflow)
 205 | 
 206 | **Start Training YOLO with Our Own Data**
 207 | 
 208 | ![](http://guanghan.info/blog/en/wp-content/uploads/2015/12/images-40.jpg)
 209 | 
 210 | *   intro: train with customized data and class numbers/labels. Linux / Windows version for darknet.
 211 | *   blog: [http://guanghan.info/blog/en/my-works/train-yolo/](http://guanghan.info/blog/en/my-works/train-yolo/)
 212 | *   github: [https://github.com/Guanghan/darknet](https://github.com/Guanghan/darknet)
 213 | 
 214 | **R-CNN minus R**
 215 | 
 216 | *   arxiv: [http://arxiv.org/abs/1506.06981](http://arxiv.org/abs/1506.06981)
 217 | 
 218 | ## AttentionNet
 219 | 
 220 | **AttentionNet: Aggregating Weak Directions for Accurate Object Detection**
 221 | 
 222 | *   intro: ICCV 2015
 223 | *   intro: state-of-the-art performance of 65% (AP) on PASCAL VOC 2007/2012 human detection task
 224 | *   arxiv: [http://arxiv.org/abs/1506.07704](http://arxiv.org/abs/1506.07704)
 225 | *   slides: [https://www.robots.ox.ac.uk/~vgg/rg/slides/AttentionNet.pdf](https://www.robots.ox.ac.uk/~vgg/rg/slides/AttentionNet.pdf)
 226 | *   slides: [http://image-net.org/challenges/talks/lunit-kaist-slide.pdf](http://image-net.org/challenges/talks/lunit-kaist-slide.pdf)
 227 | 
 228 | ## DenseBox
 229 | 
 230 | **DenseBox: Unifying Landmark Localization with End to End Object Detection**
 231 | 
 232 | *   arxiv: [http://arxiv.org/abs/1509.04874](http://arxiv.org/abs/1509.04874)
 233 | *   demo: [http://pan.baidu.com/s/1mgoWWsS](http://pan.baidu.com/s/1mgoWWsS)
 234 | *   KITTI result: [http://www.cvlibs.net/datasets/kitti/eval_object.php](http://www.cvlibs.net/datasets/kitti/eval_object.php)
 235 | 
 236 | ## SSD
 237 | 
 238 | **SSD: Single Shot MultiBox Detector**
 239 | 
 240 | ![](https://camo.githubusercontent.com/ad9b147ed3a5f48ffb7c3540711c15aa04ce49c6/687474703a2f2f7777772e63732e756e632e6564752f7e776c69752f7061706572732f7373642e706e67)
 241 | 
 242 | *   intro: ECCV 2016 Oral
 243 | *   arxiv: [http://arxiv.org/abs/1512.02325](http://arxiv.org/abs/1512.02325)
 244 | *   paper: [http://www.cs.unc.edu/~wliu/papers/ssd.pdf](http://www.cs.unc.edu/~wliu/papers/ssd.pdf)
 245 | *   slides: [http://www.cs.unc.edu/%7Ewliu/papers/ssd_eccv2016_slide.pdf](http://www.cs.unc.edu/%7Ewliu/papers/ssd_eccv2016_slide.pdf)
 246 | *   github: [https://github.com/weiliu89/caffe/tree/ssd](https://github.com/weiliu89/caffe/tree/ssd)
 247 | *   video: [http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973](http://weibo.com/p/2304447a2326da963254c963c97fb05dd3a973)
 248 | *   github(MXNet): [https://github.com/zhreshold/mxnet-ssd](https://github.com/zhreshold/mxnet-ssd)
 249 | *   github: [https://github.com/zhreshold/mxnet-ssd.cpp](https://github.com/zhreshold/mxnet-ssd.cpp)
 250 | *   github(Keras): [https://github.com/rykov8/ssd_keras](https://github.com/rykov8/ssd_keras)
 251 | 
 252 | ## Inside-Outside Net (ION)
 253 | 
 254 | **Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks**
 255 | 
 256 | *   intro: “0.8s per image on a Titan X GPU (excluding proposal generation) without two-stage bounding-box regression and 1.15s per image with it”.
 257 | *   arxiv: [http://arxiv.org/abs/1512.04143](http://arxiv.org/abs/1512.04143)
 258 | *   slides: [http://www.seanbell.ca/tmp/ion-coco-talk-bell2015.pdf](http://www.seanbell.ca/tmp/ion-coco-talk-bell2015.pdf)
 259 | *   coco-leaderboard: [http://mscoco.org/dataset/#detections-leaderboard](http://mscoco.org/dataset/#detections-leaderboard)
 260 | 
 261 | **Adaptive Object Detection Using Adjacency and Zoom Prediction**
 262 | 
 263 | *   intro: CVPR 2016\. AZ-Net
 264 | *   arxiv: [http://arxiv.org/abs/1512.07711](http://arxiv.org/abs/1512.07711)
 265 | *   github: [https://github.com/luyongxi/az-net](https://github.com/luyongxi/az-net)
 266 | *   youtube: [https://www.youtube.com/watch?v=YmFtuNwxaNM](https://www.youtube.com/watch?v=YmFtuNwxaNM)
 267 | 
 268 | ## G-CNN
 269 | 
 270 | **G-CNN: an Iterative Grid Based Object Detector**
 271 | 
 272 | *   arxiv: [http://arxiv.org/abs/1512.07729](http://arxiv.org/abs/1512.07729)
 273 | 
 274 | **Factors in Finetuning Deep Model for object detection**
 275 | 
 276 | **Factors in Finetuning Deep Model for Object Detection with Long-tail Distribution**
 277 | 
 278 | *   intro: CVPR 2016\. Ranked 3rd with provided data and 2nd with external data in ILSVRC 2015 object detection
 279 | *   project page: [http://www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html](http://www.ee.cuhk.edu.hk/~wlouyang/projects/ImageNetFactors/CVPR16.html)
 280 | *   arxiv: [http://arxiv.org/abs/1601.05150](http://arxiv.org/abs/1601.05150)
 281 | 
 282 | **We don’t need no bounding-boxes: Training object class detectors using only human verification**
 283 | 
 284 | *   arxiv: [http://arxiv.org/abs/1602.08405](http://arxiv.org/abs/1602.08405)
 285 | 
 286 | ## HyperNet
 287 | 
 288 | **HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection**
 289 | 
 290 | *   arxiv: [http://arxiv.org/abs/1604.00600](http://arxiv.org/abs/1604.00600)
 291 | 
 292 | ## MultiPathNet
 293 | 
 294 | **A MultiPath Network for Object Detection**
 295 | 
 296 | *   intro: BMVC 2016\. Facebook AI Research (FAIR)
 297 | *   arxiv: [http://arxiv.org/abs/1604.02135](http://arxiv.org/abs/1604.02135)
 298 | *   github: [https://github.com/facebookresearch/multipathnet](https://github.com/facebookresearch/multipathnet)
 299 | 
 300 | ## CRAFT
 301 | 
 302 | **CRAFT Objects from Images**
 303 | 
 304 | *   intro: CVPR 2016\. Cascade Region-proposal-network And FasT-rcnn. an extension of Faster R-CNN
 305 | *   project page: [http://byangderek.github.io/projects/craft.html](http://byangderek.github.io/projects/craft.html)
 306 | *   arxiv: [https://arxiv.org/abs/1604.03239](https://arxiv.org/abs/1604.03239)
 307 | *   paper: [http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Yang_CRAFT_Objects_From_CVPR_2016_paper.pdf](http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Yang_CRAFT_Objects_From_CVPR_2016_paper.pdf)
 308 | *   github: [https://github.com/byangderek/CRAFT](https://github.com/byangderek/CRAFT)
 309 | 
 310 | ## OHEM
 311 | 
 312 | **Training Region-based Object Detectors with Online Hard Example Mining**
 313 | 
 314 | *   intro: CVPR 2016 Oral. Online hard example mining (OHEM)
 315 | *   arxiv: [http://arxiv.org/abs/1604.03540](http://arxiv.org/abs/1604.03540)
 316 | *   paper: [http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf](http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf)
 317 | *   github(Official): [https://github.com/abhi2610/ohem](https://github.com/abhi2610/ohem)
 318 | *   author page: [http://abhinav-shrivastava.info/](http://abhinav-shrivastava.info/)
 319 | 
 320 | **Track and Transfer: Watching Videos to Simulate Strong Human Supervision for Weakly-Supervised Object Detection**
 321 | 
 322 | *   intro: CVPR 2016
 323 | *   arxiv: [http://arxiv.org/abs/1604.05766](http://arxiv.org/abs/1604.05766)
 324 | 
 325 | **Exploit All the Layers: Fast and Accurate CNN Object Detector with Scale Dependent Pooling and Cascaded Rejection Classifiers**
 326 | 
 327 | *   intro: scale-dependent pooling (SDP), cascaded rejection classifiers (CRC)
 328 | *   paper: [http://www-personal.umich.edu/~wgchoi/SDP-CRC_camready.pdf](http://www-personal.umich.edu/~wgchoi/SDP-CRC_camready.pdf)
 329 | 
 330 | ## R-FCN
 331 | 
 332 | **R-FCN: Object Detection via Region-based Fully Convolutional Networks**
 333 | 
 334 | *   arxiv: [http://arxiv.org/abs/1605.06409](http://arxiv.org/abs/1605.06409)
 335 | *   github: [https://github.com/daijifeng001/R-FCN](https://github.com/daijifeng001/R-FCN)
 336 | *   github: [https://github.com/Orpine/py-R-FCN](https://github.com/Orpine/py-R-FCN)
 337 | 
 338 | **Weakly supervised object detection using pseudo-strong labels**
 339 | 
 340 | *   arxiv: [http://arxiv.org/abs/1607.04731](http://arxiv.org/abs/1607.04731)
 341 | 
 342 | **Recycle deep features for better object detection**
 343 | 
 344 | *   arxiv: [http://arxiv.org/abs/1607.05066](http://arxiv.org/abs/1607.05066)
 345 | 
 346 | ## MS-CNN
 347 | 
 348 | **A Unified Multi-scale Deep Convolutional Neural Network for Fast Object Detection**
 349 | 
 350 | *   intro: ECCV 2016
 351 | *   intro: 640×480: 15 fps, 960×720: 8 fps
 352 | *   arxiv: [http://arxiv.org/abs/1607.07155](http://arxiv.org/abs/1607.07155)
 353 | *   github: [https://github.com/zhaoweicai/mscnn](https://github.com/zhaoweicai/mscnn)
 354 | *   poster: [http://www.eccv2016.org/files/posters/P-2B-38.pdf](http://www.eccv2016.org/files/posters/P-2B-38.pdf)
 355 | 
 356 | **Multi-stage Object Detection with Group Recursive Learning**
 357 | 
 358 | *   intro: VOC2007: 78.6%, VOC2012: 74.9%
 359 | *   arxiv: [http://arxiv.org/abs/1608.05159](http://arxiv.org/abs/1608.05159)
 360 | 
 361 | **Subcategory-aware Convolutional Neural Networks for Object Proposals and Detection**
 362 | 
 363 | *   intro: WACV 2017\. SubCNN
 364 | *   arxiv: [http://arxiv.org/abs/1604.04693](http://arxiv.org/abs/1604.04693)
 365 | *   github: [https://github.com/yuxng/SubCNN](https://github.com/yuxng/SubCNN)
 366 | 
 367 | ## PVANET
 368 | 
 369 | **PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection**
 370 | 
 371 | *   intro: “less channels with more layers”, concatenated ReLU, Inception, and HyperNet, batch normalization, residual connections
 372 | *   arxiv: [http://arxiv.org/abs/1608.08021](http://arxiv.org/abs/1608.08021)
 373 | *   github: [https://github.com/sanghoon/pva-faster-rcnn](https://github.com/sanghoon/pva-faster-rcnn)
 374 | *   leaderboard(PVANet 9.0): [http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=4)
 375 | 
 376 | **PVANet: Lightweight Deep Neural Networks for Real-time Object Detection**
 377 | 
 378 | *   intro: Presented at NIPS 2016 Workshop on Efficient Methods for Deep Neural Networks (EMDNN). Continuation of [arXiv:1608.08021](https://arxiv.org/abs/1608.08021)
 379 | *   arxiv: [https://arxiv.org/abs/1611.08588](https://arxiv.org/abs/1611.08588)
 380 | 
 381 | ## GBD-Net
 382 | 
 383 | **Gated Bi-directional CNN for Object Detection**
 384 | 
 385 | *   intro: The Chinese University of Hong Kong & Sensetime Group Limited
 386 | *   paper: [http://link.springer.com/chapter/10.1007/978-3-319-46478-7_22](http://link.springer.com/chapter/10.1007/978-3-319-46478-7_22)
 387 | *   mirror: [https://pan.baidu.com/s/1dFohO7v](https://pan.baidu.com/s/1dFohO7v)
 388 | 
 389 | **Crafting GBD-Net for Object Detection**
 390 | 
 391 | *   intro: winner of the ImageNet object detection challenge of 2016\. CUImage and CUVideo
 392 | *   intro: gated bi-directional CNN (GBD-Net)
 393 | *   arxiv: [https://arxiv.org/abs/1610.02579](https://arxiv.org/abs/1610.02579)
 394 | *   github: [https://github.com/craftGBD/craftGBD](https://github.com/craftGBD/craftGBD)
 395 | 
 396 | ## StuffNet
 397 | 
 398 | **StuffNet: Using ‘Stuff’ to Improve Object Detection**
 399 | 
 400 | *   arxiv: [https://arxiv.org/abs/1610.05861](https://arxiv.org/abs/1610.05861)
 401 | 
 402 | **Generalized Haar Filter based Deep Networks for Real-Time Object Detection in Traffic Scene**
 403 | 
 404 | *   arxiv: [https://arxiv.org/abs/1610.09609](https://arxiv.org/abs/1610.09609)
 405 | 
 406 | **Hierarchical Object Detection with Deep Reinforcement Learning**
 407 | 
 408 | *   intro: Deep Reinforcement Learning Workshop (NIPS 2016)
 409 | *   project page: [https://imatge-upc.github.io/detection-2016-nipsws/](https://imatge-upc.github.io/detection-2016-nipsws/)
 410 | *   arxiv: [https://arxiv.org/abs/1611.03718](https://arxiv.org/abs/1611.03718)
 411 | *   slides: [http://www.slideshare.net/xavigiro/hierarchical-object-detection-with-deep-reinforcement-learning](http://www.slideshare.net/xavigiro/hierarchical-object-detection-with-deep-reinforcement-learning)
 412 | *   github: [https://github.com/imatge-upc/detection-2016-nipsws](https://github.com/imatge-upc/detection-2016-nipsws)
 413 | *   blog: [http://jorditorres.org/nips/](http://jorditorres.org/nips/)
 414 | 
 415 | **Learning to detect and localize many objects from few examples**
 416 | 
 417 | *   arxiv: [https://arxiv.org/abs/1611.05664](https://arxiv.org/abs/1611.05664)
 418 | 
 419 | **Speed/accuracy trade-offs for modern convolutional object detectors**
 420 | 
 421 | *   intro: Google Research
 422 | *   arxiv: [https://arxiv.org/abs/1611.10012](https://arxiv.org/abs/1611.10012)
 423 | 
 424 | **SqueezeDet: Unified, Small, Low Power Fully Convolutional Neural Networks for Real-Time Object Detection for Autonomous Driving**
 425 | 
 426 | *   arxiv: [https://arxiv.org/abs/1612.01051](https://arxiv.org/abs/1612.01051)
 427 | *   github: [https://github.com/BichenWuUCB/squeezeDet](https://github.com/BichenWuUCB/squeezeDet)
 428 | 
 429 | ## Feature Pyramid Network (FPN)
 430 | 
 431 | **Feature Pyramid Networks for Object Detection**
 432 | 
 433 | *   intro: Facebook AI Research
 434 | *   arxiv: [https://arxiv.org/abs/1612.03144](https://arxiv.org/abs/1612.03144)
 435 | 
 436 | **Action-Driven Object Detection with Top-Down Visual Attentions**
 437 | 
 438 | *   arxiv: [https://arxiv.org/abs/1612.06704](https://arxiv.org/abs/1612.06704)
 439 | 
 440 | **Beyond Skip Connections: Top-Down Modulation for Object Detection**
 441 | 
 442 | *   intro: CMU & UC Berkeley & Google Research
 443 | *   arxiv: [https://arxiv.org/abs/1612.06851](https://arxiv.org/abs/1612.06851)
 444 | 
 445 | ## YOLOv2
 446 | 
 447 | **YOLO9000: Better, Faster, Stronger**
 448 | 
 449 | *   arxiv: [https://arxiv.org/abs/1612.08242](https://arxiv.org/abs/1612.08242)
 450 | *   code: [http://pjreddie.com/yolo9000/](http://pjreddie.com/yolo9000/)
 451 | *   github(Chainer): [https://github.com/leetenki/YOLOv2](https://github.com/leetenki/YOLOv2)
 452 | 
 453 | ## DSSD
 454 | 
 455 | **DSSD : Deconvolutional Single Shot Detector**
 456 | 
 457 | *   intro: UNC Chapel Hill & Amazon Inc
 458 | *   arxiv: [https://arxiv.org/abs/1701.06659](https://arxiv.org/abs/1701.06659)
 459 | 
 460 | **Wide-Residual-Inception Networks for Real-time Object Detection**
 461 | 
 462 | *   intro: Inha University
 463 | *   arxiv: [https://arxiv.org/abs/1702.01243](https://arxiv.org/abs/1702.01243)
 464 | 
 465 | **Attentional Network for Visual Object Detection**
 466 | 
 467 | *   intro: University of Maryland & Mitsubishi Electric Research Laboratories
 468 | *   arxiv: [https://arxiv.org/abs/1702.01478](https://arxiv.org/abs/1702.01478)
 469 | 
 470 | # Detection From Video
 471 | 
 472 | **Learning Object Class Detectors from Weakly Annotated Video**
 473 | 
 474 | *   intro: CVPR 2012
 475 | *   paper: [https://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00905.pdf](https://www.vision.ee.ethz.ch/publications/papers/proceedings/eth_biwi_00905.pdf)
 476 | 
 477 | **Analysing domain shift factors between videos and images for object detection**
 478 | 
 479 | *   arxiv: [https://arxiv.org/abs/1501.01186](https://arxiv.org/abs/1501.01186)
 480 | 
 481 | **Video Object Recognition**
 482 | 
 483 | *   slides: [http://vision.princeton.edu/courses/COS598/2015sp/slides/VideoRecog/Video%20Object%20Recognition.pptx](http://vision.princeton.edu/courses/COS598/2015sp/slides/VideoRecog/Video%20Object%20Recognition.pptx)
 484 | 
 485 | **Deep Learning for Saliency Prediction in Natural Video**
 486 | 
 487 | *   intro: Submitted on 12 Jan 2016
 488 | *   keywords: Deep learning, saliency map, optical flow, convolution network, contrast features
 489 | *   paper: [https://hal.archives-ouvertes.fr/hal-01251614/document](https://hal.archives-ouvertes.fr/hal-01251614/document)
 490 | 
 491 | ## T-CNN
 492 | 
 493 | **T-CNN: Tubelets with Convolutional Neural Networks for Object Detection from Videos**
 494 | 
 495 | *   intro: Winning solution in ILSVRC2015 Object Detection from Video(VID) Task
 496 | *   arxiv: [http://arxiv.org/abs/1604.02532](http://arxiv.org/abs/1604.02532)
 497 | *   github: [https://github.com/myfavouritekk/T-CNN](https://github.com/myfavouritekk/T-CNN)
 498 | 
 499 | **Object Detection from Video Tubelets with Convolutional Neural Networks**
 500 | 
 501 | *   intro: CVPR 2016 Spotlight paper
 502 | *   arxiv: [https://arxiv.org/abs/1604.04053](https://arxiv.org/abs/1604.04053)
 503 | *   paper: [http://www.ee.cuhk.edu.hk/~wlouyang/Papers/KangVideoDet_CVPR16.pdf](http://www.ee.cuhk.edu.hk/~wlouyang/Papers/KangVideoDet_CVPR16.pdf)
 504 | *   github: [https://github.com/myfavouritekk/vdetlib](https://github.com/myfavouritekk/vdetlib)
 505 | 
 506 | **Object Detection in Videos with Tubelets and Multi-context Cues**
 507 | 
 508 | *   intro: SenseTime Group
 509 | *   slides: [http://www.ee.cuhk.edu.hk/~xgwang/CUvideo.pdf](http://www.ee.cuhk.edu.hk/~xgwang/CUvideo.pdf)
 510 | *   slides: [http://image-net.org/challenges/talks/Object%20Detection%20in%20Videos%20with%20Tubelets%20and%20Multi-context%20Cues%20-%20Final.pdf](http://image-net.org/challenges/talks/Object%20Detection%20in%20Videos%20with%20Tubelets%20and%20Multi-context%20Cues%20-%20Final.pdf)
 511 | 
 512 | **Context Matters: Refining Object Detection in Video with Recurrent Neural Networks**
 513 | 
 514 | *   intro: BMVC 2016
 515 | *   keywords: pseudo-labeler
 516 | *   arxiv: [http://arxiv.org/abs/1607.04648](http://arxiv.org/abs/1607.04648)
 517 | *   paper: [http://vision.cornell.edu/se3/wp-content/uploads/2016/07/video_object_detection_BMVC.pdf](http://vision.cornell.edu/se3/wp-content/uploads/2016/07/video_object_detection_BMVC.pdf)
 518 | 
 519 | **CNN Based Object Detection in Large Video Images**
 520 | 
 521 | *   intro: Wang Tao @ iQIYI
 522 | *   keywords: object retrieval, object detection, scene classification
 523 | *   slides: [http://on-demand.gputechconf.com/gtc/2016/presentation/s6362-wang-tao-cnn-based-object-detection-large-video-images.pdf](http://on-demand.gputechconf.com/gtc/2016/presentation/s6362-wang-tao-cnn-based-object-detection-large-video-images.pdf)
 524 | 
 525 | ## Datasets
 526 | 
 527 | **YouTube-Objects dataset v2.2**
 528 | 
 529 | *   homepage: [http://calvin.inf.ed.ac.uk/datasets/youtube-objects-dataset/](http://calvin.inf.ed.ac.uk/datasets/youtube-objects-dataset/)
 530 | 
 531 | **ILSVRC2015: Object detection from video (VID)**
 532 | 
 533 | *   homepage: [http://vision.cs.unc.edu/ilsvrc2015/download-videos-3j16.php#vid](http://vision.cs.unc.edu/ilsvrc2015/download-videos-3j16.php#vid)
 534 | 
 535 | # Object Detection in 3D
 536 | 
 537 | **Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks**
 538 | 
 539 | *   arxiv: [https://arxiv.org/abs/1609.06666](https://arxiv.org/abs/1609.06666)
 540 | 
 541 | # Object Detection on RGB-D
 542 | 
 543 | **Learning Rich Features from RGB-D Images for Object Detection and Segmentation**
 544 | 
 545 | *   arxiv: [http://arxiv.org/abs/1407.5736](http://arxiv.org/abs/1407.5736)
 546 | 
 547 | **Differential Geometry Boosts Convolutional Neural Networks for Object Detection**
 548 | 
 549 | *   intro: CVPR 2016 Workshops
 550 | *   paper: [http://www.cv-foundation.org/openaccess/content_cvpr_2016_workshops/w23/html/Wang_Differential_Geometry_Boosts_CVPR_2016_paper.html](http://www.cv-foundation.org/openaccess/content_cvpr_2016_workshops/w23/html/Wang_Differential_Geometry_Boosts_CVPR_2016_paper.html)
 551 | 
 552 | # Salient Object Detection
 553 | 
 554 | This task involves predicting the salient regions of an image, as indicated by human eye fixations.
 555 | 
 556 | **Best Deep Saliency Detection Models (CVPR 2016 & 2015)**
 557 | 
 558 | [http://i.cs.hku.hk/~yzyu/vision.html](http://i.cs.hku.hk/~yzyu/vision.html)
 559 | 
 560 | **Large-scale optimization of hierarchical features for saliency prediction in natural images**
 561 | 
 562 | *   paper: [http://coxlab.org/pdfs/cvpr2014_vig_saliency.pdf](http://coxlab.org/pdfs/cvpr2014_vig_saliency.pdf)
 563 | 
 564 | **Predicting Eye Fixations using Convolutional Neural Networks**
 565 | 
 566 | *   paper: [http://www.escience.cn/system/file?fileId=72648](http://www.escience.cn/system/file?fileId=72648)
 567 | 
 568 | **Saliency Detection by Multi-Context Deep Learning**
 569 | 
 570 | *   paper: [http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhao_Saliency_Detection_by_2015_CVPR_paper.pdf](http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhao_Saliency_Detection_by_2015_CVPR_paper.pdf)
 571 | 
 572 | **DeepSaliency: Multi-Task Deep Neural Network Model for Salient Object Detection**
 573 | 
 574 | *   arxiv: [http://arxiv.org/abs/1510.05484](http://arxiv.org/abs/1510.05484)
 575 | 
 576 | **SuperCNN: A Superpixelwise Convolutional Neural Network for Salient Object Detection**
 577 | 
 578 | *   paper: [http://www.shengfenghe.com/supercnn-a-superpixelwise-convolutional-neural-network-for-salient-object-detection.html](http://www.shengfenghe.com/supercnn-a-superpixelwise-convolutional-neural-network-for-salient-object-detection.html)
 579 | 
 580 | **Shallow and Deep Convolutional Networks for Saliency Prediction**
 581 | 
 582 | *   arxiv: [http://arxiv.org/abs/1603.00845](http://arxiv.org/abs/1603.00845)
 583 | *   github: [https://github.com/imatge-upc/saliency-2016-cvpr](https://github.com/imatge-upc/saliency-2016-cvpr)
 584 | 
 585 | **Recurrent Attentional Networks for Saliency Detection**
 586 | 
 587 | *   intro: CVPR 2016. Recurrent attentional convolutional-deconvolution network (RACDNN)
 588 | *   arxiv: [http://arxiv.org/abs/1604.03227](http://arxiv.org/abs/1604.03227)
 589 | 
 590 | **Two-Stream Convolutional Networks for Dynamic Saliency Prediction**
 591 | 
 592 | *   arxiv: [http://arxiv.org/abs/1607.04730](http://arxiv.org/abs/1607.04730)
 593 | 
 594 | **Unconstrained Salient Object Detection**
 595 | 
 596 | **Unconstrained Salient Object Detection via Proposal Subset Optimization**
 597 | 
 598 | ![](http://cs-people.bu.edu/jmzhang/images/pasted%20image%201465x373.jpg)
 599 | 
 600 | *   intro: CVPR 2016
 601 | *   project page: [http://cs-people.bu.edu/jmzhang/sod.html](http://cs-people.bu.edu/jmzhang/sod.html)
 602 | *   paper: [http://cs-people.bu.edu/jmzhang/SOD/CVPR16SOD_camera_ready.pdf](http://cs-people.bu.edu/jmzhang/SOD/CVPR16SOD_camera_ready.pdf)
 603 | *   github: [https://github.com/jimmie33/SOD](https://github.com/jimmie33/SOD)
 604 | *   caffe model zoo: [https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-object-proposal-models-for-salient-object-detection](https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-object-proposal-models-for-salient-object-detection)
 605 | 
 606 | **DHSNet: Deep Hierarchical Saliency Network for Salient Object Detection**
 607 | 
 608 | *   paper: [http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Liu_DHSNet_Deep_Hierarchical_CVPR_2016_paper.pdf](http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Liu_DHSNet_Deep_Hierarchical_CVPR_2016_paper.pdf)
 609 | 
 610 | **Salient Object Subitizing**
 611 | 
 612 | ![](http://cs-people.bu.edu/jmzhang/images/frontpage.png?crc=123070793)
 613 | 
 614 | *   intro: CVPR 2015
 615 | *   intro: predicting the existence and the number of salient objects in an image using holistic cues
 616 | *   project page: [http://cs-people.bu.edu/jmzhang/sos.html](http://cs-people.bu.edu/jmzhang/sos.html)
 617 | *   arxiv: [http://arxiv.org/abs/1607.07525](http://arxiv.org/abs/1607.07525)
 618 | *   paper: [http://cs-people.bu.edu/jmzhang/SOS/SOS_preprint.pdf](http://cs-people.bu.edu/jmzhang/SOS/SOS_preprint.pdf)
 619 | *   caffe model zoo: [https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-models-for-salient-object-subitizing](https://github.com/BVLC/caffe/wiki/Model-Zoo#cnn-models-for-salient-object-subitizing)
 620 | 
 621 | **Deeply-Supervised Recurrent Convolutional Neural Network for Saliency Detection**
 622 | 
 623 | *   intro: ACM MM 2016. Deeply-supervised recurrent convolutional neural network (DSRCNN)
 624 | *   arxiv: [http://arxiv.org/abs/1608.05177](http://arxiv.org/abs/1608.05177)
 625 | 
 626 | **Saliency Detection via Combining Region-Level and Pixel-Level Predictions with CNNs**
 627 | 
 628 | *   intro: ECCV 2016
 629 | *   arxiv: [http://arxiv.org/abs/1608.05186](http://arxiv.org/abs/1608.05186)
 630 | 
 631 | **Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection**
 632 | 
 633 | *   arxiv: [http://arxiv.org/abs/1608.08029](http://arxiv.org/abs/1608.08029)
 634 | 
 635 | **A Deep Multi-Level Network for Saliency Prediction**
 636 | 
 637 | *   arxiv: [http://arxiv.org/abs/1609.01064](http://arxiv.org/abs/1609.01064)
 638 | 
 639 | **Visual Saliency Detection Based on Multiscale Deep CNN Features**
 640 | 
 641 | *   intro: IEEE Transactions on Image Processing
 642 | *   arxiv: [http://arxiv.org/abs/1609.02077](http://arxiv.org/abs/1609.02077)
 643 | 
 644 | **A Deep Spatial Contextual Long-term Recurrent Convolutional Network for Saliency Detection**
 645 | 
 646 | *   intro: DSCLRCN
 647 | *   arxiv: [https://arxiv.org/abs/1610.01708](https://arxiv.org/abs/1610.01708)
 648 | 
 649 | **Deeply supervised salient object detection with short connections**
 650 | 
 651 | *   arxiv: [https://arxiv.org/abs/1611.04849](https://arxiv.org/abs/1611.04849)
 652 | 
 653 | **Weakly Supervised Top-down Salient Object Detection**
 654 | 
 655 | *   intro: Nanyang Technological University
 656 | *   arxiv: [https://arxiv.org/abs/1611.05345](https://arxiv.org/abs/1611.05345)
 657 | 
 658 | **SalGAN: Visual Saliency Prediction with Generative Adversarial Networks**
 659 | 
 660 | *   project page: [https://imatge-upc.github.io/saliency-salgan-2017/](https://imatge-upc.github.io/saliency-salgan-2017/)
 661 | *   arxiv: [https://arxiv.org/abs/1701.01081](https://arxiv.org/abs/1701.01081)
 662 | 
 663 | **Visual Saliency Prediction Using a Mixture of Deep Neural Networks**
 664 | 
 665 | *   arxiv: [https://arxiv.org/abs/1702.00372](https://arxiv.org/abs/1702.00372)
 666 | 
 667 | **A Fast and Compact Salient Score Regression Network Based on Fully Convolutional Network**
 668 | 
 669 | *   arxiv: [https://arxiv.org/abs/1702.00615](https://arxiv.org/abs/1702.00615)
 670 | 
 671 | ## Saliency Detection in Video
 672 | 
 673 | **Deep Learning For Video Saliency Detection**
 674 | 
 675 | *   arxiv: [https://arxiv.org/abs/1702.00871](https://arxiv.org/abs/1702.00871)
 676 | 
 677 | ## Datasets
 678 | 
 679 | **MSRA10K Salient Object Database**
 680 | 
 681 | [http://mmcheng.net/msra10k/](http://mmcheng.net/msra10k/)
 682 | 
 683 | # Specific Object Detection
 684 | 
 685 | ## Face Detection
 686 | 
 687 | **Multi-view Face Detection Using Deep Convolutional Neural Networks**
 688 | 
 689 | *   intro: Yahoo
 690 | *   arxiv: [http://arxiv.org/abs/1502.02766](http://arxiv.org/abs/1502.02766)
 691 | 
 692 | **From Facial Parts Responses to Face Detection: A Deep Learning Approach**
 693 | 
 694 | ![](http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/support/index.png)
 695 | 
 696 | *   project page: [http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/Faceness.html](http://personal.ie.cuhk.edu.hk/~ys014/projects/Faceness/Faceness.html)
 697 | 
 698 | **Compact Convolutional Neural Network Cascade for Face Detection**
 699 | 
 700 | *   arxiv: [http://arxiv.org/abs/1508.01292](http://arxiv.org/abs/1508.01292)
 701 | *   github: [https://github.com/Bkmz21/FD-Evaluation](https://github.com/Bkmz21/FD-Evaluation)
 702 | 
 703 | **Face Detection with End-to-End Integration of a ConvNet and a 3D Model**
 704 | 
 705 | *   intro: ECCV 2016
 706 | *   arxiv: [https://arxiv.org/abs/1606.00850](https://arxiv.org/abs/1606.00850)
 707 | *   github(MXNet): [https://github.com/tfwu/FaceDetection-ConvNet-3D](https://github.com/tfwu/FaceDetection-ConvNet-3D)
 708 | 
 709 | **CMS-RCNN: Contextual Multi-Scale Region-based CNN for Unconstrained Face Detection**
 710 | 
 711 | *   intro: CMU
 712 | *   arxiv: [https://arxiv.org/abs/1606.05413](https://arxiv.org/abs/1606.05413)
 713 | 
 714 | **Finding Tiny Faces**
 715 | 
 716 | *   intro: CMU
 717 | *   arxiv: [https://arxiv.org/abs/1612.04402](https://arxiv.org/abs/1612.04402)
 718 | 
 719 | **Towards a Deep Learning Framework for Unconstrained Face Detection**
 720 | 
 721 | *   intro: overlaps with CMS-RCNN
 722 | *   arxiv: [https://arxiv.org/abs/1612.05322](https://arxiv.org/abs/1612.05322)
 723 | 
 724 | **Supervised Transformer Network for Efficient Face Detection**
 725 | 
 726 | *   arxiv: [http://arxiv.org/abs/1607.05477](http://arxiv.org/abs/1607.05477)
 727 | 
 728 | ### UnitBox
 729 | 
 730 | **UnitBox: An Advanced Object Detection Network**
 731 | 
 732 | *   intro: ACM MM 2016
 733 | *   arxiv: [http://arxiv.org/abs/1608.01471](http://arxiv.org/abs/1608.01471)
 734 | 
 735 | **Bootstrapping Face Detection with Hard Negative Examples**
 736 | 
 737 | *   author: Shaohua Wan @ Xiaomi
 738 | *   intro: Faster R-CNN with hard negative mining; state-of-the-art on the FDDB dataset
 739 | *   arxiv: [http://arxiv.org/abs/1608.02236](http://arxiv.org/abs/1608.02236)
 740 | 
 741 | **Grid Loss: Detecting Occluded Faces**
 742 | 
 743 | *   intro: ECCV 2016
 744 | *   arxiv: [https://arxiv.org/abs/1609.00129](https://arxiv.org/abs/1609.00129)
 745 | *   paper: [http://lrs.icg.tugraz.at/pubs/opitz_eccv_16.pdf](http://lrs.icg.tugraz.at/pubs/opitz_eccv_16.pdf)
 746 | *   poster: [http://www.eccv2016.org/files/posters/P-2A-34.pdf](http://www.eccv2016.org/files/posters/P-2A-34.pdf)
 747 | 
 748 | **A Multi-Scale Cascade Fully Convolutional Network Face Detector**
 749 | 
 750 | *   intro: ICPR 2016
 751 | *   arxiv: [http://arxiv.org/abs/1609.03536](http://arxiv.org/abs/1609.03536)
 752 | 
 753 | ### MTCNN
 754 | 
 755 | **Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks**
 756 | 
 759 | ![](https://kpzhang93.github.io/MTCNN_face_detection_alignment/support/index.png)
 760 | 
 761 | *   project page: [https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html](https://kpzhang93.github.io/MTCNN_face_detection_alignment/index.html)
 762 | *   arxiv: [https://arxiv.org/abs/1604.02878](https://arxiv.org/abs/1604.02878)
 763 | *   github(Matlab): [https://github.com/kpzhang93/MTCNN_face_detection_alignment](https://github.com/kpzhang93/MTCNN_face_detection_alignment)
 764 | *   github(MXNet): [https://github.com/pangyupo/mxnet_mtcnn_face_detection](https://github.com/pangyupo/mxnet_mtcnn_face_detection)
 765 | *   github: [https://github.com/DaFuCoding/MTCNN_Caffe](https://github.com/DaFuCoding/MTCNN_Caffe)
 766 | *   github(MXNet): [https://github.com/Seanlinx/mtcnn](https://github.com/Seanlinx/mtcnn)
 767 | 
 768 | **Face Detection using Deep Learning: An Improved Faster RCNN Approach**
 769 | 
 770 | *   intro: DeepIR Inc
 771 | *   arxiv: [https://arxiv.org/abs/1701.08289](https://arxiv.org/abs/1701.08289)
 772 | 
 773 | **Faceness-Net: Face Detection through Deep Facial Part Responses**
 774 | 
 775 | *   intro: An extended version of the ICCV 2015 paper
 776 | *   arxiv: [https://arxiv.org/abs/1701.08393](https://arxiv.org/abs/1701.08393)
 777 | 
 778 | ### Datasets / Benchmarks
 779 | 
 780 | **FDDB: Face Detection Data Set and Benchmark**
 781 | 
 782 | *   homepage: [http://vis-www.cs.umass.edu/fddb/index.html](http://vis-www.cs.umass.edu/fddb/index.html)
 783 | *   results: [http://vis-www.cs.umass.edu/fddb/results.html](http://vis-www.cs.umass.edu/fddb/results.html)
 784 | 
 785 | **WIDER FACE: A Face Detection Benchmark**
 786 | 
 787 | ![](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/intro.jpg)
 788 | 
 789 | *   homepage: [http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/)
 790 | *   arxiv: [http://arxiv.org/abs/1511.06523](http://arxiv.org/abs/1511.06523)
 791 | 
 792 | ## Facial Point / Landmark Detection
 793 | 
 794 | **Deep Convolutional Network Cascade for Facial Point Detection**
 795 | 
 796 | ![](http://mmlab.ie.cuhk.edu.hk/archive/CNN/data/Picture1.png)
 797 | 
 798 | *   homepage: [http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm](http://mmlab.ie.cuhk.edu.hk/archive/CNN_FacePoint.htm)
 799 | *   paper: [http://www.ee.cuhk.edu.hk/~xgwang/papers/sunWTcvpr13.pdf](http://www.ee.cuhk.edu.hk/~xgwang/papers/sunWTcvpr13.pdf)
 800 | *   github: [https://github.com/luoyetx/deep-landmark](https://github.com/luoyetx/deep-landmark)
 801 | 
 802 | **Facial Landmark Detection by Deep Multi-task Learning**
 803 | 
 804 | *   intro: ECCV 2014
 805 | *   project page: [http://mmlab.ie.cuhk.edu.hk/projects/TCDCN.html](http://mmlab.ie.cuhk.edu.hk/projects/TCDCN.html)
 806 | *   paper: [http://personal.ie.cuhk.edu.hk/~ccloy/files/eccv_2014_deepfacealign.pdf](http://personal.ie.cuhk.edu.hk/~ccloy/files/eccv_2014_deepfacealign.pdf)
 807 | *   github(Matlab): [https://github.com/zhzhanp/TCDCN-face-alignment](https://github.com/zhzhanp/TCDCN-face-alignment)
 808 | 
 809 | **A Recurrent Encoder-Decoder Network for Sequential Face Alignment**
 810 | 
 811 | *   intro: ECCV 2016
 812 | *   arxiv: [https://arxiv.org/abs/1608.05477](https://arxiv.org/abs/1608.05477)
 813 | 
 814 | **Detecting facial landmarks in the video based on a hybrid framework**
 815 | 
 816 | *   arxiv: [http://arxiv.org/abs/1609.06441](http://arxiv.org/abs/1609.06441)
 817 | 
 818 | **Deep Constrained Local Models for Facial Landmark Detection**
 819 | 
 820 | *   arxiv: [https://arxiv.org/abs/1611.08657](https://arxiv.org/abs/1611.08657)
 821 | 
 822 | **Effective face landmark localization via single deep network**
 823 | 
 824 | *   arxiv: [https://arxiv.org/abs/1702.02719](https://arxiv.org/abs/1702.02719)
 825 | 
 826 | ## People Detection
 827 | 
 828 | **End-to-end people detection in crowded scenes**
 829 | 
 830 | ![](end_to_end_people_detection_in_crowded_scenes.jpg)
 831 | 
 832 | *   arxiv: [http://arxiv.org/abs/1506.04878](http://arxiv.org/abs/1506.04878)
 833 | *   github: [https://github.com/Russell91/reinspect](https://github.com/Russell91/reinspect)
 834 | *   ipython notebook: [http://nbviewer.ipython.org/github/Russell91/ReInspect/blob/master/evaluation_reinspect.ipynb](http://nbviewer.ipython.org/github/Russell91/ReInspect/blob/master/evaluation_reinspect.ipynb)
 835 | 
 836 | **Detecting People in Artwork with CNNs**
 837 | 
 838 | *   intro: ECCV 2016 Workshops
 839 | *   arxiv: [https://arxiv.org/abs/1610.08871](https://arxiv.org/abs/1610.08871)
 840 | 
 841 | ## Person Head Detection
 842 | 
 843 | **Context-aware CNNs for person head detection**
 844 | 
 845 | *   arxiv: [http://arxiv.org/abs/1511.07917](http://arxiv.org/abs/1511.07917)
 846 | *   github: [https://github.com/aosokin/cnn_head_detection](https://github.com/aosokin/cnn_head_detection)
 847 | 
 848 | ## Pedestrian Detection
 849 | 
 850 | **Pedestrian Detection aided by Deep Learning Semantic Tasks**
 851 | 
 852 | *   intro: CVPR 2015
 853 | *   project page: [http://mmlab.ie.cuhk.edu.hk/projects/TA-CNN/](http://mmlab.ie.cuhk.edu.hk/projects/TA-CNN/)
 854 | *   paper: [http://arxiv.org/abs/1412.0069](http://arxiv.org/abs/1412.0069)
 855 | 
 856 | **Deep Learning Strong Parts for Pedestrian Detection**
 857 | 
 858 | *   intro: ICCV 2015. CUHK. DeepParts
 859 | *   intro: Achieves an 11.89% average miss rate on the Caltech Pedestrian Dataset
 860 | *   paper: [http://personal.ie.cuhk.edu.hk/~pluo/pdf/tianLWTiccv15.pdf](http://personal.ie.cuhk.edu.hk/~pluo/pdf/tianLWTiccv15.pdf)
 861 | 
 862 | **Deep convolutional neural networks for pedestrian detection**
 863 | 
 864 | *   arxiv: [http://arxiv.org/abs/1510.03608](http://arxiv.org/abs/1510.03608)
 865 | *   github: [https://github.com/DenisTome/DeepPed](https://github.com/DenisTome/DeepPed)
 866 | 
 867 | **Scale-aware Fast R-CNN for Pedestrian Detection**
 868 | 
 869 | *   arxiv: [https://arxiv.org/abs/1510.08160](https://arxiv.org/abs/1510.08160)
 870 | 
 871 | **New algorithm improves speed and accuracy of pedestrian detection**
 872 | 
 873 | *   blog: [http://www.eurekalert.org/pub_releases/2016-02/uoc--nai020516.php](http://www.eurekalert.org/pub_releases/2016-02/uoc--nai020516.php)
 874 | 
 875 | **Pushing the Limits of Deep CNNs for Pedestrian Detection**
 876 | 
 877 | *   intro: “set a new record on the Caltech pedestrian dataset, lowering the log-average miss rate from 11.7% to 8.9%”
 878 | *   arxiv: [http://arxiv.org/abs/1603.04525](http://arxiv.org/abs/1603.04525)
 879 | 
 880 | **A Real-Time Deep Learning Pedestrian Detector for Robot Navigation**
 881 | 
 882 | *   arxiv: [http://arxiv.org/abs/1607.04436](http://arxiv.org/abs/1607.04436)
 883 | 
 884 | **A Real-Time Pedestrian Detector using Deep Learning for Human-Aware Navigation**
 885 | 
 886 | *   arxiv: [http://arxiv.org/abs/1607.04441](http://arxiv.org/abs/1607.04441)
 887 | 
 888 | **Is Faster R-CNN Doing Well for Pedestrian Detection?**
 889 | 
 890 | *   intro: ECCV 2016
 891 | *   arxiv: [http://arxiv.org/abs/1607.07032](http://arxiv.org/abs/1607.07032)
 892 | *   github: [https://github.com/zhangliliang/RPN_BF/tree/RPN-pedestrian](https://github.com/zhangliliang/RPN_BF/tree/RPN-pedestrian)
 893 | 
 894 | **Reduced Memory Region Based Deep Convolutional Neural Network Detection**
 895 | 
 896 | *   intro: IEEE 2016 ICCE-Berlin
 897 | *   arxiv: [http://arxiv.org/abs/1609.02500](http://arxiv.org/abs/1609.02500)
 898 | 
 899 | **Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection**
 900 | 
 901 | *   arxiv: [https://arxiv.org/abs/1610.03466](https://arxiv.org/abs/1610.03466)
 902 | 
 903 | **Multispectral Deep Neural Networks for Pedestrian Detection**
 904 | 
 905 | *   intro: BMVC 2016 oral
 906 | *   arxiv: [https://arxiv.org/abs/1611.02644](https://arxiv.org/abs/1611.02644)
 907 | 
 908 | ## Vehicle Detection
 909 | 
 910 | **DAVE: A Unified Framework for Fast Vehicle Detection and Annotation**
 911 | 
 912 | *   intro: ECCV 2016
 913 | *   arxiv: [http://arxiv.org/abs/1607.04564](http://arxiv.org/abs/1607.04564)
 914 | 
 915 | **Evolving Boxes for fast Vehicle Detection**
 916 | 
 917 | *   arxiv: [https://arxiv.org/abs/1702.00254](https://arxiv.org/abs/1702.00254)
 918 | 
 919 | ## Traffic-Sign Detection
 920 | 
 921 | **Traffic-Sign Detection and Classification in the Wild**
 922 | 
 923 | *   project page(code+dataset): [http://cg.cs.tsinghua.edu.cn/traffic-sign/](http://cg.cs.tsinghua.edu.cn/traffic-sign/)
 924 | *   paper: [http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zhu_Traffic-Sign_Detection_and_CVPR_2016_paper.pdf](http://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Zhu_Traffic-Sign_Detection_and_CVPR_2016_paper.pdf)
 925 | *   code & model: [http://cg.cs.tsinghua.edu.cn/traffic-sign/data_model_code/newdata0411.zip](http://cg.cs.tsinghua.edu.cn/traffic-sign/data_model_code/newdata0411.zip)
 926 | 
 927 | ## Boundary / Edge / Contour Detection
 928 | 
 929 | **Holistically-Nested Edge Detection**
 930 | 
 931 | ![](https://camo.githubusercontent.com/da32e7e3275c2a9693dd2a6925b03a1151e2b098/687474703a2f2f70616765732e756373642e6564752f7e7a74752f6865642e6a7067)
 932 | 
 933 | *   intro: ICCV 2015, Marr Prize
 934 | *   paper: [http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Xie_Holistically-Nested_Edge_Detection_ICCV_2015_paper.pdf](http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Xie_Holistically-Nested_Edge_Detection_ICCV_2015_paper.pdf)
 935 | *   arxiv: [http://arxiv.org/abs/1504.06375](http://arxiv.org/abs/1504.06375)
 936 | *   github: [https://github.com/s9xie/hed](https://github.com/s9xie/hed)
 937 | 
 938 | **Unsupervised Learning of Edges**
 939 | 
 940 | *   intro: CVPR 2016. Facebook AI Research
 941 | *   arxiv: [http://arxiv.org/abs/1511.04166](http://arxiv.org/abs/1511.04166)
 942 | *   blog (Chinese): [http://www.leiphone.com/news/201607/b1trsg9j6GSMnjOP.html](http://www.leiphone.com/news/201607/b1trsg9j6GSMnjOP.html)
 943 | 
 944 | **Pushing the Boundaries of Boundary Detection using Deep Learning**
 945 | 
 946 | *   arxiv: [http://arxiv.org/abs/1511.07386](http://arxiv.org/abs/1511.07386)
 947 | 
 948 | **Convolutional Oriented Boundaries**
 949 | 
 950 | *   intro: ECCV 2016
 951 | *   arxiv: [http://arxiv.org/abs/1608.02755](http://arxiv.org/abs/1608.02755)
 952 | 
 953 | **Convolutional Oriented Boundaries: From Image Segmentation to High-Level Tasks**
 954 | 
 955 | *   project page: [http://www.vision.ee.ethz.ch/~cvlsegmentation/](http://www.vision.ee.ethz.ch/~cvlsegmentation/)
 956 | *   arxiv: [https://arxiv.org/abs/1701.04658](https://arxiv.org/abs/1701.04658)
 957 | 
 958 | **Richer Convolutional Features for Edge Detection**
 959 | 
 960 | *   intro: richer convolutional features (RCF)
 961 | *   arxiv: [https://arxiv.org/abs/1612.02103](https://arxiv.org/abs/1612.02103)
 962 | 
 963 | ## Skeleton Detection
 964 | 
 965 | **Object Skeleton Extraction in Natural Images by Fusing Scale-associated Deep Side Outputs**
 966 | 
 967 | ![](https://camo.githubusercontent.com/88a65f132aa4ae4b0477e3ad02c13cdc498377d9/687474703a2f2f37786e37777a2e636f6d312e7a302e676c622e636c6f7564646e2e636f6d2f44656570536b656c65746f6e2e706e673f696d61676556696577322f322f772f353030)
 968 | 
 969 | *   arxiv: [http://arxiv.org/abs/1603.09446](http://arxiv.org/abs/1603.09446)
 970 | *   github: [https://github.com/zeakey/DeepSkeleton](https://github.com/zeakey/DeepSkeleton)
 971 | 
 972 | **DeepSkeleton: Learning Multi-task Scale-associated Deep Side Outputs for Object Skeleton Extraction in Natural Images**
 973 | 
 974 | *   arxiv: [http://arxiv.org/abs/1609.03659](http://arxiv.org/abs/1609.03659)
 975 | 
 976 | ## Fruit Detection
 977 | 
 978 | **Deep Fruit Detection in Orchards**
 979 | 
 980 | *   arxiv: [https://arxiv.org/abs/1610.03677](https://arxiv.org/abs/1610.03677)
 981 | 
 982 | **Image Segmentation for Fruit Detection and Yield Estimation in Apple Orchards**
 983 | 
 984 | *   intro: The Journal of Field Robotics in May 2016
 985 | *   project page: [http://confluence.acfr.usyd.edu.au/display/AGPub/](http://confluence.acfr.usyd.edu.au/display/AGPub/)
 986 | *   arxiv: [https://arxiv.org/abs/1610.08120](https://arxiv.org/abs/1610.08120)
 987 | 
 988 | ## Others
 989 | 
 990 | **Deep Deformation Network for Object Landmark Localization**
 991 | 
 992 | *   arxiv: [http://arxiv.org/abs/1605.01014](http://arxiv.org/abs/1605.01014)
 993 | 
 994 | **Fashion Landmark Detection in the Wild**
 995 | 
 996 | *   arxiv: [http://arxiv.org/abs/1608.03049](http://arxiv.org/abs/1608.03049)
 997 | 
 998 | **Deep Learning for Fast and Accurate Fashion Item Detection**
 999 | 
1000 | *   intro: Kuznech Inc.
1001 | *   intro: MultiBox and Fast R-CNN
1002 | *   paper: [https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf](https://kddfashion2016.mybluemix.net/kddfashion_finalSubmissions/Deep%20Learning%20for%20Fast%20and%20Accurate%20Fashion%20Item%20Detection.pdf)
1003 | 
1004 | **Visual Relationship Detection with Language Priors**
1005 | 
1006 | *   intro: ECCV 2016 oral
1007 | *   paper: [https://cs.stanford.edu/people/ranjaykrishna/vrd/vrd.pdf](https://cs.stanford.edu/people/ranjaykrishna/vrd/vrd.pdf)
1008 | *   github: [https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection](https://github.com/Prof-Lu-Cewu/Visual-Relationship-Detection)
1009 | 
1010 | **OSMDeepOD - OSM and Deep Learning based Object Detection from Aerial Imagery (formerly known as “OSM-Crosswalk-Detection”)**
1011 | 
1012 | ![](https://raw.githubusercontent.com/geometalab/OSMDeepOD/master/imgs/process.png)
1013 | 
1014 | *   github: [https://github.com/geometalab/OSMDeepOD](https://github.com/geometalab/OSMDeepOD)
1015 | 
1016 | **Selfie Detection by Synergy-Constraint Based Convolutional Neural Network**
1017 | 
1018 | *   intro: IEEE SITIS 2016
1019 | *   arxiv: [https://arxiv.org/abs/1611.04357](https://arxiv.org/abs/1611.04357)
1020 | 
1021 | **Associative Embedding: End-to-End Learning for Joint Detection and Grouping**
1022 | 
1023 | *   arxiv: [https://arxiv.org/abs/1611.05424](https://arxiv.org/abs/1611.05424)
1024 | 
1025 | **Deep Cuboid Detection: Beyond 2D Bounding Boxes**
1026 | 
1027 | *   intro: CMU & Magic Leap
1028 | *   arxiv: [https://arxiv.org/abs/1611.10010](https://arxiv.org/abs/1611.10010)
1029 | 
1030 | **Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection**
1031 | 
1032 | *   arxiv: [https://arxiv.org/abs/1612.03019](https://arxiv.org/abs/1612.03019)
1033 | 
1034 | **Deep Learning Logo Detection with Data Expansion by Synthesising Context**
1035 | 
1036 | *   arxiv: [https://arxiv.org/abs/1612.09322](https://arxiv.org/abs/1612.09322)
1037 | 
1038 | **Pixel-wise Ear Detection with Convolutional Encoder-Decoder Networks**
1039 | 
1040 | *   arxiv: [https://arxiv.org/abs/1702.00307](https://arxiv.org/abs/1702.00307)
1041 | 
1042 | # Object Proposal
1043 | 
1044 | **DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers**
1045 | 
1046 | *   arxiv: [http://arxiv.org/abs/1510.04445](http://arxiv.org/abs/1510.04445)
1047 | *   github: [https://github.com/aghodrati/deepproposal](https://github.com/aghodrati/deepproposal)
1048 | 
1049 | **Scale-aware Pixel-wise Object Proposal Networks**
1050 | 
1051 | *   intro: IEEE Transactions on Image Processing
1052 | *   arxiv: [http://arxiv.org/abs/1601.04798](http://arxiv.org/abs/1601.04798)
1053 | 
1054 | **Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization**
1055 | 
1056 | *   intro: BMVC 2016. AttractioNet
1057 | *   arxiv: [https://arxiv.org/abs/1606.04446](https://arxiv.org/abs/1606.04446)
1058 | *   github: [https://github.com/gidariss/AttractioNet](https://github.com/gidariss/AttractioNet)
1059 | 
1060 | **Learning to Segment Object Proposals via Recursive Neural Networks**
1061 | 
1062 | *   arxiv: [https://arxiv.org/abs/1612.01057](https://arxiv.org/abs/1612.01057)
1063 | 
1064 | # Localization
1065 | 
1066 | **Beyond Bounding Boxes: Precise Localization of Objects in Images**
1067 | 
1068 | *   intro: PhD Thesis
1069 | *   homepage: [http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.html](http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.html)
1070 | *   phd-thesis: [http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.pdf](http://www.eecs.berkeley.edu/Pubs/TechRpts/2015/EECS-2015-193.pdf)
1071 | *   github(“SDS using hypercolumns”): [https://github.com/bharath272/sds](https://github.com/bharath272/sds)
1072 | 
1073 | **Weakly Supervised Object Localization with Multi-fold Multiple Instance Learning**
1074 | 
1075 | *   arxiv: [http://arxiv.org/abs/1503.00949](http://arxiv.org/abs/1503.00949)
1076 | 
1077 | **Weakly Supervised Object Localization Using Size Estimates**
1078 | 
1079 | *   arxiv: [http://arxiv.org/abs/1608.04314](http://arxiv.org/abs/1608.04314)
1080 | 
1081 | **Active Object Localization with Deep Reinforcement Learning**
1082 | 
1083 | *   intro: ICCV 2015
1084 | *   keywords: Markov Decision Process
1085 | *   arxiv: [https://arxiv.org/abs/1511.06015](https://arxiv.org/abs/1511.06015)
1086 | 
1087 | **Localizing objects using referring expressions**
1088 | 
1089 | *   intro: ECCV 2016
1090 | *   keywords: LSTM, multiple instance learning (MIL)
1091 | *   paper: [http://www.umiacs.umd.edu/~varun/files/refexp-ECCV16.pdf](http://www.umiacs.umd.edu/~varun/files/refexp-ECCV16.pdf)
1092 | *   github: [https://github.com/varun-nagaraja/referring-expressions](https://github.com/varun-nagaraja/referring-expressions)
1093 | 
1094 | **LocNet: Improving Localization Accuracy for Object Detection**
1095 | 
1096 | *   arxiv: [http://arxiv.org/abs/1511.07763](http://arxiv.org/abs/1511.07763)
1097 | *   github: [https://github.com/gidariss/LocNet](https://github.com/gidariss/LocNet)
1098 | 
1099 | **Learning Deep Features for Discriminative Localization**
1100 | 
1101 | ![](http://cnnlocalization.csail.mit.edu/framework.jpg)
1102 | 
1103 | *   homepage: [http://cnnlocalization.csail.mit.edu/](http://cnnlocalization.csail.mit.edu/)
1104 | *   arxiv: [http://arxiv.org/abs/1512.04150](http://arxiv.org/abs/1512.04150)
1105 | *   github(Tensorflow): [https://github.com/jazzsaxmafia/Weakly_detector](https://github.com/jazzsaxmafia/Weakly_detector)
1106 | *   github: [https://github.com/metalbubble/CAM](https://github.com/metalbubble/CAM)
1107 | *   github: [https://github.com/tdeboissiere/VGG16CAM-keras](https://github.com/tdeboissiere/VGG16CAM-keras)
1108 | 
1109 | **ContextLocNet: Context-Aware Deep Network Models for Weakly Supervised Localization**
1110 | 
1111 | ![](http://www.di.ens.fr/willow/research/contextlocnet/model.png)
1112 | 
1113 | *   intro: ECCV 2016
1114 | *   project page: [http://www.di.ens.fr/willow/research/contextlocnet/](http://www.di.ens.fr/willow/research/contextlocnet/)
1115 | *   arxiv: [http://arxiv.org/abs/1609.04331](http://arxiv.org/abs/1609.04331)
1116 | *   github: [https://github.com/vadimkantorov/contextlocnet](https://github.com/vadimkantorov/contextlocnet)
1117 | 
1118 | # Tutorials / Talks
1119 | 
1120 | **Convolutional Feature Maps: Elements of efficient (and accurate) CNN-based object detection**
1121 | 
1122 | *   slides: [http://research.microsoft.com/en-us/um/people/kahe/iccv15tutorial/iccv2015_tutorial_convolutional_feature_maps_kaiminghe.pdf](http://research.microsoft.com/en-us/um/people/kahe/iccv15tutorial/iccv2015_tutorial_convolutional_feature_maps_kaiminghe.pdf)
1123 | 
1124 | **Towards Good Practices for Recognition & Detection**
1125 | 
1126 | *   intro: Hikvision Research Institute. Supervised Data Augmentation (SDA)
1127 | *   slides: [http://image-net.org/challenges/talks/2016/Hikvision_at_ImageNet_2016.pdf](http://image-net.org/challenges/talks/2016/Hikvision_at_ImageNet_2016.pdf)
1128 | 
1129 | # Projects
1130 | 
1131 | **TensorBox: a simple framework for training neural networks to detect objects in images**
1132 | 
1133 | *   intro: “The basic model implements the simple and robust GoogLeNet-OverFeat algorithm. We additionally provide an implementation of the [ReInspect](https://github.com/Russell91/ReInspect/) algorithm”
1134 | *   github: [https://github.com/Russell91/TensorBox](https://github.com/Russell91/TensorBox)
1135 | 
1136 | **Object detection in torch: Implementation of some object detection frameworks in torch**
1137 | 
1138 | *   github: [https://github.com/fmassa/object-detection.torch](https://github.com/fmassa/object-detection.torch)
1139 | 
1140 | **Using DIGITS to train an Object Detection network**
1141 | 
1142 | *   github: [https://github.com/NVIDIA/DIGITS/blob/master/examples/object-detection/README.md](https://github.com/NVIDIA/DIGITS/blob/master/examples/object-detection/README.md)
1143 | 
1144 | **FCN-MultiBox Detector**
1145 | 
1146 | *   intro: Fully convolutional MultiBox Detector (like SSD), implemented in Torch.
1147 | *   github: [https://github.com/teaonly/FMD.torch](https://github.com/teaonly/FMD.torch)
1148 | 
1149 | **KittiBox: A car detection model implemented in Tensorflow.**
1150 | 
1151 | *   keywords: MultiNet
1152 | *   intro: KittiBox is a collection of scripts to train our model FastBox on the Kitti Object Detection Dataset
1153 | *   github: [https://github.com/MarvinTeichmann/KittiBox](https://github.com/MarvinTeichmann/KittiBox)
1154 | 
1155 | # Blogs
1156 | 
1157 | **Convolutional Neural Networks for Object Detection**
1158 | 
1159 | *   blog: [http://rnd.azoft.com/convolutional-neural-networks-object-detection/](http://rnd.azoft.com/convolutional-neural-networks-object-detection/)
1160 | 
1161 | **Introducing automatic object detection to visual search (Pinterest)**
1162 | 
1163 | *   keywords: Faster R-CNN
1164 | *   blog: [https://engineering.pinterest.com/blog/introducing-automatic-object-detection-visual-search](https://engineering.pinterest.com/blog/introducing-automatic-object-detection-visual-search)
1165 | *   demo: [https://engineering.pinterest.com/sites/engineering/files/Visual%20Search%20V1%20-%20Video.mp4](https://engineering.pinterest.com/sites/engineering/files/Visual%20Search%20V1%20-%20Video.mp4)
1166 | *   review: [https://news.developer.nvidia.com/pinterest-introduces-the-future-of-visual-search/?mkt_tok=eyJpIjoiTnpaa01UWXpPRE0xTURFMiIsInQiOiJJRjcybjkwTmtmallORUhLOFFFODBDclFqUlB3SWlRVXJXb1MrQ013TDRIMGxLQWlBczFIeWg0TFRUdnN2UHY2ZWFiXC9QQVwvQzBHM3B0UzBZblpOSmUyU1FcLzNPWXI4cml2VERwTTJsOFwvOEk9In0%3D](https://news.developer.nvidia.com/pinterest-introduces-the-future-of-visual-search/?mkt_tok=eyJpIjoiTnpaa01UWXpPRE0xTURFMiIsInQiOiJJRjcybjkwTmtmallORUhLOFFFODBDclFqUlB3SWlRVXJXb1MrQ013TDRIMGxLQWlBczFIeWg0TFRUdnN2UHY2ZWFiXC9QQVwvQzBHM3B0UzBZblpOSmUyU1FcLzNPWXI4cml2VERwTTJsOFwvOEk9In0%3D)
1167 | 
1168 | **Deep Learning for Object Detection with DIGITS**
1169 | 
1170 | *   blog: [https://devblogs.nvidia.com/parallelforall/deep-learning-object-detection-digits/](https://devblogs.nvidia.com/parallelforall/deep-learning-object-detection-digits/)
1171 | 
1172 | **Analyzing The Papers Behind Facebook’s Computer Vision Approach**
1173 | 
1174 | *   keywords: DeepMask, SharpMask, MultiPathNet
1175 | *   blog: [https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook’s-Computer-Vision-Approach/](https://adeshpande3.github.io/adeshpande3.github.io/Analyzing-the-Papers-Behind-Facebook's-Computer-Vision-Approach/)
1176 | 
1177 | **Easily Create High Quality Object Detectors with Deep Learning**
1178 | 
1179 | *   intro: dlib v19.2
1180 | *   blog: [http://blog.dlib.net/2016/10/easily-create-high-quality-object.html](http://blog.dlib.net/2016/10/easily-create-high-quality-object.html)
1181 | 
1182 | **How to Train a Deep-Learned Object Detection Model in the Microsoft Cognitive Toolkit**
1183 | 
1184 | *   blog: [https://blogs.technet.microsoft.com/machinelearning/2016/10/25/how-to-train-a-deep-learned-object-detection-model-in-cntk/](https://blogs.technet.microsoft.com/machinelearning/2016/10/25/how-to-train-a-deep-learned-object-detection-model-in-cntk/)
1185 | *   github: [https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN](https://github.com/Microsoft/CNTK/tree/master/Examples/Image/Detection/FastRCNN)
1186 | 
1187 | **Object Detection in Satellite Imagery, a Low Overhead Approach**
1188 | 
1189 | *   part 1: [https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-i-cbd96154a1b7#.2csh4iwx9](https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-i-cbd96154a1b7#.2csh4iwx9)
1190 | *   part 2: [https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-ii-893f40122f92#.f9b7dgf64](https://medium.com/the-downlinq/object-detection-in-satellite-imagery-a-low-overhead-approach-part-ii-893f40122f92#.f9b7dgf64)
1191 | 
1192 | **You Only Look Twice — Multi-Scale Object Detection in Satellite Imagery With Convolutional Neural Networks**
1193 | 
1194 | *   part 1: [https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-38dad1cf7571#.fmmi2o3of](https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-38dad1cf7571#.fmmi2o3of)
1195 | *   part 2: [https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-34f72f659588#.nwzarsz1t](https://medium.com/the-downlinq/you-only-look-twice-multi-scale-object-detection-in-satellite-imagery-with-convolutional-neural-34f72f659588#.nwzarsz1t)
1196 | 
1197 | **Faster R-CNN Pedestrian and Car Detection**
1198 | 
1199 | *   blog: [https://bigsnarf.wordpress.com/2016/11/07/faster-r-cnn-pedestrian-and-car-detection/](https://bigsnarf.wordpress.com/2016/11/07/faster-r-cnn-pedestrian-and-car-detection/)
1200 | *   ipynb: [https://gist.github.com/bigsnarfdude/2f7b2144065f6056892a98495644d3e0#file-demo_faster_rcnn_notebook-ipynb](https://gist.github.com/bigsnarfdude/2f7b2144065f6056892a98495644d3e0#file-demo_faster_rcnn_notebook-ipynb)
1201 | *   github: [https://github.com/bigsnarfdude/Faster-RCNN_TF](https://github.com/bigsnarfdude/Faster-RCNN_TF)
1202 | 
1203 | **Small U-Net for vehicle detection**
1204 | 
1205 | *   blog: [https://medium.com/@vivek.yadav/small-u-net-for-vehicle-detection-9eec216f9fd6#.md4u80kad](https://medium.com/@vivek.yadav/small-u-net-for-vehicle-detection-9eec216f9fd6#.md4u80kad)
1206 | 


--------------------------------------------------------------------------------