A paper list of object detection using deep learning. I wrote this page with reference to this survey paper and a lot of searching.
Last updated: 2020/09/22
2018/9/18 - update all recent papers and add a diagram of the history of object detection using deep learning.
2018/9/26 - update code links for papers (official and unofficial).
2018/october - update 5 papers and performance table.
2018/november - update 9 papers.
2018/december - update 8 papers and performance table and add new diagram (2019 version!!).
2019/january - update 4 papers and add commonly used datasets.
2019/february - update 3 papers.
2019/march - update figure and code links.
2019/april - remove authors' names and update ICLR 2019 & CVPR 2019 papers.
2019/may - update CVPR 2019 papers.
2019/june - update CVPR 2019 papers and dataset paper.
2019/july - update BMVC 2019 papers and some of ICCV 2019 papers.
2019/september - update NeurIPS 2019 papers and ICCV 2019 papers.
2019/november - update some of AAAI 2020 papers and other papers.
2020/january - update ICLR 2020 papers and other papers.
2020/may - update CVPR 2020 papers and other papers.
2020/june - update arxiv papers.
2020/august - update paper links.
Papers highlighted in red are the ones I consider "must-read". This is only my personal opinion and the other papers are important too, so I recommend reading them if you have time.
The FPS (speed) figures depend on the hardware used (e.g. CPU, GPU, RAM), so they are hard to compare fairly. The fair approach would be to measure all models on hardware with equivalent specifications, but that is very difficult and time-consuming.
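As a rough illustration of what a like-for-like speed measurement involves, here is a minimal sketch of an FPS benchmark run on one machine. The names `measure_fps`, `infer` and `sample` are hypothetical and not taken from any listed paper or codebase. `infer` stands for any detector's inference call. GPU frameworks typically run asynchronously, so a real benchmark would also need to synchronize the device before reading the clock, which this sketch does not do.

```python
import time
from typing import Any, Callable


def measure_fps(infer: Callable[[Any], Any], sample: Any,
                warmup: int = 10, runs: int = 100) -> float:
    """Return the average frames per second of `infer` on one fixed input.

    `infer` is a hypothetical stand-in for a detector's forward pass.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")

    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for _ in range(warmup):
        infer(sample)

    start = time.perf_counter()
    for _ in range(runs):
        infer(sample)
    elapsed = time.perf_counter() - start

    # Guard against a zero reading on very coarse clocks.
    return runs / max(elapsed, 1e-12)


if __name__ == "__main__":
    # Dummy workload standing in for a real model.
    fps = measure_fps(lambda n: sum(range(n)), 10_000)
    print(f"{fps:.1f} FPS")
```

Numbers obtained this way are only comparable between models when the hardware, input size and batch size are all the same.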
| Detector | VOC07 (mAP@IoU=0.5) | VOC12 (mAP@IoU=0.5) | COCO (mAP@IoU=0.5:0.95) | Published In |
|---|---|---|---|---|
| R-CNN | 58.5 | - | - | CVPR'14 |
| SPP-Net | 59.2 | - | - | ECCV'14 |
| MR-CNN | 78.2 (07+12) | 73.9 (07+12) | - | ICCV'15 |
| Fast R-CNN | 70.0 (07+12) | 68.4 (07++12) | 19.7 | ICCV'15 |
| Faster R-CNN | 73.2 (07+12) | 70.4 (07++12) | 21.9 | NIPS'15 |
| YOLO v1 | 66.4 (07+12) | 57.9 (07++12) | - | CVPR'16 |
| G-CNN | 66.8 | 66.4 (07+12) | - | CVPR'16 |
| AZNet | 70.4 | - | 22.3 | CVPR'16 |
| ION | 80.1 | 77.9 | 33.1 | CVPR'16 |
| HyperNet | 76.3 (07+12) | 71.4 (07++12) | - | CVPR'16 |
| OHEM | 78.9 (07+12) | 76.3 (07++12) | 22.4 | CVPR'16 |
| MPN | - | - | 33.2 | BMVC'16 |
| SSD | 76.8 (07+12) | 74.9 (07++12) | 31.2 | ECCV'16 |
| GBDNet | 77.2 (07+12) | - | 27.0 | ECCV'16 |
| CPF | 76.4 (07+12) | 72.6 (07++12) | - | ECCV'16 |
| R-FCN | 79.5 (07+12) | 77.6 (07++12) | 29.9 | NIPS'16 |
| DeepID-Net | 69.0 | - | - | TPAMI'16 |
| NoC | 71.6 (07+12) | 68.8 (07+12) | 27.2 | TPAMI'16 |
| DSSD | 81.5 (07+12) | 80.0 (07++12) | 33.2 | arXiv'17 |
| TDM | - | - | 37.3 | CVPR'17 |
| FPN | - | - | 36.2 | CVPR'17 |
| YOLO v2 | 78.6 (07+12) | 73.4 (07++12) | - | CVPR'17 |
| RON | 77.6 (07+12) | 75.4 (07++12) | 27.4 | CVPR'17 |
| DeNet | 77.1 (07+12) | 73.9 (07++12) | 33.8 | ICCV'17 |
| CoupleNet | 82.7 (07+12) | 80.4 (07++12) | 34.4 | ICCV'17 |
| RetinaNet | - | - | 39.1 | ICCV'17 |
| DSOD | 77.7 (07+12) | 76.3 (07++12) | - | ICCV'17 |
| SMN | 70.0 | - | - | ICCV'17 |
| Light-Head R-CNN | - | - | 41.5 | arXiv'17 |
| YOLO v3 | - | - | 33.0 | arXiv'18 |
| SIN | 76.0 (07+12) | 73.1 (07++12) | 23.2 | CVPR'18 |
| STDN | 80.9 (07+12) | - | - | CVPR'18 |
| RefineDet | 83.8 (07+12) | 83.5 (07++12) | 41.8 | CVPR'18 |
| SNIP | - | - | 45.7 | CVPR'18 |
| Relation-Network | - | - | 32.5 | CVPR'18 |
| Cascade R-CNN | - | - | 42.8 | CVPR'18 |
| MLKP | 80.6 (07+12) | 77.2 (07++12) | 28.6 | CVPR'18 |
| Fitness-NMS | - | - | 41.8 | CVPR'18 |
| RFBNet | 82.2 (07+12) | - | - | ECCV'18 |
| CornerNet | - | - | 42.1 | ECCV'18 |
| PFPNet | 84.1 (07+12) | 83.7 (07++12) | 39.4 | ECCV'18 |
| Pelee | 70.9 (07+12) | - | - | NIPS'18 |
| HKRM | 78.8 (07+12) | - | 37.8 | NIPS'18 |
| M2Det | - | - | 44.2 | AAAI'19 |
| R-DAD | 81.2 (07++12) | 82.0 (07++12) | 43.1 | AAAI'19 |
| ScratchDet | 84.1 (07++12) | 83.6 (07++12) | 39.1 | CVPR'19 |
| Libra R-CNN | - | - | 43.0 | CVPR'19 |
| Reasoning-RCNN | 82.5 (07++12) | - | 43.2 | CVPR'19 |
| FSAF | - | - | 44.6 | CVPR'19 |
| AmoebaNet + NAS-FPN | - | - | 47.0 | CVPR'19 |
| Cascade-RetinaNet | - | - | 41.1 | CVPR'19 |
| HTC | - | - | 47.2 | CVPR'19 |
| TridentNet | - | - | 48.4 | ICCV'19 |
| DAFS | 85.3 (07+12) | 83.1 (07++12) | 40.5 | ICCV'19 |
| Auto-FPN | 81.8 (07++12) | - | 40.5 | ICCV'19 |
| FCOS | - | - | 44.7 | ICCV'19 |
| FreeAnchor | - | - | 44.8 | NeurIPS'19 |
| DetNAS | 81.5 (07++12) | - | 42.0 | NeurIPS'19 |
| NATS | - | - | 42.0 | NeurIPS'19 |
| AmoebaNet + NAS-FPN + AA | - | - | 50.7 | arXiv'19 |
| SpineNet | - | - | 52.1 | arXiv'19 |
| CBNet | - | - | 53.3 | AAAI'20 |
| EfficientDet | - | - | 52.6 | CVPR'20 |
| DetectoRS | - | - | 54.7 | arXiv'20 |
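The column headers in the table above refer to IoU (intersection over union) thresholds. On VOC a detection counts as correct when its IoU with a ground-truth box is at least 0.5, while the COCO metric averages AP over ten thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU computation and the threshold list, not a full mAP evaluation. The function names and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]

# IoU thresholds used by the COCO-style mAP@IoU=0.5:0.95 metric.
COCO_IOU_THRESHOLDS: List[float] = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap, clamped to zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def thresholds_passed(iou_value: float) -> List[float]:
    """COCO thresholds at which a detection with this IoU counts as a match."""
    return [t for t in COCO_IOU_THRESHOLDS if iou_value >= t]


if __name__ == "__main__":
    overlap = iou((0, 0, 10, 10), (2, 0, 12, 10))
    print(overlap, thresholds_passed(overlap))
```

A detection that passes at IoU 0.5 can still fail at the stricter thresholds, which is one reason the COCO numbers are much lower than the VOC numbers for the same detector.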
[R-CNN] Rich feature hierarchies for accurate object detection and semantic segmentation | [CVPR' 14] |[pdf] [official code - caffe]
[OverFeat] OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks | [ICLR' 14] |[pdf] [official code - torch]
[MultiBox] Scalable Object Detection using Deep Neural Networks | [CVPR' 14] |[pdf]
[SPP-Net] Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition | [ECCV' 14] |[pdf] [official code - caffe] [unofficial code - keras] [unofficial code - tensorflow]
Improving Object Detection with Deep Convolutional Networks via Bayesian Optimization and Structured Prediction | [CVPR' 15] |[pdf] [official code - matlab]
[MR-CNN] Object detection via a multi-region & semantic segmentation-aware CNN model | [ICCV' 15] |[pdf] [official code - caffe]
[DeepBox] DeepBox: Learning Objectness with Convolutional Networks | [ICCV' 15] |[pdf] [official code - caffe]
[AttentionNet] AttentionNet: Aggregating Weak Directions for Accurate Object Detection | [ICCV' 15] |[pdf]
[Fast R-CNN] Fast R-CNN | [ICCV' 15] |[pdf] [official code - caffe]
[DeepProposal] DeepProposal: Hunting Objects by Cascading Deep Convolutional Layers | [ICCV' 15] |[pdf] [official code - matconvnet]
[Faster R-CNN, RPN] Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks | [NIPS' 15] |[pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch]
[YOLO v1] You Only Look Once: Unified, Real-Time Object Detection | [CVPR' 16] |[pdf] [official code - c]
[G-CNN] G-CNN: an Iterative Grid Based Object Detector | [CVPR' 16] |[pdf]
[AZNet] Adaptive Object Detection Using Adjacency and Zoom Prediction | [CVPR' 16] |[pdf]
[ION] Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks | [CVPR' 16] |[pdf]
[HyperNet] HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection | [CVPR' 16] |[pdf]
[OHEM] Training Region-based Object Detectors with Online Hard Example Mining | [CVPR' 16] |[pdf] [official code - caffe]
[CRAFT] CRAFT Objects from Images | [CVPR' 16] |[pdf] [official code - caffe]
[MPN] A MultiPath Network for Object Detection | [BMVC' 16] |[pdf] [official code - torch]
[SSD] SSD: Single Shot MultiBox Detector | [ECCV' 16] |[pdf] [official code - caffe] [unofficial code - tensorflow] [unofficial code - pytorch]
[GBDNet] Crafting GBD-Net for Object Detection | [ECCV' 16] |[pdf] [official code - caffe]
[CPF] Contextual Priming and Feedback for Faster R-CNN | [ECCV' 16] |[[pdf]](https://pdfs.semanticscholar.org/40e