English | 简体中文
<a href="https://github.com/lyuwenyu/RT-DETR/blob/main/LICENSE">
<img alt="license" src="https://img.shields.io/github/license/lyuwenyu/RT-DETR">
</a>
<a href="https://github.com/lyuwenyu/RT-DETR/pulls">
<img alt="prs" src="https://img.shields.io/github/issues-pr/lyuwenyu/RT-DETR">
</a>
<a href="https://github.com/lyuwenyu/RT-DETR/issues">
<img alt="issues" src="https://img.shields.io/github/issues/lyuwenyu/RT-DETR?color=pink">
</a>
<a href="https://github.com/lyuwenyu/RT-DETR">
<img alt="stars" src="https://img.shields.io/github/stars/lyuwenyu/RT-DETR">
</a>
<a href="https://arxiv.org/abs/2304.08069">
<img alt="arXiv" src="https://img.shields.io/badge/arXiv-2304.08069-red">
</a>
<a href="mailto:lyuwenyu@foxmail.com">
<img alt="email" src="https://img.shields.io/badge/contact_me-email-yellow">
</a>
This is the official implementation of the paper "DETRs Beat YOLOs on Real-time Object Detection".
`remap_mscoco_category` option to facilitate training on custom datasets; see details in the Train custom data part. #81.

| Model | Epoch | Input shape | Dataset | $AP^{val}$ | $AP^{val}_{50}$ | Params(M) | FLOPs(G) | T4 TensorRT FP16(FPS) |
|---|---|---|---|---|---|---|---|---|
| RT-DETR-R18 | 6x | 640 | COCO | 46.5 | 63.8 | 20 | 60 | 217 |
| RT-DETR-R34 | 6x | 640 | COCO | 48.9 | 66.8 | 31 | 92 | 161 |
| RT-DETR-R50-m | 6x | 640 | COCO | 51.3 | 69.6 | 36 | 100 | 145 |
| RT-DETR-R50 | 6x | 640 | COCO | 53.1 | 71.3 | 42 | 136 | 108 |
| RT-DETR-R101 | 6x | 640 | COCO | 54.3 | 72.7 | 76 | 259 | 74 |
| RT-DETR-HGNetv2-L | 6x | 640 | COCO | 53.0 | 71.6 | 32 | 110 | 114 |
| RT-DETR-HGNetv2-X | 6x | 640 | COCO | 54.8 | 73.1 | 67 | 234 | 74 |
| RT-DETR-R18 | 5x | 640 | COCO + Objects365 | 49.2 | 66.6 | 20 | 60 | 217 |
| RT-DETR-R50 | 2x | 640 | COCO + Objects365 | 55.3 | 73.4 | 42 | 136 | 108 |
| RT-DETR-R101 | 2x | 640 | COCO + Objects365 | 56.2 | 74.6 | 76 | 259 | 74 |
Notes:
- `COCO + Objects365` in the table means the model was finetuned on COCO starting from weights pretrained on Objects365.
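
The `remap_mscoco_category` option mentioned above concerns how dataset category ids are turned into training labels. COCO's 80 categories use ids between 1 and 90 with gaps, so a detector typically remaps them to contiguous labels. The sketch below illustrates that idea for a COCO-style custom dataset. It assumes this is what the option controls, and the helper names `build_category_to_label` and `remap_annotations` are hypothetical, not part of the RT-DETR codebase.

```python
# Hypothetical sketch: map sparse dataset category ids to contiguous labels.
# This illustrates the general idea only; it is not the repository's implementation.

def build_category_to_label(category_ids):
    """Return {category_id: label} with labels 0..N-1 assigned in sorted id order."""
    return {cat_id: label for label, cat_id in enumerate(sorted(set(category_ids)))}


def remap_annotations(annotations, category_to_label):
    """Return copies of COCO-style annotations with category_id replaced by its label."""
    return [
        {**ann, "category_id": category_to_label[ann["category_id"]]}
        for ann in annotations
    ]


# Toy annotations whose ids have gaps, similar in spirit to the gaps in COCO ids.
annotations = [
    {"bbox": [10, 20, 30, 40], "category_id": 1},
    {"bbox": [5, 5, 50, 60], "category_id": 3},
    {"bbox": [0, 0, 15, 15], "category_id": 7},
]

category_to_label = build_category_to_label(a["category_id"] for a in annotations)
remapped = remap_annotations(annotations, category_to_label)
print(category_to_label)
```

A custom dataset whose ids already run from 0 to N-1 would not need such a remapping, which is presumably why the behaviour is exposed as an option.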
We propose a Real-Time DEtection TRansformer (RT-DETR, aka RTDETR), to the best of our knowledge the first real-time end-to-end object detector. RT-DETR-R50 / R101 achieve 53.1% / 54.3% AP on COCO and 108 / 74 FPS on a T4 GPU, outperforming previous advanced YOLOs in both speed and accuracy. Furthermore, RT-DETR-R50 outperforms DINO-R50 by 2.2% AP in accuracy and by about 21 times in FPS. After pre-training on Objects365, RT-DETR-R50 / R101 achieve 55.3% / 56.2% AP.
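
The FPS figures quoted above can be read as per-image latency, and the DINO-R50 comparison implies approximate baseline numbers. The following sketch only does that arithmetic on the values stated in this README. It assumes FPS means images per second at batch size 1, and the DINO-R50 values are derived from the stated gaps (2.2% AP, about 21 times in FPS) rather than measured here.

```python
# Minimal sketch: convert reported throughput to latency and derive the implied baseline.

def fps_to_latency_ms(fps):
    """Per-image latency in milliseconds, assuming one image per inference call."""
    return 1000.0 / fps


# Values taken from the text above (T4 GPU, COCO).
reported_fps = {"RT-DETR-R50": 108, "RT-DETR-R101": 74}
latency_ms = {name: round(fps_to_latency_ms(fps), 2) for name, fps in reported_fps.items()}

# Implied DINO-R50 figures, derived from the stated gaps; approximate only.
implied_dino_ap = round(53.1 - 2.2, 1)
implied_dino_fps = round(108 / 21, 1)

print(latency_ms)
print(implied_dino_ap, implied_dino_fps)
```

Under these assumptions, RT-DETR-R50 takes roughly 9.3 ms per image and RT-DETR-R101 roughly 13.5 ms.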
If you use RT-DETR in your work, please cite it with the following BibTeX entry:
@misc{lv2023detrs,
title={DETRs Beat YOLOs on Real-time Object Detection},
author={Yian Zhao and Wenyu Lv and Shangliang Xu and Jinman Wei and Guanzhong Wang and Qingqing Dang and Yi Liu and Jie Chen},
year={2023},
eprint={2304.08069},
archivePrefix={arXiv},
primaryClass={cs.CV}
}