UAV-DETR: An Enhanced RT-DETR Architecture for Efficient Small Object Detection in UAV Imagery
College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China.
Sensors (Basel, Switzerland) | August 14, 2025
Summary
UAV-DETR improves feature perception and spatial alignment for object detection in aerial imagery, detecting small objects more accurately than the RT-DETR baseline while using fewer parameters.
Area of Science:
- Computer Vision
- Artificial Intelligence
- Machine Learning
Background:
- Aerial imagery from unmanned aerial vehicles (UAVs) presents unique challenges for object detection, including feature degradation and spatial-contextual misalignment.
- Low resolution, complex backgrounds, and dynamic shooting conditions in UAV data hinder accurate detection of small objects.
Purpose of the Study:
- To propose UAV-DETR, an enhanced Transformer-based object detection model specifically designed to address the challenges of aerial imagery.
- To improve feature perception, semantic representation, and spatial alignment for more robust small-object detection in UAV-acquired data.
Main Methods:
- UAV-DETR extends the RT-DETR framework with three novel modules: Channel-Aware Sensing (CAS), the Scale-Optimized Enhancement Pyramid (SOEP), and the Context-Spatial Alignment Module (CSAM).
- CAS refines the backbone for better multi-scale feature perception.
- SOEP enriches the semantics of shallow-layer features through channel-weighted fusion.
- CSAM optimizes the hybrid encoder for contextual and spatial calibration, improving cross-scale integration.
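
To make the idea of channel-weighted fusion concrete, the following is a minimal, generic sketch in plain Python. It is not the paper's SOEP implementation, whose internals this summary does not describe. Everything here is an assumption made for illustration: the function names, the nearest-neighbour 2x upsampling, and the gate, which is a plain sigmoid of each channel's mean standing in for what would normally be a learned layer.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def global_avg_pool(feature):
    """feature: list of channels, each an H x W list of lists. Returns one mean per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]

def upsample_nearest_2x(feature):
    """Double the height and width of every channel by repeating each value."""
    out = []
    for ch in feature:
        up = []
        for row in ch:
            wide = [v for v in row for _ in range(2)]
            up.append(wide)
            up.append(list(wide))
        out.append(up)
    return out

def channel_weighted_fusion(shallow, deep):
    """Blend a shallow (high-resolution) map with a deep (low-resolution) map.

    Hypothetical gating rule: gate = sigmoid(mean of the upsampled deep channel).
    A real module would learn this gate; the fixed rule only shows the data flow.
    """
    deep_up = upsample_nearest_2x(deep)
    gates = [sigmoid(m) for m in global_avg_pool(deep_up)]
    fused = []
    for s_ch, d_ch, g in zip(shallow, deep_up, gates):
        fused.append([[g * s + (1.0 - g) * d for s, d in zip(s_row, d_row)]
                      for s_row, d_row in zip(s_ch, d_ch)])
    return fused, gates

# Illustrative inputs: 2 channels, shallow map 4x4, deep map 2x2.
shallow = [[[1.0] * 4 for _ in range(4)], [[2.0] * 4 for _ in range(4)]]
deep = [[[0.0, 0.0], [0.0, 0.0]], [[4.0, 4.0], [4.0, 4.0]]]
fused, gates = channel_weighted_fusion(shallow, deep)
```

The point of the sketch is only that each channel gets its own scalar weight before the two resolutions are combined, so channels judged more informative contribute more to the fused shallow-resolution map.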
Main Results:
- UAV-DETR demonstrates superior small-object detection performance on the VisDrone2019 dataset, achieving an mAP@0.5 of 51.6%.
- The model outperforms the baseline RT-DETR by 3.5 percentage points in mAP@0.5.
- UAV-DETR uses 16.8 million parameters versus 19.8 million for RT-DETR, a reduction of about 15%, indicating improved efficiency.
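
The mAP@0.5 metric counts a predicted box as correct only when its intersection over union (IoU) with a ground-truth box is at least 0.5. The short sketch below computes IoU for axis-aligned boxes and shows why this threshold is harder to meet for small objects: the same localization error costs a small box far more overlap. The box sizes and the 4-pixel shift are illustrative assumptions, not values from the study.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def shifted(box, dx):
    """Return the box moved dx pixels to the right."""
    return (box[0] + dx, box[1], box[2] + dx, box[3])

small = (0.0, 0.0, 10.0, 10.0)     # 10 x 10 pixel object
large = (0.0, 0.0, 100.0, 100.0)   # 100 x 100 pixel object

iou_small = iou(small, shifted(small, 4.0))  # 60 / 140, about 0.43
iou_large = iou(large, shifted(large, 4.0))  # 9600 / 10400, about 0.92

# Under the IoU >= 0.5 rule, the same 4-pixel error is a miss for the
# small object but still a match for the large one.
small_is_match = iou_small >= 0.5
large_is_match = iou_large >= 0.5
```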
Conclusions:
- The proposed UAV-DETR effectively mitigates technical challenges in UAV-based object detection, particularly for small objects.
- The novel modules (CAS, SOEP, CSAM) significantly enhance feature perception, semantic representation, and spatial-contextual alignment.
- UAV-DETR offers a promising solution for accurate and efficient object detection in complex aerial scenarios.