Relation DETR+: Exploring Explicit Position Relation Prior for Dense Prediction.

Xiuquan Hou, Meiqin Liu, Shaoyi Du

IEEE Transactions on Pattern Analysis and Machine Intelligence

June 29, 2026

Summary

This summary is machine-generated.

This study improves the performance of DETR (DEtection TRansformer) models by introducing position relation embeddings that refine the attention mechanism. The resulting Relation-DETR+ framework improves convergence and efficiency across several dense prediction tasks.

Area of Science:

Computer Vision
Deep Learning
Artificial Intelligence

Background:

Object detection models like DETR (DEtection TRansformer) converge slowly because self-attention lacks a structural bias on its input.
Existing methods struggle to effectively integrate spatial relationships, limiting performance in dense prediction tasks.

Purpose of the Study:

To propose a novel scheme, Relation-DETR+, for enhancing the convergence and performance of DETR-based models.
To address the slow convergence issue by incorporating explicit position relation priors into the attention mechanism.
To develop a unified framework capable of handling multiple dense prediction tasks including object detection, semantic segmentation, instance segmentation, and panoptic segmentation.
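
As a rough illustration of what incorporating a position relation prior into attention can look like, one common formulation adds a pairwise bias term to the attention logits. The symbols below are assumptions for illustration and are not taken from the paper: Q, K, V are the query, key, and value matrices, d is the feature dimension, and R is a hypothetical N-by-N matrix of relation scores derived from object positions.

```latex
\mathrm{Attn}(Q, K, V) = \mathrm{softmax}\!\left( \frac{Q K^{\top}}{\sqrt{d}} + R \right) V,
\qquad R \in \mathbb{R}^{N \times N}
```

Setting R = 0 recovers standard scaled dot-product attention, so under this reading the prior only shifts how attention is distributed among pairs of objects.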

Main Methods:

Incorporation of a position relation prior as an attention bias to augment object detection.
Introduction of an encoder that constructs position relation embeddings for progressive attention refinement.
Extension of the DETR pipeline into a contrastive relation pipeline to manage prediction conflicts.
Alleviation of pattern collapse in multi-layer relations via explicit layer-wise encoding and gated relation modulation.
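
The first and last methods above can be sketched in a few lines of NumPy. This is a minimal single-head illustration under stated assumptions, not the authors' implementation: the pairwise box features, the projection vector `proj`, the scalar `gate`, and every function name here are hypothetical choices made for the example. It shows a relation score computed from box geometry being added to the attention logits before the softmax, with a gate that scales how strongly the prior applies.

```python
import numpy as np


def box_relation_features(boxes, eps=1e-6):
    """Pairwise geometry features for boxes given as (cx, cy, w, h).

    Returns an array of shape (N, N, 4). The choice of features is an
    illustrative assumption, not something stated in the summary.
    """
    cx, cy, w, h = (boxes[:, i] for i in range(4))
    dx = np.log(np.abs(cx[:, None] - cx[None, :]) / (w[:, None] + eps) + 1.0)
    dy = np.log(np.abs(cy[:, None] - cy[None, :]) / (h[:, None] + eps) + 1.0)
    dw = np.log((w[:, None] + eps) / (w[None, :] + eps))
    dh = np.log((h[:, None] + eps) / (h[None, :] + eps))
    return np.stack([dx, dy, dw, dh], axis=-1)


def relation_bias_from_boxes(boxes, proj, gate=1.0):
    """Project the (N, N, 4) features to one scalar bias per box pair.

    `proj` (shape (4,)) and `gate` (a scalar in [0, 1]) are hypothetical
    stand-ins for learned parameters.
    """
    return gate * (box_relation_features(boxes) @ proj)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def biased_self_attention(queries, keys, values, relation_bias):
    """Scaled dot-product attention with an additive pairwise bias."""
    d = queries.shape[-1]
    logits = queries @ keys.T / np.sqrt(d) + relation_bias
    weights = softmax(logits, axis=-1)
    return weights @ values, weights


# Toy usage: two nearby boxes and one distant box.
rng = np.random.default_rng(0)
boxes = np.array([
    [0.20, 0.20, 0.10, 0.10],
    [0.25, 0.20, 0.10, 0.10],
    [0.80, 0.80, 0.30, 0.30],
])
q = rng.normal(size=(3, 8))
k = rng.normal(size=(3, 8))
v = rng.normal(size=(3, 8))

# Hypothetical projection: penalize pairs whose centers are far apart.
proj = np.array([-1.0, -1.0, 0.0, 0.0])
bias = relation_bias_from_boxes(boxes, proj, gate=1.0)
out, weights = biased_self_attention(q, k, v, bias)
```

With this particular `proj`, pairs of distant boxes receive a more negative bias than nearby pairs, so attention shifts toward spatial neighbors. Setting `gate=0.0` removes the prior and falls back to plain attention.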

Main Results:

Relation-DETR+ demonstrates superior performance and learning efficiency compared to state-of-the-art methods like DINO and Mask-DINO under similar training schedules.
The proposed relation encoder acts as a plug-and-play component, improving various DETR-like methods.
Extensive experiments on diverse datasets validate the effectiveness of the approach for unified dense prediction tasks.
Introduction of SA-Det-100k, a large-scale class-agnostic detection dataset, highlighting the potential of explicit position relations for universal object detection.

Conclusions:

The proposed Relation-DETR+ framework effectively enhances DETR performance by integrating explicit position relations, leading to improved convergence and efficiency.
The approach offers a unified solution for multiple dense prediction tasks within a single framework.
The relation encoder is a versatile component applicable to a wide range of DETR-based architectures, advancing the field of computer vision.