Relation DETR+: Exploring Explicit Position Relation Prior for Dense Prediction.

Xiuquan Hou, Meiqin Liu, Shaoyi Du

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |June 29, 2026
    Summary
    This summary is machine-generated.

    This study improves DETR (DEtection TRansformer) models by introducing position relation embeddings that act as a bias on the attention mechanism. The resulting Relation-DETR+ framework improves convergence and efficiency across several dense prediction tasks.

    Area of Science:

    • Computer Vision
    • Deep Learning
    • Artificial Intelligence

    Background:

    • Object detection models like DETR (DEtection TRansformer) converge slowly because self-attention lacks a structural bias on its input.
    • Existing methods struggle to effectively integrate spatial relationships, limiting performance in dense prediction tasks.

    Purpose of the Study:

    • To propose a novel scheme, Relation-DETR+, for enhancing the convergence and performance of DETR-based models.
    • To address the slow convergence issue by incorporating explicit position relation priors into the attention mechanism.
    • To develop a unified framework capable of handling multiple dense prediction tasks including object detection, semantic segmentation, instance segmentation, and panoptic segmentation.

    Main Methods:

    • Incorporation of a position relation prior as an attention bias to augment object detection.
    • Introduction of an encoder for constructing position relation embeddings for progressive attention refinement.
    • Extension of the DETR pipeline into a contrastive relation pipeline to manage prediction conflicts.
    • Alleviation of pattern collapse in multi-layer relations via explicit layer-wise encoding and gated relation modulation.
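
    The first method above, a position relation prior used as an attention bias, can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not give the relation features or the encoder design, so the log-scaled box offsets, the single linear projection (`weights`), and all function names below are assumptions chosen for illustration. The sketch only shows the general idea of adding a pairwise geometric term to the attention logits before the softmax.

```python
import math

def box_relation(a, b):
    """Pairwise geometric features between two boxes given as (cx, cy, w, h).

    Assumed feature form: log-scaled centre offsets and log size ratios.
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return [
        math.log(abs(ax - bx) / aw + 1.0),  # horizontal offset, relative to width
        math.log(abs(ay - by) / ah + 1.0),  # vertical offset, relative to height
        math.log(aw / bw),                  # width ratio
        math.log(ah / bh),                  # height ratio
    ]

def relation_bias(boxes, weights):
    """Project each pair's features to one scalar bias (hypothetical linear map)."""
    return [
        [sum(w * f for w, f in zip(weights, box_relation(a, b))) for b in boxes]
        for a in boxes
    ]

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def biased_attention(queries, keys, values, bias):
    """Scaled dot-product attention with an additive pairwise bias on the logits."""
    d = len(queries[0])
    out = []
    for i, q in enumerate(queries):
        logits = [
            sum(qc * kc for qc, kc in zip(q, k)) / math.sqrt(d) + bias[i][j]
            for j, k in enumerate(keys)
        ]
        attn = softmax(logits)
        out.append([
            sum(a * v[c] for a, v in zip(attn, values))
            for c in range(len(values[0]))
        ])
    return out

# Toy example: three boxes, two of them close together.
boxes = [(0.2, 0.2, 0.1, 0.1), (0.25, 0.2, 0.1, 0.1), (0.8, 0.8, 0.2, 0.2)]
weights = [-1.0, -1.0, 0.0, 0.0]  # illustrative values: penalise distant pairs
bias = relation_bias(boxes, weights)
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = biased_attention(feats, feats, feats, bias)
```

    With these illustrative weights, nearby boxes receive a smaller penalty than distant ones, so each query attends more strongly to its spatial neighbours. In a trained model the projection would be learned rather than fixed by hand.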

    Main Results:

    • Relation-DETR+ demonstrates superior performance and learning efficiency compared to state-of-the-art methods like DINO and Mask-DINO under similar training schedules.
    • The proposed relation encoder acts as a plug-and-play component, improving various DETR-like methods.
    • Extensive experiments on diverse datasets validate the effectiveness of the approach for unified dense prediction tasks.
    • Introduction of SA-Det-100k, a large-scale class-agnostic detection dataset, highlighting the potential of explicit position relations for universal object detection.

    Conclusions:

    • The proposed Relation-DETR+ framework effectively enhances DETR performance by integrating explicit position relations, leading to improved convergence and efficiency.
    • The approach offers a unified solution for multiple dense prediction tasks within a single framework.
    • The relation encoder is a versatile component applicable to a wide range of DETR-based architectures.