


Toward Robust Referring Image Segmentation.

Jianzong Wu, Xiangtai Li, Xia Li

    IEEE Transactions on Image Processing : a Publication of the IEEE Signal Processing Society
    |March 5, 2024
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces Robust Referring Image Segmentation (R-RIS) to handle incorrect text descriptions in vision-language tasks. The new RefSegformer model achieves state-of-the-art results on both standard and robust referring image segmentation.


    Area of Science:

    • Computer Vision
    • Natural Language Processing
    • Artificial Intelligence

    Background:

    • Referring Image Segmentation (RIS) is a key vision-language task that produces a mask for the object in an image that a text description refers to.
    • Existing RIS models struggle with inaccurate or misleading text descriptions, termed negative sentences.
    • A robust approach is needed to handle both accurate and inaccurate textual guidance.

    Purpose of the Study:

    • To introduce Robust Referring Image Segmentation (R-RIS) capable of processing negative sentences.
    • To develop new datasets and evaluation metrics for R-RIS.
    • To propose a novel transformer-based model, RefSegformer, for enhanced RIS.

    Main Methods:

    • Created three R-RIS datasets by augmenting existing RIS datasets with negative sentences.
    • Developed unified metrics to evaluate performance on both positive and negative text inputs.
    • Proposed RefSegformer, a transformer model with a token-based fusion module, adaptable for R-RIS.
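
The first two steps above can be illustrated with a small sketch. This is not the authors' implementation: the summary does not say how the negative sentences were built or how the unified metrics are defined. The code assumes, purely for illustration, that a negative sentence is one borrowed from a different image, that the correct output for a negative sentence is an empty mask, and that positive inputs are scored by intersection-over-union. The names `Sample`, `add_negative_sentences`, and `evaluate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]  # foreground pixels as (row, col) pairs


@dataclass
class Sample:
    image_id: str
    sentence: str
    target: Optional[Mask]  # None marks a negative sentence (nothing to segment)


def add_negative_sentences(samples: List[Sample]) -> List[Sample]:
    """Assumed augmentation: pair each image with one sentence written for a
    different image and label that pair as negative. A real pipeline would
    also need to check that the borrowed sentence matches nothing in the image."""
    augmented = list(samples)
    for s in samples:
        for other in samples:
            if other.image_id != s.image_id:
                augmented.append(Sample(s.image_id, other.sentence, None))
                break  # one negative per positive keeps the toy set balanced
    return augmented


def iou(pred: Mask, target: Mask) -> float:
    """Intersection-over-union of two pixel sets."""
    union = pred | target
    return len(pred & target) / len(union) if union else 1.0


def evaluate(samples: List[Sample], predictions: List[Mask]) -> Dict[str, float]:
    """Assumed unified scoring: mean IoU on positive sentences, and the
    fraction of negative sentences for which the predicted mask is empty."""
    pos_ious: List[float] = []
    neg_correct: List[bool] = []
    for s, pred in zip(samples, predictions):
        if s.target is None:
            neg_correct.append(len(pred) == 0)
        else:
            pos_ious.append(iou(pred, s.target))
    return {
        "mean_iou_positive": sum(pos_ious) / len(pos_ious) if pos_ious else 0.0,
        "negative_accuracy": sum(neg_correct) / len(neg_correct) if neg_correct else 0.0,
    }


if __name__ == "__main__":
    positives = [
        Sample("a", "the red cup", {(0, 0), (0, 1)}),
        Sample("b", "the dog on the left", {(1, 1)}),
    ]
    dataset = add_negative_sentences(positives)
    # Toy predictions: one partial hit, one exact hit, one correct empty
    # mask, and one false detection on a negative sentence.
    predictions = [{(0, 0)}, {(1, 1)}, set(), {(2, 2)}]
    print(evaluate(dataset, predictions))
```

Reporting the two scores side by side shows the trade-off the R-RIS setting targets: a model that always outputs a mask can score well on positive sentences while failing every negative one.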

    Main Results:

    • RefSegformer achieved state-of-the-art performance on established RIS benchmarks.
    • The model demonstrated strong capabilities in the new R-RIS setting, handling negative sentences effectively.
    • Established a new baseline for robust referring image segmentation.

    Conclusions:

    • The proposed R-RIS formulation and RefSegformer model offer a significant advancement in vision-language tasks.
    • The approach effectively addresses the challenge of inaccurate textual descriptions in image segmentation.
    • This work provides a robust foundation for future research in referring image segmentation.