Robust Visual Question Answering: Datasets, Methods, and Future Challenges.

Jie Ma, Pinghui Wang, Dechen Kong

    IEEE Transactions on Pattern Analysis and Machine Intelligence | February 15, 2024
    Summary
    This summary is machine-generated.

    This survey addresses biases in visual question answering (VQA) systems, which often memorize training data rather than truly understanding images. It reviews datasets, metrics, and debiasing methods to improve VQA robustness.

    Area of Science:

    • Computer Science
    • Artificial Intelligence
    • Machine Learning

    Background:

    • Visual Question Answering (VQA) systems face challenges with data biases, leading to poor out-of-distribution performance.
    • Existing VQA methods often memorize biases rather than learning grounded image understanding.
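
The effect described above can be illustrated with a minimal sketch. It assumes a hypothetical "question-only" baseline that never looks at the image and simply returns the most frequent training answer for each question type. The data, the question types, and the function names are invented for illustration and do not come from the survey.

```python
from collections import Counter, defaultdict

def fit_prior(train):
    """Learn the most frequent answer per question type (the image is ignored)."""
    counts = defaultdict(Counter)
    for qtype, answer in train:
        counts[qtype][answer] += 1
    return {q: c.most_common(1)[0][0] for q, c in counts.items()}

def accuracy(prior, split):
    """Fraction of examples where the memorized answer matches the label."""
    hits = sum(prior.get(qtype) == answer for qtype, answer in split)
    return hits / len(split)

# Toy (question type, answer) pairs; all values are invented for illustration.
train = ([("what color", "yellow")] * 8 + [("what color", "green")] * 2
         + [("is there", "yes")] * 7 + [("is there", "no")] * 3)
# In-distribution test split: same answer distribution as training.
id_test = ([("what color", "yellow")] * 4 + [("what color", "green")] * 1
           + [("is there", "yes")] * 4 + [("is there", "no")] * 1)
# Out-of-distribution test split: answer distribution is reversed.
ood_test = ([("what color", "yellow")] * 1 + [("what color", "green")] * 4
            + [("is there", "yes")] * 1 + [("is there", "no")] * 4)

prior = fit_prior(train)
id_acc = accuracy(prior, id_test)
ood_acc = accuracy(prior, ood_test)
print(f"in-distribution: {id_acc:.2f}, out-of-distribution: {ood_acc:.2f}")
```

On this toy data the baseline scores well when the test answers follow the training distribution and poorly when they do not, even though it never used an image. That gap is the kind of behavior robust VQA benchmarks are meant to expose.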

    Purpose of the Study:

    • To provide a comprehensive survey of datasets, evaluation metrics, and debiasing methods for VQA.
    • To analyze the robustness of vision-and-language pre-training models in VQA tasks.
    • To identify future research directions in robust VQA.

    Main Methods:

    • Overview of dataset development from in-distribution and out-of-distribution perspectives.
    • Examination of evaluation metrics used in VQA datasets.
    • Proposal of a typology for existing VQA debiasing methods, analyzing their development, features, and comparisons.
    • Analysis of representative vision-and-language pre-training models' robustness on VQA.
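
This summary does not name the metrics the survey examines. As one example, the sketch below assumes the consensus-based accuracy commonly reported on VQA benchmarks, in its simplified form: a predicted answer gets full credit if at least three human annotators gave it, and partial credit otherwise. The function name and the annotator answers are hypothetical.

```python
def vqa_accuracy(prediction, human_answers):
    """Simplified consensus accuracy: min(number of matching annotators / 3, 1)."""
    matches = sum(answer == prediction for answer in human_answers)
    return min(matches / 3.0, 1.0)

# Ten invented annotator answers for a single question.
human_answers = ["yellow"] * 5 + ["gold"] * 2 + ["orange"] * 3

full = vqa_accuracy("yellow", human_answers)   # 5 annotators agree
partial = vqa_accuracy("gold", human_answers)  # 2 annotators agree
none = vqa_accuracy("red", human_answers)      # no annotator agrees
print(full, round(partial, 2), none)
```

Averaging this score over a test split gives an overall accuracy. Because the metric only rewards matching annotator answers, a model that exploits answer priors can still score well on an in-distribution split.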

    Main Results:

    • The survey categorizes VQA datasets and evaluation metrics, highlighting their evolution and limitations.
    • A structured typology of debiasing methods is presented, detailing their approaches and comparative robustness.
    • Analysis reveals the robustness characteristics of current vision-and-language pre-training models in VQA.

    Conclusions:

    • The study underscores the critical need for robust VQA systems that overcome data biases.
    • It synthesizes current research on VQA robustness, offering a foundational resource for researchers.
    • Key areas for future research are identified to advance the field of reliable visual question answering.