Updated: Jul 3, 2025

Robust Visual Question Answering: Datasets, Methods, and Future Challenges.

Jie Ma, Pinghui Wang, Dechen Kong

IEEE Transactions on Pattern Analysis and Machine Intelligence

February 15, 2024

Summary

This summary is machine-generated.

This survey addresses biases in visual question answering (VQA) systems, which often memorize training data rather than truly understanding images. It reviews datasets, metrics, and debiasing methods to improve VQA robustness.

Area of Science:

Computer Science
Artificial Intelligence
Machine Learning

Background:

Visual Question Answering (VQA) systems face challenges with data biases, leading to poor out-of-distribution performance.
Existing VQA methods often memorize biases rather than learning grounded image understanding.
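
To make the bias problem concrete, here is a minimal sketch of a "blind" baseline that never looks at the image. It simply returns the most frequent training answer for each question type. The question types, answers, and counts are hypothetical toy data, not taken from the survey, and the function names `fit_prior` and `accuracy` are illustrative only.

```python
from collections import Counter, defaultdict

def fit_prior(train):
    """Map each question type to its most frequent training answer."""
    counts = defaultdict(Counter)
    for question_type, answer in train:
        counts[question_type][answer] += 1
    return {qt: c.most_common(1)[0][0] for qt, c in counts.items()}

def accuracy(prior, test):
    """Fraction of test items where the memorized prior matches the answer."""
    correct = sum(prior.get(qt) == ans for qt, ans in test)
    return correct / len(test)

# Hypothetical toy data: (question type, ground-truth answer) pairs.
train = (
    [("what color", "yellow")] * 8 + [("what color", "green")] * 2
    + [("is there", "yes")] * 7 + [("is there", "no")] * 3
)
# In-distribution test set: same answer priors as training.
test_id = list(train)
# Out-of-distribution test set: answer priors are reversed.
test_ood = (
    [("what color", "green")] * 8 + [("what color", "yellow")] * 2
    + [("is there", "no")] * 7 + [("is there", "yes")] * 3
)

prior = fit_prior(train)
print(accuracy(prior, test_id))   # 0.75
print(accuracy(prior, test_ood))  # 0.25
```

On this toy data the image-blind baseline scores well when the test priors match training and collapses when they shift, which is the gap that out-of-distribution VQA benchmarks are meant to expose.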

Purpose of the Study:

To provide a comprehensive survey of datasets, evaluation metrics, and debiasing methods for VQA.
To analyze the robustness of vision-and-language pre-training models in VQA tasks.
To identify future research directions in robust VQA.

Main Methods:

Overview of dataset development from in-distribution and out-of-distribution perspectives.
Examination of evaluation metrics used in VQA datasets.
Proposal of a typology of existing VQA debiasing methods, analyzing their development and features and comparing them.
Analysis of representative vision-and-language pre-training models' robustness on VQA.
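
This summary does not say which evaluation metrics the survey examines. As one illustration, the sketch below implements the simplified form of the consensus accuracy commonly used with the VQA v1/v2 datasets, where a predicted answer gets full credit if at least three human annotators gave it. The function name and the light string normalization are assumptions made for this example. The official evaluation script applies more extensive answer normalization and averages over annotator subsets.

```python
def vqa_accuracy(predicted, human_answers):
    """Simplified VQA consensus accuracy: min(#matching annotators / 3, 1)."""
    pred = predicted.strip().lower()
    matches = sum(a.strip().lower() == pred for a in human_answers)
    return min(matches / 3.0, 1.0)

# Hypothetical annotations: ten human answers for one question.
humans = ["yellow"] * 2 + ["gold"] * 8

print(vqa_accuracy("gold", humans))    # full credit: 8 annotators agree
print(vqa_accuracy("yellow", humans))  # partial credit: 2 annotators agree
print(vqa_accuracy("red", humans))     # no credit: no annotator agrees
```

Because the score depends only on matching answer strings, a model can score well by exploiting answer priors, which is one reason robustness is also measured on test sets whose answer distribution differs from training.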

Main Results:

The survey categorizes VQA datasets and evaluation metrics, highlighting their evolution and limitations.
A structured typology of debiasing methods is presented, detailing their approaches and comparative robustness.
Analysis reveals the robustness characteristics of current vision-and-language pre-training models in VQA.

Conclusions:

The study underscores the critical need for robust VQA systems that overcome data biases.
It synthesizes current research on VQA robustness, offering a foundational resource for researchers.
Key areas for future research are identified to advance the field of reliable visual question answering.