RCVQA: Visual question answering model based on reading comprehension.

Deguang Chen1, Jianrui Chen2, Zhongshi Shao1

  • 1School of Artificial Intelligence and Computer Science, Shaanxi Normal University, Xi'an, 710119, Shaanxi, China.

Neural Networks: the Official Journal of the International Neural Network Society | December 5, 2025
Summary
This summary is machine-generated.

RCVQA, a novel reading-comprehension-based Visual Question Answering (VQA) model, enhances image and question analysis. It achieves multi-span answer retrieval and introduces new metrics for comprehensive evaluation, outperforming existing methods.

Keywords:
Computer vision; Multi-span answers; Natural language processing; Reading comprehension model; Visual question answering

Area of Science:

  • Artificial Intelligence
  • Computer Vision
  • Natural Language Processing

Background:

  • Current Visual Question Answering (VQA) models struggle with deep semantic analysis due to limited interdisciplinary interaction.
  • Existing VQA systems often provide limited, single-word answers, failing to meet diverse user needs.
  • Over-reliance on accuracy as an evaluation metric hinders comprehensive performance assessment and model optimization.
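
To illustrate why accuracy alone can be a coarse signal when an answer has several parts, here is a minimal Python sketch comparing exact-match accuracy with span-level precision, recall, and F1. The function names and the toy answers are hypothetical, and these are generic measures, not the metrics proposed in the paper.

```python
from typing import List, Tuple


def exact_match(pred: List[str], gold: List[str]) -> float:
    """Return 1.0 only when the predicted answer set equals the gold set."""
    return 1.0 if set(pred) == set(gold) else 0.0


def span_prf(pred: List[str], gold: List[str]) -> Tuple[float, float, float]:
    """Span-level precision, recall, and F1 over sets of answer strings."""
    p, g = set(pred), set(gold)
    hits = len(p & g)
    precision = hits / len(p) if p else 0.0
    recall = hits / len(g) if g else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# Hypothetical example: the gold answer has two spans, the model finds one.
gold = ["red umbrella", "black bag"]
pred = ["red umbrella"]

em = exact_match(pred, gold)
precision, recall, f1 = span_prf(pred, gold)
print(em, precision, recall, round(f1, 3))
```

In this toy case exact match scores the prediction as entirely wrong, while the span-level scores show that it is precise but incomplete.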

Purpose of the Study:

  • To introduce RCVQA, a novel reading-comprehension-based VQA model addressing current limitations.
  • To develop innovative methods for multi-span answer retrieval, including answer position and content prediction.
  • To propose new evaluation metrics for a more thorough assessment of VQA system performance.

Main Methods:

  • Preprocessing dataset text to remove irrelevant information for enhanced contextual focus.
  • Developing RCVQA model variants (RCVQAP, RCVQAC, VQAT) and testing across four datasets.
  • Implementing answer position and content prediction algorithms for multi-span answer retrieval.
  • Introducing four novel evaluation metrics (PPR∥, TPR∥, H∥-Means, ESM∥) for comprehensive VQA system evaluation.
  • Integrating image captioning and local training strategies for improved content understanding and data security.
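
One common way to retrieve multi-span answers from predicted positions is to tag each context token and collect the tagged runs. The sketch below assumes a B/I/O tagging scheme and a hypothetical `decode_spans` helper; the summary does not state how RCVQA encodes answer positions, so this is an illustration of the general idea only.

```python
from typing import List


def decode_spans(tokens: List[str], tags: List[str]) -> List[str]:
    """Collect each contiguous B/I run of tokens as one answer span."""
    spans: List[str] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new span starts; close the previous one if it is open.
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            # Continue the span that is currently open.
            current.append(token)
        else:
            # "O" (or a stray "I" with no open span) closes any open span.
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans


# Hypothetical context (for example, an image caption) and predicted tags.
tokens = "a man holds a red umbrella and a black bag".split()
tags = ["O", "O", "O", "O", "B", "I", "O", "O", "B", "I"]

answers = decode_spans(tokens, tags)
print(answers)
```

Because each tagged run is returned separately, a single question can yield several answer spans rather than one word.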

Main Results:

  • RCVQA significantly outperforms state-of-the-art methods on multiple benchmarks.
  • Improvements range from 1% to 7% across the A-OKVQA, KR-VQA, GQA, and OK-VQA datasets.
  • Demonstrated enhanced performance in tasks requiring deeper semantic understanding and more comprehensive answers.

Conclusions:

  • RCVQA represents a significant advancement in Visual Question Answering.
  • The proposed methodology and evaluation metrics offer a more robust approach to VQA.
  • The RCVQA model provides more diverse, complete, and accurate answers, paving the way for broader practical applications.