



Deep Modular Bilinear Attention Network for Visual Question Answering.

Feng Yan1, Wushouer Silamu1,2, Yanbing Li1

  • 1School of Information Science and Engineering, Xinjiang University, Urumqi 830046, China.

Sensors (Basel, Switzerland) | February 15, 2022
Summary
This summary is machine-generated.

This study introduces the Deep Multimodality Bilinear Attention Network (DMBA-NET) for Visual Question Answering (VQA). The novel framework achieves 70.85% accuracy on VQA 2.0 by enhancing attention mechanisms and question encoding.

Keywords:
attention mechanism; bilinear attention network; multi-model; visual question answering


Area of Science:

  • Artificial Intelligence
  • Computer Vision
  • Natural Language Processing

Background:

  • Visual Question Answering (VQA) integrates image understanding and language comprehension.
  • Attention mechanisms are crucial in VQA, with dot-product being a common approach.
  • Existing VQA models often rely on specific datasets for augmentation.
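
For readers unfamiliar with the dot-product approach mentioned above, the following is a minimal sketch of scaled dot-product attention for a single query vector, written in plain Python. The function names and the toy vectors are illustrative assumptions, not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Scaled dot-product attention for one query (hypothetical helper).

    query:  list of d floats
    keys:   list of n key vectors, each of length d
    values: list of n value vectors
    Returns (attention weights, weighted sum of the values).
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values))
              for i in range(dim)]
    return weights, output


# Toy usage: the query matches the first key more closely than the second,
# so the first value receives the larger weight.
weights, output = dot_product_attention(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0]],
    [[10.0], [20.0]],
)
```

Here the similarity between a query and a key is a single inner product; the bilinear attention used in the study instead learns projection matrices that mediate the interaction between the two inputs.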

Purpose of the Study:

  • To propose a novel Deep Multimodality Bilinear Attention Network (DMBA-NET) for VQA.
  • To improve inter-modality and intra-modality relation modeling using bilinear attention.
  • To enhance question understanding through dynamic word vectors and self-attention.

Main Methods:

  • Developed DMBA-NET framework utilizing two basic attention units: BAN-GA and BAN-SA.
  • Employed Bilinear Attention Network (BAN) for attention calculation.
  • Utilized BERT (Bidirectional Encoder Representations from Transformers) for dynamic question encoding and self-attention.
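
The bilinear attention calculation listed above can be illustrated with a simplified sketch. It assumes the low-rank bilinear form commonly associated with BAN, in which the score for question word i and image region j is the sum over k of p[k] * (Uᵀx_i)[k] * (Vᵀy_j)[k], followed by a softmax over all word-region pairs. The nonlinearities, multiple glimpses, and trained weights of a real model are omitted; the matrices U and V, the vector p, and the toy inputs are hypothetical. This is not the authors' implementation.

```python
import math


def project(W, x):
    """Compute W^T x, where W is a list of len(x) rows with k columns."""
    k = len(W[0])
    return [sum(W[r][c] * x[r] for r in range(len(x))) for c in range(k)]


def bilinear_attention_map(X, Y, U, V, p):
    """Simplified low-rank bilinear attention map (illustrative only).

    X: question word vectors, Y: image region vectors.
    U, V: projection matrices into a shared k-dimensional space.
    p: length-k pooling vector.
    Returns a len(X) x len(Y) matrix of weights that sums to 1.
    """
    XU = [project(U, x) for x in X]
    YV = [project(V, y) for y in Y]
    # Bilinear logit for every (word, region) pair.
    logits = [[sum(pk * a * b for pk, a, b in zip(p, xu, yv)) for yv in YV]
              for xu in XU]
    # Softmax taken jointly over all pairs.
    m = max(max(row) for row in logits)
    exps = [[math.exp(v - m) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[e / total for e in row] for row in exps]


# Toy usage with hand-set identity projections (hypothetical values).
identity = [[1.0, 0.0], [0.0, 1.0]]
words = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attn = bilinear_attention_map(words, regions, identity, identity, [1.0, 1.0])
```

With both inputs drawn from different modalities (words and regions), a map like this models inter-modality relations; passing the same set of vectors as both X and Y would give an intra-modality variant.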

Main Results:

  • Achieved 70.85% accuracy on the test-std split of the VQA 2.0 dataset.
  • Demonstrated effective inter-modality and intra-modality relation construction.
  • Successfully processed visual and language features for accurate answer prediction.

Conclusions:

  • DMBA-NET offers a robust framework for VQA tasks.
  • The proposed attention units and question encoding methods enhance model performance.
  • The model achieves high accuracy without relying on external dataset augmentation.