Depth and Video Segmentation Based Visual Attention for Embodied Question Answering.

Haonan Luo, Guosheng Lin, Yazhou Yao

    IEEE Transactions on Pattern Analysis and Machine Intelligence | January 4, 2022

    Summary
    This summary is machine-generated.

    This study introduces a visual attention mechanism for Embodied Question Answering (EQA) agents that improves both navigation and answering accuracy in real-world environments. By strengthening semantic understanding and spatial awareness, the method aims to improve the performance of robot assistants.


    Area of Science:

    • Artificial Intelligence
    • Robotics
    • Computer Vision

    Background:

    • Embodied Question Answering (EQA) involves agents interacting with environments to answer questions.
    • Current EQA methods struggle with accuracy due to limited semantic and spatial information.
    • Applications include personal assistants and in-home robots.

    Purpose of the Study:

    • To enhance the accuracy of Embodied Question Answering (EQA).
    • To address limitations in semantic understanding and spatial reasoning in existing EQA models.

    Main Methods:

    • Proposed a depth- and segmentation-based visual attention mechanism for EQA.
    • Introduced a high-speed video segmentation framework for local semantic feature extraction.
    • Developed a feature fusion strategy to guide navigator training.
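
The general idea of attending over regions that carry appearance, segmentation, and depth cues can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not describe their architecture, so plain scaled dot-product attention and fusion by concatenation are swapped in as assumptions, and every name here (`fuse`, `attend`, `regions`, `query`) and every number is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(appearance, segmentation, depth):
    """Assumed fusion: concatenate appearance, segmentation, and depth cues."""
    return list(appearance) + list(segmentation) + [depth]

def attend(query, regions):
    """Scaled dot-product attention of one query over fused region vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * r for q, r in zip(query, region)) / scale
              for region in regions]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the fused region vectors.
    context = [sum(w * region[i] for w, region in zip(weights, regions))
               for i in range(len(query))]
    return weights, context

# Two hypothetical image regions: (appearance, segmentation score, mean depth).
regions = [
    fuse([1.0, 0.0], [1.0], 0.5),
    fuse([0.0, 1.0], [0.0], 2.0),
]
# Hypothetical question embedding with the same length as a fused vector.
query = [1.0, 0.0, 1.0, 0.0]

weights, context = attend(query, regions)
```

In this toy setup the first region matches the query on both appearance and segmentation, so it receives the larger attention weight. A learned model would replace the hand-set vectors with network outputs.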

    Main Results:

    • The visual attention mechanism improved Visual Question Answering (VQA) performance.
    • Overall accuracy improved by 4.9% on the House3D dataset and by 5.6% on the Matterport3D dataset.
    • Both the VQA and navigation modules benefited from the proposed method.

    Conclusions:

    • The proposed method enhances EQA by integrating depth, segmentation, and visual attention.
    • The approach offers improved performance without substantial computational overhead.
    • This work advances the capabilities of intelligent agents in interactive environments.