Self-Supervised 3-D Semantic Representation Learning for Vision-and-Language Navigation.

Sinan Tan, Kuankuan Sima, Dunzheng Wang

    IEEE Transactions on Neural Networks and Learning Systems | May 14, 2024
    Summary
    This summary is machine-generated.

    This study introduces a new vision-and-language navigation (VLN) framework that uses 3-D semantic data alongside RGB images. This approach significantly improves navigation performance by creating detailed 3-D environment representations.

    Area of Science:

    • Robotics
    • Computer Vision
    • Artificial Intelligence

    Background:

    • Current vision-and-language navigation (VLN) methods predominantly rely on RGB images, neglecting valuable 3-D semantic environmental data.
    • Integrating 3-D semantic information can provide a richer understanding of the environment for more robust navigation.

    Purpose of the Study:

    • To develop a novel VLN framework that effectively incorporates 3-D semantic information into the navigation process.
    • To enhance navigation accuracy and efficiency by leveraging comprehensive environmental representations.

    Main Methods:

    • A self-supervised training scheme using voxel-level 3-D semantic reconstruction to build detailed 3-D semantic representations.
    • A pretext task involving region queries to identify objects within specific 3-D areas.
    • An LSTM-based navigation model trained on 3-D semantic representations, enhanced by a cross-modal distillation strategy to merge RGB and 3-D semantic features.
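The region-query pretext task listed above can be illustrated with a small sketch. This is not the authors' implementation: the function name `region_query_target`, the axis-aligned box form of the query, the integer label grid, and the convention that label 0 means "empty" are all assumptions made for illustration, since the summary does not specify them. The sketch only shows how a training target ("which object classes occur inside this 3-D region?") could be derived from a voxel-level semantic grid.

```python
import numpy as np


def region_query_target(voxels, lo, hi, num_classes):
    """Build a multi-hot target for one region query (hypothetical helper).

    voxels      : 3-D integer array of semantic class ids (0 = empty, assumed).
    lo, hi      : (x, y, z) corners of an axis-aligned box, hi exclusive.
    num_classes : total number of class ids, including the empty class 0.
    """
    (x0, y0, z0), (x1, y1, z1) = lo, hi
    region = voxels[x0:x1, y0:y1, z0:z1]

    # Collect the distinct class ids that occur inside the box.
    present = np.unique(region)
    # Ignore the empty class and any id outside the expected range.
    present = present[(present > 0) & (present < num_classes)]

    target = np.zeros(num_classes, dtype=np.float32)
    target[present] = 1.0
    return target


# Toy 4x4x4 scene with two labelled voxels (made-up data).
grid = np.zeros((4, 4, 4), dtype=np.int64)
grid[0, 0, 0] = 2
grid[3, 3, 3] = 1

near_origin = region_query_target(grid, (0, 0, 0), (2, 2, 2), num_classes=4)
whole_scene = region_query_target(grid, (0, 0, 0), (4, 4, 4), num_classes=4)
```

In a self-supervised setup of this kind, a model would be asked to predict such a target from its learned 3-D representation, so the labels come from the reconstruction itself rather than from manual annotation.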

    Main Results:

    • The proposed framework successfully integrates 3-D semantic data into VLN tasks.
    • Cross-modal distillation effectively merges RGB and 3-D semantic features, improving model performance.
    • Evaluations on the R2R and R4R datasets show significant performance improvements on VLN tasks.
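As a rough illustration of the cross-modal distillation idea mentioned in the results, the sketch below uses a plain feature-matching loss: one branch's features are pulled toward the other's with a mean squared error term, which is then added to the navigation loss with a weight. The summary does not say which modality acts as teacher, what loss the authors use, or how it is weighted, so the names `distillation_loss`, `total_loss`, and `weight`, and the choice of MSE, are assumptions rather than the paper's method.

```python
import numpy as np


def distillation_loss(student_feat, teacher_feat):
    """Mean squared error between two feature arrays (assumed loss form)."""
    student = np.asarray(student_feat, dtype=np.float64)
    teacher = np.asarray(teacher_feat, dtype=np.float64)
    if student.shape != teacher.shape:
        raise ValueError("student and teacher features must share a shape")
    return float(np.mean((student - teacher) ** 2))


def total_loss(task_loss, student_feat, teacher_feat, weight=1.0):
    """Navigation loss plus a weighted distillation term (weight is hypothetical)."""
    return task_loss + weight * distillation_loss(student_feat, teacher_feat)


# Made-up feature vectors standing in for RGB and 3-D semantic features.
rgb_features = [0.0, 0.0]
semantic_features = [2.0, 2.0]

matched = distillation_loss(semantic_features, semantic_features)
mismatched = distillation_loss(rgb_features, semantic_features)
combined = total_loss(1.5, rgb_features, semantic_features, weight=0.5)
```

In an actual training loop the teacher features would typically be held fixed (no gradient) while the student branch is updated; that detail is omitted here because the sketch uses NumPy only.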

    Conclusions:

    • Integrating 3-D semantic information is crucial for advancing vision-and-language navigation.
    • The developed framework offers a promising approach for more intelligent and accurate robotic navigation systems.
    • Future work can explore further refinements in 3-D reconstruction and cross-modal fusion techniques.