Self-Supervised 3-D Semantic Representation Learning for Vision-and-Language Navigation.

Sinan Tan, Kuankuan Sima, Dunzheng Wang

IEEE Transactions on Neural Networks and Learning Systems

May 14, 2024

Summary

This summary is machine-generated.

This study introduces a vision-and-language navigation (VLN) framework that uses 3-D semantic data alongside RGB images. The framework builds detailed 3-D semantic representations of the environment, and the authors report improved navigation performance on the R2R and R4R datasets.

Area of Science:

Robotics
Computer Vision
Artificial Intelligence

Background:

Current vision-and-language navigation (VLN) methods predominantly rely on RGB images, neglecting valuable 3-D semantic environmental data.
Integrating 3-D semantic information can provide a richer understanding of the environment for more robust navigation.

Purpose of the Study:

To develop a novel VLN framework that effectively incorporates 3-D semantic information into the navigation process.
To enhance navigation accuracy and efficiency by leveraging comprehensive environmental representations.

Main Methods:

A self-supervised training scheme using voxel-level 3-D semantic reconstruction to build detailed 3-D semantic representations.
A pretext task involving region queries to identify objects within specific 3-D areas.
An LSTM-based navigation model trained on 3-D semantic representations, enhanced by a cross-modal distillation strategy to merge RGB and 3-D semantic features.
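
The region-query pretext task above can be illustrated with a small sketch. The summary does not describe the paper's actual query format or label encoding, so everything here is an assumption made for illustration: the scene is a sparse voxel map from integer coordinates to a semantic class id, a query is an axis-aligned box with inclusive corners, and the training target is a multi-hot vector marking which classes occur inside the box. The function name `region_query_target` and the toy class ids are hypothetical.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]


def region_query_target(
    voxels: Dict[Voxel, int],
    box_min: Voxel,
    box_max: Voxel,
    num_classes: int,
) -> List[int]:
    """Return a multi-hot vector with 1 for every class that has a voxel in the box.

    Assumed (not from the paper): sparse voxel map, inclusive axis-aligned box.
    """
    target = [0] * num_classes
    for coords, class_id in voxels.items():
        # A voxel is inside the region if every coordinate lies within the box bounds.
        inside = all(lo <= v <= hi for v, lo, hi in zip(coords, box_min, box_max))
        if inside:
            target[class_id] = 1
    return target


# Toy scene with hypothetical class ids: 0 = floor, 1 = chair, 2 = table.
scene = {
    (0, 0, 0): 0,
    (1, 0, 0): 0,
    (1, 1, 0): 1,
    (4, 4, 1): 2,
}

# Query the corner of the room that holds the floor and the chair but not the table.
print(region_query_target(scene, (0, 0, 0), (2, 2, 1), num_classes=3))  # [1, 1, 0]
```

In a self-supervised setup of this kind, a model would be asked to predict the target vector from its learned representation, so no human annotation beyond the reconstructed semantic voxels is needed.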

Main Results:

The proposed framework successfully integrates 3-D semantic data into VLN tasks.
Cross-modal distillation effectively merges RGB and 3-D semantic features, improving model performance.
Evaluations on the R2R and R4R datasets show significant performance improvements on VLN tasks.
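
As a rough picture of what a cross-modal distillation objective can look like, the sketch below adds a feature-matching term to a task loss. This is a generic formulation, not the paper's: the summary does not say which modality acts as teacher, which features are matched, or how the terms are weighted. The names `feature_distillation_loss` and `total_loss`, the mean-squared-error form, and the default `weight` are all assumptions.

```python
from typing import Sequence


def feature_distillation_loss(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between student and teacher feature vectors (assumed form)."""
    if len(student) != len(teacher):
        raise ValueError("student and teacher features must have the same length")
    if not student:
        raise ValueError("feature vectors must be non-empty")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def total_loss(
    task_loss: float,
    student: Sequence[float],
    teacher: Sequence[float],
    weight: float = 0.5,  # hypothetical trade-off coefficient
) -> float:
    """Navigation task loss plus a weighted distillation term."""
    return task_loss + weight * feature_distillation_loss(student, teacher)


# Toy features from two modalities; the values are made up for illustration.
student_features = [1.0, 2.0]
teacher_features = [1.0, 4.0]
print(total_loss(1.0, student_features, teacher_features))  # 2.0
```

The design idea is that the student keeps its own task objective while being pulled toward the teacher's representation, which is one common way to transfer information between modalities.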

Conclusions:

The results indicate that integrating 3-D semantic information is important for advancing vision-and-language navigation.
The developed framework offers a promising approach for more intelligent and accurate robotic navigation systems.
Future work can explore further refinements in 3-D reconstruction and cross-modal fusion techniques.