Updated: Apr 15, 2026

Large-Scale Model-Enhanced Vision-Language Navigation: Recent Advances, Practical Applications, and Future

Zecheng Li^1,2,3, Xiaolin Meng^1,2,3, Xu He^1,2,3

¹School of Instrument Science and Engineering, Southeast University, Nanjing 210096, China.

Sensors (Basel, Switzerland)

April 14, 2026

Summary

This summary is machine-generated.

Large Language Models (LLMs) advance embodied AI by enabling robots to navigate 3D spaces using natural language. This review details LLM-driven Vision-Language Navigation (VLN) systems, focusing on challenges and future directions for real-world deployment.

Keywords:

edge deployment; embodied intelligence; large language models; vision-language navigation

Area of Science:

Artificial Intelligence (AI)
Embodied Cognition
Robotics

Background:

Vision-Language Navigation (VLN) aims to enable autonomous robots to navigate complex 3D environments using visual perception and natural language instructions.
Recent advancements leverage Large Language Models (LLMs) and Vision-Language Models (VLMs) for improved instruction interpretation, cross-modal alignment, and reasoning.
Existing surveys inadequately cover LLM-based VLN, particularly concerning Sim2Real transfer and edge deployment.

Purpose of the Study:

To provide a structured review of LLM-enabled Vision-Language Navigation (VLN) systems.
To analyze the evolution of VLN tasks, core components, challenges, and future research directions.
To address the gap in literature regarding LLM-based VLN for edge deployment and Sim2Real transfer.

Main Methods:

Structured review of LLM-enabled VLN, examining instruction understanding, environment perception, high-level planning, and low-level control.
Analysis of task evolution from path-following to goal-oriented and demand-driven navigation.
Summary of edge deployment requirements, datasets, and evaluation protocols.
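
The four components named above (instruction understanding, environment perception, high-level planning, low-level control) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not taken from the review: the grid world, the `LANDMARKS` table, the keyword lookup that stands in for an LLM, and the breadth-first planner are all hypothetical placeholders.

```python
from collections import deque

# Hypothetical toy world: 0 = free cell, 1 = obstacle.
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
# Hypothetical landmark positions as (row, col).
LANDMARKS = {"door": (0, 3), "table": (2, 3)}

def understand_instruction(instruction):
    """Instruction understanding: a keyword lookup standing in for an LLM."""
    for name in LANDMARKS:
        if name in instruction.lower():
            return name
    raise ValueError("no known landmark in instruction")

def perceive(grid):
    """Environment perception: return the set of traversable cells."""
    return {(r, c) for r, row in enumerate(grid)
            for c, value in enumerate(row) if value == 0}

def plan(free_cells, start, goal):
    """High-level planning: breadth-first search over free cells."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nxt in free_cells and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None  # goal unreachable

def control(path):
    """Low-level control: turn consecutive waypoints into discrete moves."""
    moves = {(1, 0): "down", (-1, 0): "up", (0, 1): "right", (0, -1): "left"}
    return [moves[(b[0] - a[0], b[1] - a[1])] for a, b in zip(path, path[1:])]

def navigate(instruction, start=(0, 0)):
    """Chain the four stages for a single instruction."""
    goal = LANDMARKS[understand_instruction(instruction)]
    path = plan(perceive(GRID), start, goal)
    return control(path) if path else []

print(navigate("Go to the table"))
```

In an LLM-enabled system, the keyword lookup and the planner would be the stages most likely replaced by model calls; the sketch only shows how the stages hand data to one another.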

Main Results:

LLM-based VLN methods show substantial improvements in instruction interpretation, cross-modal alignment, and reasoning-based planning.
Key challenges identified include reasoning complexity, spatial cognition, real-time efficiency, robustness, and Sim2Real adaptation.
The review covers essential components, datasets, evaluation, and deployment considerations for LLM-driven VLN.
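
This summary does not say which evaluation metrics the review covers. As an assumption, the sketch below computes two metrics commonly reported in VLN work: Success Rate (SR) and Success weighted by Path Length (SPL), where each successful episode contributes the ratio of shortest-path length to the larger of the taken and shortest path lengths. The episode records and their field names are hypothetical.

```python
def success_rate(episodes):
    """Fraction of episodes in which the agent reached the goal."""
    return sum(1 for e in episodes if e["success"]) / len(episodes)

def spl(episodes):
    """Success weighted by Path Length, averaged over all episodes."""
    total = 0.0
    for e in episodes:
        if e["success"]:
            # Efficiency is 1.0 for an optimal path and shrinks with detours.
            total += e["shortest"] / max(e["taken"], e["shortest"])
    return total / len(episodes)

# Hypothetical episode logs (path lengths in meters).
episodes = [
    {"success": True, "shortest": 10.0, "taken": 10.0},   # optimal path
    {"success": True, "shortest": 10.0, "taken": 20.0},   # reached goal with a detour
    {"success": False, "shortest": 10.0, "taken": 5.0},   # stopped early
]

print(success_rate(episodes), spl(episodes))
```

SPL penalizes the detour in the second episode even though it counts as a success, which is why it is often reported alongside SR.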

Conclusions:

LLM-driven VLN is progressing towards deeper cognitive integration, enhancing explainability, generalizability, and deployability.
Future research should focus on knowledge-enhanced navigation, multimodal integration, and world-model-based frameworks for embodied AI.
This review provides a comprehensive overview and roadmap for LLM-enabled embodied navigation systems.