



Large-Scale Model-Enhanced Vision-Language Navigation: Recent Advances, Practical Applications, and Future

Zecheng Li1,2,3, Xiaolin Meng1,2,3, Xu He1,2,3

  • 1School of Instrument Science and Engineering, Southeast University, Nanjing 210096, China.

Sensors (Basel, Switzerland) | April 14, 2026
Summary
This summary is machine-generated.

Large Language Models (LLMs) advance embodied AI by enabling robots to navigate 3D spaces using natural language. This review details LLM-driven Vision-Language Navigation (VLN) systems, focusing on challenges and future directions for real-world deployment.

Keywords:
edge deployment; embodied intelligence; large language models; vision-language navigation


Area of Science:

  • Artificial Intelligence (AI)
  • Embodied Cognition
  • Robotics

Background:

  • Vision-Language Navigation (VLN) aims to enable autonomous robots to navigate complex 3D environments using visual perception and natural language instructions.
  • Recent advancements leverage Large Language Models (LLMs) and Vision-Language Models (VLMs) for improved instruction interpretation, cross-modal alignment, and reasoning.
  • Existing surveys inadequately cover LLM-based VLN, particularly concerning Sim2Real transfer and edge deployment.

Purpose of the Study:

  • To provide a structured review of LLM-enabled Vision-Language Navigation (VLN) systems.
  • To analyze the evolution of VLN tasks, core components, challenges, and future research directions.
  • To address the gap in literature regarding LLM-based VLN for edge deployment and Sim2Real transfer.

Main Methods:

  • Structured review of LLM-enabled VLN, examining instruction understanding, environment perception, high-level planning, and low-level control.
  • Analysis of task evolution from path-following to goal-oriented and demand-driven navigation.
  • Summary of edge deployment requirements, datasets, and evaluation protocols.
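
The four components named above (instruction understanding, environment perception, high-level planning, low-level control) can be pictured as a simple loop. The following is only a toy sketch under stated assumptions, not the method of any system covered by the review: every function name, the `Observation` type, and the landmark vocabulary are hypothetical, and plain keyword matching stands in for the LLM and VLM stages.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Observation:
    """Hypothetical perception input: labels of objects currently in view."""
    visible_objects: List[str]


def understand_instruction(instruction: str, vocabulary: Set[str]) -> List[str]:
    """Instruction understanding (keyword stand-in for an LLM):
    return known landmark words in the order they are mentioned."""
    words = [w.strip(".,").lower() for w in instruction.split()]
    return [w for w in words if w in vocabulary]


def perceive(observation: Observation) -> Set[str]:
    """Environment perception (stand-in for a VLM): normalize object labels."""
    return {label.lower() for label in observation.visible_objects}


def plan(landmarks: List[str], visible: Set[str]) -> Optional[str]:
    """High-level planning: choose the first mentioned landmark that is visible."""
    for landmark in landmarks:
        if landmark in visible:
            return landmark
    return None


def control(subgoal: Optional[str]) -> str:
    """Low-level control: map the subgoal to a symbolic action string."""
    return f"move_toward({subgoal})" if subgoal else "rotate_to_explore"


def navigation_step(instruction: str, observation: Observation,
                    vocabulary: Set[str]) -> str:
    """Run one pass through all four stages and return the chosen action."""
    landmarks = understand_instruction(instruction, vocabulary)
    visible = perceive(observation)
    return control(plan(landmarks, visible))


if __name__ == "__main__":
    vocab = {"sofa", "door", "table"}
    text = "Walk past the sofa and stop at the door."
    print(navigation_step(text, Observation(["Door", "Lamp"]), vocab))
    print(navigation_step(text, Observation(["Lamp"]), vocab))
```

In a real system each stage would be replaced by a learned model, and the planner would track which landmarks have already been passed; the sketch only shows how the stages hand data to one another.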

Main Results:

  • LLM-based VLN methods show substantial improvements in instruction interpretation, cross-modal alignment, and reasoning-based planning.
  • Key challenges identified include reasoning complexity, spatial cognition, real-time efficiency, robustness, and Sim2Real adaptation.
  • The review covers essential components, datasets, evaluation, and deployment considerations for LLM-driven VLN.

Conclusions:

  • LLM-driven VLN is progressing towards deeper cognitive integration, enhancing explainability, generalizability, and deployability.
  • Future research should focus on knowledge-enhanced navigation, multimodal integration, and world-model-based frameworks for embodied AI.
  • This review provides a comprehensive overview and roadmap for LLM-enabled embodied navigation systems.