Correctable Landmark Discovery via Large Models for Vision-Language Navigation.

Bingqian Lin, Yunshuang Nie, Ziming Wei

IEEE Transactions on Pattern Analysis and Machine Intelligence

May 31, 2024

Summary

This summary is machine-generated.

This study introduces CONSOLE, a new Vision-Language Navigation (VLN) approach using large language models for better landmark discovery. CONSOLE improves navigation in unexplored areas by correcting landmark alignment with real-world observations.

Area of Science:

Artificial Intelligence
Robotics
Computer Vision

Background:

Vision-Language Navigation (VLN) agents struggle with accurate modality alignment, especially in novel environments due to limited training data.
Existing VLN methods lack sufficient open-world knowledge for aligning linguistic landmarks with visual observations.

Purpose of the Study:

To propose a novel VLN paradigm, COrrectable LaNdmark DiScOvery via Large ModEls (CONSOLE), for improved landmark discovery and navigation.
To enhance VLN agents' ability to align language instructions with visual observations in unexplored scenes.

Main Methods:

CONSOLE frames VLN as an open-world sequential landmark discovery problem.
It queries ChatGPT for commonsense landmark co-occurrence knowledge and uses CLIP, guided by these priors, to discover landmarks in the visual observations.
A learnable co-occurrence scoring module corrects noise in the priors using actual observations, and an observation enhancement strategy integrates the corrected landmark features for action decisions.
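The interplay between priors and correction described above can be illustrated with a small sketch. This is not the authors' implementation: the function name `discover_landmark`, the `prior_weight` parameter, and all numbers are hypothetical. Plain floats stand in for CLIP similarity scores, a dictionary stands in for the co-occurrence knowledge obtained from ChatGPT, and a fixed dictionary of multipliers stands in for the learned scoring module. The observation enhancement step is not covered.

```python
from typing import Dict, List, Tuple

# (landmark, object) -> value. Used both for the co-occurrence priors
# (stand-in for LLM output) and for the correction multipliers
# (stand-in for the learned scoring module). All values are made up.
PairTable = Dict[Tuple[str, str], float]


def discover_landmark(
    landmark: str,
    views: List[dict],
    prior: PairTable,
    correction: PairTable,
    prior_weight: float = 0.5,  # hypothetical mixing weight
) -> int:
    """Return the index of the candidate view that best matches `landmark`."""
    best_idx, best_score = -1, float("-inf")
    for idx, view in enumerate(views):
        # Rescale each prior by its correction factor (1.0 = trust the prior).
        corrected = [
            prior.get((landmark, obj), 0.0) * correction.get((landmark, obj), 1.0)
            for obj in view["objects"]
        ]
        cooc = sum(corrected) / len(corrected) if corrected else 0.0
        # "similarity" is a stand-in for a CLIP text-image similarity score.
        score = view["similarity"] + prior_weight * cooc
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx


# Two candidate views; the prior for ("bathroom", "tv") is deliberately noisy.
views = [
    {"similarity": 0.35, "objects": ["tv"]},
    {"similarity": 0.30, "objects": ["sink"]},
]
prior = {("bathroom", "tv"): 0.9, ("bathroom", "sink"): 0.7}

uncorrected = discover_landmark("bathroom", views, prior, {})
corrected = discover_landmark("bathroom", views, prior, {("bathroom", "tv"): 0.1})
```

With the noisy prior left as-is, the sketch selects the view containing the TV. Once the correction factor down-weights that pair, it selects the view containing the sink. In CONSOLE the correction is learned from observations rather than supplied by hand.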

Main Results:

CONSOLE outperforms strong baselines across multiple VLN benchmarks (R2R, REVERIE, R4R, RxR).
The framework establishes new state-of-the-art results on the R2R and R4R benchmarks, particularly in unseen scenarios.
CONSOLE effectively improves landmark discovery and modality alignment in challenging, unexplored environments.

Conclusions:

CONSOLE offers a powerful new paradigm for Vision-Language Navigation by integrating large language models for enhanced landmark discovery.
The proposed correctable landmark discovery scheme and observation enhancement strategy significantly boost VLN performance, especially in complex and unseen environments.
This work paves the way for more robust and adaptable VLN agents capable of navigating diverse and open-world settings.