ImagineNav++: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination.

Teng Wang, Xinxin Zhao, Wenzhe Cai

    IEEE Transactions on Pattern Analysis and Machine Intelligence | April 28, 2026
    Summary
    This summary is machine-generated.

    ImagineNav++ lets autonomous robots navigate complex environments without maps by prompting Vision-Language Models (VLMs) with imagined future views. The framework reports state-of-the-art performance in mapless visual navigation.

    Area of Science:

    • Robotics
    • Artificial Intelligence
    • Computer Vision

    Background:

    • Visual navigation is crucial for autonomous robots, especially in home-assistance tasks like object search.
    • Current Large Language Models (LLMs) struggle with spatial reasoning because textual representations are a limited medium for navigation.
    • Methods are needed that use visual data directly for robot navigation and planning.

    Purpose of the Study:

    • To investigate the potential of Vision-Language Models (VLMs) for mapless visual navigation using only onboard RGB/RGB-D streams.
    • To develop an imagination-powered navigation framework that enhances spatial perception and planning capabilities.
    • To overcome the limitations of text-based planning in current LLMs for robot navigation.

    Main Methods:

    • Developed ImagineNav++, an imagination-powered navigation framework for robots.
    • Introduced a future-view imagination module to generate viewpoints with high exploration potential.
    • Implemented a selective foveation memory mechanism for hierarchical integration of keyframe observations.
    • Transformed complex navigation into a best-view image selection problem for VLMs.
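
The best-view selection idea in the methods above can be illustrated with a minimal sketch. It is not the authors' implementation: `CandidateView`, `KeyframeMemory`, `select_best_view`, and `toy_scorer` are hypothetical names, the two-tier "coarse/detailed" memory is only one assumed reading of hierarchical keyframe integration, and the scorer is a stand-in for a real VLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class CandidateView:
    """An imagined future view (hypothetical structure, not from the paper)."""
    waypoint: Tuple[float, float, float]  # assumed (x, y, heading) in the robot frame
    image_id: str                         # placeholder for an imagined RGB frame


@dataclass
class KeyframeMemory:
    """Assumed two-tier memory: recent keyframes detailed, older ones coarse."""
    detailed_slots: int = 2
    keyframes: List[str] = field(default_factory=list)

    def add(self, image_id: str) -> None:
        self.keyframes.append(image_id)

    def summary(self) -> List[Tuple[str, str]]:
        # Everything before `cut` is older than the detailed window.
        cut = max(len(self.keyframes) - self.detailed_slots, 0)
        return [
            (image_id, "coarse" if index < cut else "detailed")
            for index, image_id in enumerate(self.keyframes)
        ]


# Stand-in signature for a VLM query: (view, memory summary, goal) -> score.
Scorer = Callable[[CandidateView, List[Tuple[str, str]], str], float]


def select_best_view(
    candidates: Sequence[CandidateView],
    memory: KeyframeMemory,
    goal: str,
    score_view: Scorer,
) -> CandidateView:
    """Treat navigation as choosing the highest-scoring imagined view."""
    if not candidates:
        raise ValueError("no candidate views to choose from")
    context = memory.summary()
    return max(candidates, key=lambda view: score_view(view, context, goal))


def toy_scorer(view: CandidateView, context: List[Tuple[str, str]], goal: str) -> float:
    """Toy replacement for a VLM: rewards unseen views and goal-name matches."""
    seen = {image_id for image_id, _ in context}
    novelty = 0.0 if view.image_id in seen else 1.0
    goal_hint = 1.0 if goal in view.image_id else 0.0
    return novelty + 2.0 * goal_hint


memory = KeyframeMemory(detailed_slots=2)
for frame in ("hall", "kitchen", "sofa_room"):
    memory.add(frame)

candidates = [
    CandidateView((1.0, 0.0, 0.0), "hall"),
    CandidateView((0.0, 1.0, 1.57), "bedroom"),
    CandidateView((-1.0, 0.0, 3.14), "chair_corner"),
]
best = select_best_view(candidates, memory, goal="chair", score_view=toy_scorer)
print(best.image_id)  # prints "chair_corner"
```

In a real system the scorer would presumably be replaced by a prompt that shows the VLM the candidate images together with the memory summary and asks it to pick one; the sketch only shows how the decision reduces to a selection over candidate views.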

    Main Results:

    • ImagineNav++ achieved state-of-the-art performance in mapless visual navigation.
    • The framework surpassed most map-based methods on open-vocabulary object navigation and instance navigation benchmarks.
    • Demonstrated the effectiveness of scene imagination and memory in VLM-based spatial reasoning.

    Conclusions:

    • VLMs can achieve effective mapless visual navigation by leveraging imagined future views and robust memory mechanisms.
    • ImagineNav++ offers a promising direction for enhancing robot autonomy and task execution in complex environments.
    • Scene imagination and memory are critical components for advanced VLM-based spatial reasoning in robotics.