A Survey on Vision-Language-Action Models for Embodied AI.

Yueen Ma, Zixing Song, Yuzheng Zhuang

    IEEE Transactions on Neural Networks and Learning Systems | April 28, 2026
    Summary
    This summary is machine-generated.

    This survey reviews vision-language-action (VLA) models, which are central to embodied artificial intelligence (AI) and are regarded as a step toward artificial general intelligence (AGI). It categorizes VLA research, compiles supporting resources, and outlines future directions for AI agents operating in the physical world.

    Area of Science:

    • Robotics
    • Artificial Intelligence
    • Computer Vision

    Background:

    • Embodied AI, which requires controlling physical agents, is regarded as a key step toward artificial general intelligence (AGI).
    • Large language models (LLMs) and vision-language models (VLMs) have advanced AI capabilities.
    • Vision-language-action (VLA) models bridge language understanding with physical action generation for embodied AI tasks.

    Purpose of the Study:

    • To provide the first comprehensive survey of vision-language-action (VLA) models in embodied AI.
    • To establish a taxonomy for the rapidly evolving field of VLA models.
    • To consolidate resources and identify future research trajectories.

    Main Methods:

    • Categorization of VLA research into three main areas: individual components, VLA-based control policies for low-level actions, and high-level task planners.
    • Compilation of datasets, simulators, and benchmarks relevant to VLA research.
    • Analysis of current challenges and future directions in embodied AI.
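
The three-way categorization above can be illustrated with a minimal Python sketch. It is not taken from the survey: the names `VLACategory`, `Action`, `toy_planner`, `toy_policy`, and `run` are hypothetical, and the planner and policy are toy stand-ins for learned models. The sketch only shows how a high-level task planner and a low-level control policy might relate under these assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class VLACategory(Enum):
    """The three research areas named in the categorization (labels are illustrative)."""
    COMPONENT = "individual components"
    CONTROL_POLICY = "VLA-based control policy for low-level actions"
    TASK_PLANNER = "high-level task planner"


@dataclass
class Action:
    """Hypothetical low-level action, reduced to a single named command."""
    command: str


# Hypothetical interfaces: a planner maps an instruction to subtasks;
# a policy maps (observation, subtask) to one low-level action.
Planner = Callable[[str], List[str]]
Policy = Callable[[str, str], Action]


def toy_planner(instruction: str) -> List[str]:
    """Toy stand-in for a task planner: split the instruction on ' then '."""
    return [step.strip() for step in instruction.split(" then ")]


def toy_policy(observation: str, subtask: str) -> Action:
    """Toy stand-in for a control policy: echo the subtask as a command."""
    return Action(command=f"execute:{subtask}")


def run(instruction: str, observation: str,
        planner: Planner = toy_planner,
        policy: Policy = toy_policy) -> List[Action]:
    """Decompose the instruction, then predict one action per subtask."""
    return [policy(observation, subtask) for subtask in planner(instruction)]


actions = run("pick up the cup then place it on the shelf", "camera frame 0")
print([a.command for a in actions])
```

In a real system the observation would be image data and both callables would be learned models; strings are used here only to keep the sketch self-contained.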

    Main Results:

    • A structured taxonomy detailing the landscape of VLA models.
    • Identification of two main uses of VLA models: predicting low-level actions and decomposing complex tasks.
    • A comprehensive overview of essential resources for VLA research and development.

    Conclusions:

    • VLA models are a critical advancement for embodied AI, enabling robots to perform complex, language-conditioned tasks.
    • The survey provides a foundational understanding and roadmap for future research in VLA models and embodied AI.
    • Continued development in VLA models promises significant progress towards artificial general intelligence.