Value Explicit Pretraining for Learning Transferable Representations.

Kiran Lekkala¹, Henghui Bao¹, Sumedh A Sontakke¹

¹Thomas Lord Department of Computer Science at the University of Southern California.

IEEE Robotics and Automation Letters

June 15, 2026

Summary

This summary is machine-generated.

Value Explicit Pretraining (VEP) improves reinforcement learning by using suboptimal data to learn generalizable visual representations. The method strengthens transfer learning, enabling faster adaptation to new tasks with higher rewards and better sample efficiency.

Area of Science:

Artificial Intelligence
Machine Learning
Computer Vision

Background:

Reinforcement learning agents struggle to interpret visual inputs when the environment changes.
Generalizable representations are crucial for efficient transfer learning in dynamic environments.

Purpose of the Study:

To introduce Value Explicit Pretraining (VEP), a novel method for learning generalizable representations in transfer reinforcement learning.
To enable efficient adaptation to new tasks by learning representations invariant to environmental dynamics and appearance variations.

Main Methods:

VEP utilizes suboptimal, unlabeled demonstration data (observations and sparse rewards) for pretraining an encoder.
A self-supervised contrastive loss is employed, relating states across tasks using Monte Carlo value estimates to reflect task progress.
This results in temporally smooth representations that capture task objectives.
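
The two ingredients named above, Monte Carlo value estimates and a contrastive loss that relates states across tasks, can be sketched roughly as follows. This is an illustrative sketch and not the authors' implementation: the summary does not give the exact loss, so the InfoNCE form, the nearest-value rule for choosing positives, the cosine similarity, and the parameters `gamma` and `temperature` are all assumptions, as are the function names.

```python
import math


def monte_carlo_values(rewards, gamma=0.99):
    """Discounted return-to-go for every step of one trajectory.

    With sparse rewards, the value rises as the state gets closer to the
    reward, so it can serve as a proxy for task progress.
    """
    values = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        values[t] = running
    return values


def cosine(u, v):
    """Cosine similarity of two non-zero vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def value_matched_info_nce(anchors, anchor_values,
                           candidates, candidate_values,
                           temperature=0.1):
    """InfoNCE-style loss over embeddings from two tasks (assumed form).

    For each anchor embedding, the positive is the candidate whose value
    estimate is closest to the anchor's; every other candidate is a negative.
    """
    total = 0.0
    for z, v in zip(anchors, anchor_values):
        # Hypothetical positive-selection rule: nearest value estimate.
        pos = min(range(len(candidates)),
                  key=lambda j: abs(candidate_values[j] - v))
        logits = [cosine(z, c) / temperature for c in candidates]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_norm = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_norm - logits[pos]
    return total / len(anchors)


# Toy usage: two short trajectories with a single sparse reward at the end.
values_a = monte_carlo_values([0.0, 0.0, 1.0], gamma=0.9)
values_b = monte_carlo_values([0.0, 1.0], gamma=0.9)
# Hand-written 2-D embeddings stand in for encoder outputs.
emb_a = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
emb_b = [[0.6, 0.8], [0.0, 1.0]]
loss = value_matched_info_nce(emb_a, values_a, emb_b, values_b)
```

In an actual pretraining setup the embeddings would come from the encoder being trained and the loss would be minimized by gradient descent; the sketch only shows how value estimates can decide which cross-task state pairs are pulled together.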

Main Results:

VEP outperforms state-of-the-art pretraining methods in generalizing to unseen tasks.
Experiments on Ant locomotion, a navigation simulator, and Atari benchmarks demonstrate significant improvements.
VEP achieves up to a 2x increase in rewards and up to a 3x improvement in sample efficiency.

Conclusions:

VEP effectively learns generalizable representations from suboptimal data for transfer reinforcement learning.
The method adapts to new tasks better than existing approaches.
VEP offers a promising direction for enhancing the efficiency and adaptability of reinforcement learning agents.