Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation.

Hangkai Hu, Gao Huang, Xiang Li

IEEE Transactions on Neural Networks and Learning Systems

August 31, 2021

Summary

This summary is machine-generated.

This study introduces a novel off-policy meta-reinforcement learning (meta-RL) algorithm to improve sample efficiency and task adaptation. The method enhances meta-learners' ability to balance general and task-specific knowledge for better performance.

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Deep reinforcement learning (RL) faces challenges in sample efficiency and task migration.
Meta-reinforcement learning (meta-RL) aims to improve adaptation to new tasks by leveraging prior experience.
Current meta-RL methods struggle to effectively integrate task-agnostic data exploitation with task-specific latent context.

Purpose of the Study:

To develop an off-policy meta-RL algorithm that enhances meta-learners' self-awareness regarding task adaptation.
To improve the effectiveness and generalization ability of meta-RL agents.
To enable meta-learners to better balance general learning strategies with task-specific information.

Main Methods:

Developed an off-policy meta-RL algorithm incorporating dynamic task-adaptiveness distillation.
Implemented latent context reorganization to balance task-agnostic and task-related information.
Evaluated the algorithm's performance against existing methods like Probabilistic Embeddings for Actor-Critic RL (PEARL).
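
For readers unfamiliar with the PEARL baseline, the sketch below shows how a PEARL-style agent is commonly described as inferring its latent task context: each observed transition contributes a Gaussian factor, and the factors are fused into one posterior by a product of Gaussians. This illustrates the baseline's context inference only, not the distillation or context reorganization proposed in this study. The function names and toy numbers are hypothetical.

```python
import math
import random


def product_of_gaussians(mus, variances):
    """Fuse per-transition factors N(mu_i, var_i) into one Gaussian per latent dimension.

    mus, variances: lists of equal length, one entry per transition,
    each entry a list with one value per latent dimension.
    """
    latent_dim = len(mus[0])
    fused_mu, fused_var = [], []
    for d in range(latent_dim):
        # Precisions (inverse variances) add up; clamp to avoid division by zero.
        precision = sum(1.0 / max(v[d], 1e-7) for v in variances)
        weighted = sum(m[d] / max(v[d], 1e-7) for m, v in zip(mus, variances))
        fused_var.append(1.0 / precision)
        fused_mu.append(weighted / precision)
    return fused_mu, fused_var


def sample_latent(mu, var, rng):
    """Draw a latent context z ~ N(mu, var) to condition the policy on."""
    return [m + math.sqrt(v) * rng.gauss(0.0, 1.0) for m, v in zip(mu, var)]


# Toy inputs (hypothetical): two transitions, two latent dimensions.
mus = [[0.0, 1.0], [2.0, 1.0]]
variances = [[1.0, 0.5], [1.0, 0.5]]
mu, var = product_of_gaussians(mus, variances)
z = sample_latent(mu, var, random.Random(0))
```

Because precisions add, the posterior tightens as more transitions from the same task are observed, which is what lets an off-policy agent adapt from a small amount of context.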

Main Results:

The proposed algorithm demonstrated improved adaptation capabilities in meta-learners.
Dynamic task-adaptiveness distillation effectively guided the adjustment of exploration strategies during meta-training.
Experimental results showed a 10%-20% higher asymptotic reward compared to PEARL.
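
A relative figure such as the one above can be computed as sketched below. The summary does not say how asymptotic reward was measured, so this sketch assumes it is the mean of the last few evaluation returns on each learning curve. The learning curves are synthetic numbers chosen for illustration, not data from the study.

```python
def asymptotic_reward(returns, tail=3):
    """Mean of the last `tail` evaluation returns (assumed proxy for asymptotic reward)."""
    window = returns[-tail:]
    return sum(window) / len(window)


def relative_gain(candidate, baseline):
    """Fractional improvement of candidate over baseline."""
    if baseline == 0:
        raise ValueError("baseline must be nonzero for a relative comparison")
    return (candidate - baseline) / abs(baseline)


# Synthetic learning curves (hypothetical, illustrative only).
baseline_curve = [10.0, 50.0, 90.0, 100.0, 110.0]
candidate_curve = [20.0, 70.0, 105.0, 115.0, 125.0]

gain = relative_gain(
    asymptotic_reward(candidate_curve),
    asymptotic_reward(baseline_curve),
)
print(f"relative gain: {gain:.0%}")
```

With these toy curves the baseline settles at 100 and the candidate at 115, so the relative gain is 15%.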

Conclusions:

The proposed meta-RL approach strengthens meta-learners' awareness of their own task adaptation, which in turn improves that adaptation.
Balancing general and specific knowledge through latent context reorganization is crucial for meta-RL effectiveness.
The reported gains over PEARL indicate that the algorithm improves both meta-RL performance and generalization.