Efficient Offline Reinforcement Learning With Relaxed Conservatism.

Longyang Huang, Botao Dong, Weidong Zhang

    IEEE Transactions on Pattern Analysis and Machine Intelligence | February 12, 2024
    Summary
    This summary is machine-generated.

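    This study introduces ORL-RC, an offline reinforcement learning (RL) framework with relaxed conservatism, to address the overly conservative Q-functions and policies learned by existing offline RL methods. ORL-RC learns a Q-function closer to the true Q-function, which improves policy performance, and it outperforms state-of-the-art offline RL methods on the D4RL benchmark.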

    Area of Science:

    • Artificial Intelligence
    • Machine Learning
    • Robotics

    Background:

    • Offline reinforcement learning (RL) aims to learn optimal policies from static datasets, without further interaction with the environment.
    • Existing offline RL methods tend to learn overly conservative Q-functions and policies, which can degrade performance.
    • Theoretical understanding of offline RL conservatism requires further investigation.
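
To make the background concrete, here is a minimal tabular sketch of learning Q-values from a fixed dataset only, with an optional pessimism term. It is not the ORL-RC algorithm: the three-state toy dataset, the function name `offline_q_iteration`, and the count-based `penalty` are all assumptions chosen for illustration, with the penalty standing in for the kind of conservatism described above.

```python
import numpy as np

# Hypothetical toy dataset of (state, action, reward, next_state, done) tuples.
# State 2 is terminal; the pair (state=1, action=0) never appears in the data.
DATASET = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 1, 1.0, 2, True),
]


def offline_q_iteration(dataset, n_states=3, n_actions=2, gamma=0.9,
                        penalty=0.0, n_iters=200):
    """Batch Q-iteration that only ever touches the logged transitions.

    `penalty` is an illustrative count-based pessimism term (an assumption,
    not taken from the paper): each backup is lowered by penalty / sqrt(count).
    """
    counts = np.zeros((n_states, n_actions))
    for s, a, _, _, _ in dataset:
        counts[s, a] += 1
    seen = counts > 0

    q = np.zeros((n_states, n_actions))
    for _ in range(n_iters):
        # Accumulate Bellman targets using the previous Q-table.
        target_sum = np.zeros_like(q)
        for s, a, r, s_next, done in dataset:
            bootstrap = 0.0 if done else gamma * q[s_next].max()
            target_sum[s, a] += r + bootstrap
        # Update only the state-action pairs that occur in the dataset;
        # unseen pairs keep their initial value of 0.
        new_q = q.copy()
        new_q[seen] = (target_sum[seen] / counts[seen]
                       - penalty / np.sqrt(counts[seen]))
        q = new_q
    return q


plain_q = offline_q_iteration(DATASET)
conservative_q = offline_q_iteration(DATASET, penalty=0.1)
```

In this toy problem the unpenalized table matches the true optimal values on every pair that appears in the data, while the penalized table sits below it on each of those pairs. That gap is the sort of underestimation that a relaxed-conservatism method aims to shrink.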

    Purpose of the Study:

    • To propose a simple and efficient offline RL framework with relaxed conservatism (ORL-RC).
    • To analyze the conservatism of learned Q-functions and policies in offline RL.
    • To theoretically establish convergence and bounds for the proposed ORL-RC framework.

    Main Methods:

    • Developed the offline RL with relaxed conservatism (ORL-RC) framework.
    • Analyzed the conservatism of Q-functions and policies in offline RL.
    • Established theoretical convergence results and bounds for learned Q-functions, considering sampling errors.

    Main Results:

    • Demonstrated that conservatism in offline RL can lead to policy performance degradation.
    • The proposed ORL-RC framework learns a Q-function closer to the true Q-function.
    • Experimental results on the D4RL benchmark show ORL-RC outperforms state-of-the-art offline RL methods.
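
The summary does not say which metric was used. D4RL results are conventionally reported as normalized scores, where 0 corresponds to a random policy and 100 to an expert policy, so a small helper for that convention may be useful when reading such comparisons. The function name and the reference returns in the example are hypothetical placeholders, not values from the paper or from D4RL.

```python
def normalized_score(raw_return, random_return, expert_return):
    """Rescale an episode return so that random = 0 and expert = 100.

    The reference returns are per-environment constants; the caller must
    supply them (the values used below are made-up placeholders).
    """
    if expert_return == random_return:
        raise ValueError("expert and random reference returns must differ")
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)


# Hypothetical example: a policy halfway between the two references.
example = normalized_score(raw_return=50.0, random_return=0.0, expert_return=100.0)
```

Scores above 100 are possible under this convention when a learned policy exceeds the expert reference return.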

    Conclusions:

    • ORL-RC effectively addresses conservatism issues in offline RL.
    • The framework offers improved Q-function approximation and policy performance.
    • ORL-RC advances offline reinforcement learning by relaxing conservatism while retaining theoretical convergence results and bounds.