Int-HRL: towards intention-based hierarchical reinforcement learning.

Anna Penzkofer¹, Simon Schaefer², Florian Strohm¹

¹Institute for Visualisation and Interactive Systems, University of Stuttgart, Pfaffenwaldring 5A, 70569 Stuttgart, Germany.

Neural Computing & Applications

August 4, 2025

Summary

This summary is machine-generated.

This study introduces Int-HRL, a new hierarchical reinforcement learning (HRL) method. It predicts human intentions from eye gaze and uses them to create sub-goals automatically, improving sample efficiency in challenging reinforcement learning (RL) tasks.

Keywords:

Eye gaze; Hierarchical reinforcement learning; Intention prediction; Montezuma's Revenge; Sub-goal extraction

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Deep reinforcement learning (RL) agents perform well on many tasks but require large amounts of training data.
Hierarchical RL (HRL) improves sample efficiency using structural information but relies on human-annotated sub-goals.
Discovering effective sub-goals is a major challenge in HRL for complex, long-horizon tasks.
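
To make the role of sub-goals concrete, here is a minimal toy sketch of the general HRL idea, not the Int-HRL implementation. It assumes a hypothetical 8-cell corridor as the environment, a tabular Q-learning controller that receives an intrinsic reward only when it reaches the sub-goal it was given, and a fixed, hand-written plan standing in for the high-level policy. All names (`train_controller`, `run_plan`, the corridor itself) are illustrative assumptions.

```python
import random

N_STATES = 8          # positions 0..7 on a 1-D corridor (toy stand-in for a game level)
ACTIONS = (-1, +1)    # step left / step right


def step(state, action):
    """Move along the corridor, clipping at the walls."""
    return min(max(state + action, 0), N_STATES - 1)


def train_controller(subgoals, episodes=2000, alpha=0.5, gamma=0.9, eps=0.3, seed=0):
    """Tabular Q-learning for a goal-conditioned low-level controller.

    The controller gets an intrinsic reward of +1 only on reaching the
    sub-goal it was handed; no extrinsic (environment) reward is used.
    """
    rng = random.Random(seed)
    q = {(s, g, a): 0.0 for s in range(N_STATES) for g in subgoals for a in ACTIONS}
    for _ in range(episodes):
        goal = rng.choice(subgoals)
        state = rng.randrange(N_STATES)
        for _ in range(4 * N_STATES):
            if state == goal:
                break
            if rng.random() < eps:          # explore
                action = rng.choice(ACTIONS)
            else:                           # exploit current estimate
                action = max(ACTIONS, key=lambda a: q[(state, goal, a)])
            nxt = step(state, action)
            if nxt == goal:
                target = 1.0                # intrinsic reward, terminal for this sub-goal
            else:
                target = gamma * max(q[(nxt, goal, a)] for a in ACTIONS)
            q[(state, goal, action)] += alpha * (target - q[(state, goal, action)])
            state = nxt
    return q


def run_plan(q, start, plan, max_steps=50):
    """Follow an ordered list of sub-goals with the greedy low-level controller."""
    state, reached = start, []
    for goal in plan:
        for _ in range(max_steps):
            if state == goal:
                break
            action = max(ACTIONS, key=lambda a: q[(state, goal, a)])
            state = step(state, action)
        if state != goal:
            break                           # controller failed; abandon the plan
        reached.append(goal)
    return state, reached


q_table = train_controller(subgoals=[3, 7])
print(run_plan(q_table, start=0, plan=[3, 7]))
```

In this sketch the sub-goals `[3, 7]` are supplied by hand, which is exactly the manual annotation step that the study aims to replace.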

Purpose of the Study:

To develop a novel HRL method that reduces the need for human-annotated sub-goals.
To leverage human intention prediction from eye gaze for automated sub-goal generation.
To enhance sample efficiency in challenging RL environments like Montezuma's Revenge.

Main Methods:

Predicting human player intentions from eye gaze data.
Developing an automatic sub-goal extraction pipeline based on predicted intentions.
Implementing Intention-based Hierarchical Reinforcement Learning (Int-HRL).
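
The summary does not describe how the extraction pipeline works internally, so the following is only a hedged sketch of one plausible way to turn a gaze trace into an ordered list of candidate sub-goals. It assumes per-frame gaze coordinates and known object centres, and it uses a simple nearest-object and dwell-time heuristic rather than the learned intention predictor the study refers to. The function name, the `radius` and `min_dwell` parameters, and the object names are all hypothetical.

```python
from math import hypot


def extract_subgoals(gaze, objects, radius=20.0, min_dwell=3):
    """Turn a gaze trace into an ordered list of candidate sub-goals.

    gaze:    list of (x, y) gaze points, one per frame
    objects: non-empty dict mapping object name -> (x, y) centre
    A run of at least `min_dwell` consecutive frames on the same object
    counts as attending to it; consecutive repeats are merged.
    """
    def attended(point):
        # Nearest object within `radius`, or None if the gaze is on background.
        name, centre = min(
            objects.items(),
            key=lambda kv: hypot(point[0] - kv[1][0], point[1] - kv[1][1]),
        )
        dist = hypot(point[0] - centre[0], point[1] - centre[1])
        return name if dist <= radius else None

    subgoals, current, run = [], None, 0
    for point in gaze:
        label = attended(point)
        run = run + 1 if label == current else 1
        current = label
        # Register the object once, at the moment the dwell threshold is met.
        if label is not None and run == min_dwell:
            if not subgoals or subgoals[-1] != label:
                subgoals.append(label)
    return subgoals


# Hypothetical scene: the player looks at the ladder, glances at background,
# dwells on the key, then only briefly looks back at the ladder.
scene = {"ladder": (50, 50), "key": (10, 90)}
trace = [(48, 52)] * 4 + [(30, 70)] + [(12, 88)] * 3 + [(50, 50)] * 2
print(extract_subgoals(trace, scene))  # ['ladder', 'key']
```

The dwell threshold filters out brief glances, so the final two-frame look at the ladder does not add a third sub-goal. An ordered list like this could then serve as the plan for a hierarchical agent.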

Main Results:

Human intentions can be robustly predicted from eye gaze in long-horizon, sparse-reward tasks.
The proposed automatic sub-goal extraction pipeline effectively replaces manual annotation.
Int-HRL demonstrates significantly improved sample efficiency compared to previous HRL methods.

Conclusions:

Eye gaze-based intention prediction offers a viable alternative to manual sub-goal annotation in HRL.
Int-HRL significantly enhances sample efficiency, making complex RL tasks more tractable.
This approach paves the way for more autonomous and efficient learning agents.