Learning Intention-Aware Policies in Deep Reinforcement Learning.

T Zhao¹, S Wu¹, G Li¹

¹College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin 300457, P.R.C. tingting@tust.edu.cn.

Neural Computation

July 31, 2023

Summary

This summary is machine-generated.

This study introduces intention-aware policy learning for deep reinforcement learning (DRL) agents. The new method allows agents to incorporate human-like intentions into their decision-making for improved control.

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

In deep reinforcement learning (DRL), an agent's policy typically depends on the current state, historical memory, and the policy model's parameters.
Humans also act on intentions, such as moving fast or slow, which these traditional policy inputs do not capture.

Purpose of the Study:

To develop an intention-aware policy learning method for DRL agents.
To enable agents to select actions that incorporate specific intentions, mimicking human behavior.

Main Methods:

Formalized an intention-aware policy by integrating intention information into the policy model.
Optimized the policy by maximizing cumulative rewards and mutual information (MI) between intention and action.
Derived an efficient approximation of the MI objective for practical implementation.
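
The objective described above, cumulative reward plus the mutual information between intention and action, can be illustrated with a small sketch. This is not the authors' implementation: it assumes discrete intentions and actions whose joint distribution is known, so the MI can be computed exactly instead of approximated, and the names `mutual_information`, `intention_aware_objective`, `gamma`, and `beta` are hypothetical choices made for this example.

```python
import math


def mutual_information(joint):
    """MI in nats between intention (rows) and action (columns).

    `joint` is a table of joint probabilities p(intention, action).
    """
    p_intention = [sum(row) for row in joint]
    p_action = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0.0:  # zero-probability cells contribute nothing
                mi += p * math.log(p / (p_intention[i] * p_action[j]))
    return mi


def intention_aware_objective(rewards, joint, gamma=0.99, beta=0.1):
    """Discounted return plus a beta-weighted MI term.

    gamma (discount) and beta (MI weight) are assumed hyperparameters,
    not values taken from the paper.
    """
    discounted_return = sum((gamma ** t) * r for t, r in enumerate(rewards))
    return discounted_return + beta * mutual_information(joint)


# Actions chosen independently of the intention carry no information about it.
independent = [[0.25, 0.25], [0.25, 0.25]]
# Actions fully determined by a two-valued intention carry log(2) nats.
deterministic = [[0.5, 0.0], [0.0, 0.5]]

print(mutual_information(independent))
print(mutual_information(deterministic))
print(intention_aware_objective([1.0, 1.0], deterministic, gamma=0.5, beta=1.0))
```

The MI term rewards policies whose actions reveal the intention, so two policies with the same return are ranked by how strongly the intention shapes behavior. In continuous control tasks the joint distribution is not available in closed form, which is presumably why the study derives an approximation of the MI objective.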

Main Results:

Demonstrated the effectiveness of the intention-aware policy in control tasks.
Showcased improved agent performance in classical MuJoCo and multigoal chain walking tasks.

Conclusions:

The proposed intention-aware policy learning method enhances DRL agents' decision-making capabilities.
Incorporating intentions makes agent actions more human-like and adaptable.