



Policy Gradient From Demonstration and Curiosity.

Jie Chen, Wenjun Xu

    IEEE Transactions on Cybernetics | March 3, 2022
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces a new reinforcement learning algorithm that improves exploration and learning from limited expert demonstrations. The method enhances agent performance in tasks with sparse rewards by integrating policy gradients with expert divergence and uncertainty estimation.


    Area of Science:

    • Artificial Intelligence
    • Machine Learning
    • Robotics

    Background:

    • Reinforcement learning (RL) agents learn complex behaviors from task abstractions.
    • Exploration and reward shaping are challenging in RL, especially with sparse extrinsic feedback.
    • Existing methods often require numerous high-quality expert demonstrations, which are difficult to obtain.

    Purpose of the Study:

    • To propose an integrated policy gradient algorithm for enhanced exploration and intrinsic reward learning.
    • To address the challenge of learning from a limited number of expert demonstrations in RL.
    • To improve agent performance in environments with sparse reward signals.
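For readers unfamiliar with the term, the basic idea of a policy gradient can be sketched on a toy problem. The example below is not the paper's algorithm: it assumes a hypothetical three-armed bandit with fixed rewards and a softmax policy, and it uses the exact gradient of the expected return rather than sampled trajectories. All function names and hyperparameters (`train`, `steps`, `lr`) are illustrative.

```python
import math


def softmax(logits):
    """Turn logits into action probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_return(logits, rewards):
    """J(theta) = sum_a pi(a) * r(a) for a one-step (bandit) task."""
    probs = softmax(logits)
    return sum(p * r for p, r in zip(probs, rewards))


def policy_gradient(logits, rewards):
    """Exact gradient of J with respect to each logit: pi_k * (r_k - J)."""
    probs = softmax(logits)
    j = sum(p * r for p, r in zip(probs, rewards))
    return [p * (r - j) for p, r in zip(probs, rewards)]


def train(rewards, steps=500, lr=0.5):
    """Plain gradient ascent on the logits, starting from a uniform policy."""
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        grad = policy_gradient(logits, rewards)
        logits = [t + lr * g for t, g in zip(logits, grad)]
    return logits


if __name__ == "__main__":
    rewards = [0.0, 0.0, 1.0]  # hypothetical per-arm rewards
    print(softmax(train(rewards)))
```

In a sparse-reward task most actions return zero, so this gradient carries little signal until a rewarding action is found. That is the gap that demonstration-based and curiosity-based reward terms are meant to fill.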

    Main Methods:

    • Developed an integrated policy gradient algorithm.
    • Reformulated the reward function with Jensen-Shannon divergence between policy and expert demonstrations.
    • Incorporated an agent's environmental uncertainty estimation into the reward function.
    • Evaluated the algorithm on simulated tasks with sparse rewards and limited demonstrations.
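The reward reformulation described above can be illustrated with a minimal sketch. The summary does not give the paper's exact formula, so everything specific here is an assumption: discrete action distributions, a closed-form Jensen-Shannon divergence, a scalar `uncertainty` value standing in for the agent's uncertainty estimate, and the weights `imitation_weight` and `curiosity_weight`. With sampled trajectories the divergence would normally be estimated rather than computed directly.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, bounded by ln(2) with natural logs."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def shaped_reward(extrinsic, policy_probs, expert_probs, uncertainty,
                  imitation_weight=1.0, curiosity_weight=0.1):
    """Hypothetical combined reward (assumed form, not the paper's):
    extrinsic reward, minus a penalty for diverging from the expert,
    plus a bonus for visiting states the agent is uncertain about."""
    imitation_penalty = js_divergence(policy_probs, expert_probs)
    return (extrinsic
            - imitation_weight * imitation_penalty
            + curiosity_weight * uncertainty)


if __name__ == "__main__":
    policy = [0.7, 0.2, 0.1]   # illustrative action probabilities
    expert = [0.1, 0.2, 0.7]
    print(js_divergence(policy, expert))
    print(shaped_reward(0.0, policy, expert, uncertainty=2.0))
```

Even when the extrinsic reward is zero, the two added terms still vary from state to state, which gives a policy gradient update something to follow in a sparse-reward environment.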

    Main Results:

    • Demonstrated superior exploration efficiency across all tested tasks.
    • Achieved high average returns in environments with sparse extrinsic rewards.
    • Showcased the agent's ability to imitate expert behavior effectively.
    • Validated the algorithm's performance with limited expert trajectories.

    Conclusions:

    • The proposed algorithm effectively boosts exploration and intrinsic reward learning in RL.
    • Limited expert demonstrations can be leveraged for improved agent performance.
    • The method balances imitation of expert behavior with maintaining high returns.