


Updated: Jan 9, 2026


Enhancing Exploration in Actor-Critic Algorithms: An Approach to Incentivize Plausible Novel States.

Chayan Banerjee, Zhiyong Chen, Nasimul Noman

IEEE Transactions on Cybernetics

December 9, 2025

Summary

This summary is machine-generated.

This study improves exploration in actor-critic (AC) algorithms with plausible novelty, an intrinsic reward for exploring states that offer high potential learning benefit. The reward improves sample efficiency and training performance in deep reinforcement learning.


Area of Science:

Artificial Intelligence
Machine Learning
Deep Reinforcement Learning

Background:

Actor-critic (AC) algorithms are effective model-free deep reinforcement learning methods.
Efficient sample utilization for exploration and exploitation is crucial for AC algorithm success.
Current methods often fail to quantify the utility of novel states for policy learning, leading to inefficient exploration.

Purpose of the Study:

To introduce an intrinsic reward mechanism, termed plausible novelty, to enhance exploration in AC algorithms.
To improve sample efficiency and overall training performance by incentivizing the exploration of states with high potential learning benefits.

Main Methods:

Developed an intrinsic reward signal based on state novelty and its potential utility for policy learning.
Integrated the plausible novelty reward into off-policy actor-critic algorithms.
Evaluated the proposed method on benchmark deep reinforcement learning environments.
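
The general idea behind these steps can be sketched in a few lines. This summary does not give the paper's formula, so everything below is an assumption made for illustration: novelty is approximated by a visit count over discretized states, "potential utility" is approximated by squashing a critic value estimate into (0, 1), and the names `PlausibleNoveltyBonus`, `beta`, `bin_width`, and `augmented_reward` are hypothetical. It is a minimal sketch of a novelty bonus weighted by estimated usefulness, not the authors' method.

```python
import math
from collections import defaultdict


class PlausibleNoveltyBonus:
    """Illustrative intrinsic reward: novelty weighted by estimated usefulness.

    Hypothetical sketch only; not the formulation used in the paper.
    """

    def __init__(self, beta=0.1, bin_width=0.5):
        self.beta = beta            # scale of the intrinsic reward (assumed)
        self.bin_width = bin_width  # grid size used to count visits (assumed)
        self.counts = defaultdict(int)

    def _key(self, state):
        # Discretize a continuous state so that visits can be counted.
        return tuple(math.floor(x / self.bin_width) for x in state)

    def novelty(self, state):
        # Count-based novelty: 1 / sqrt(n + 1), largest for unvisited cells.
        return 1.0 / math.sqrt(self.counts[self._key(state)] + 1)

    def bonus(self, state, value_estimate):
        # Utility proxy (assumed): clamp the critic value, then apply a sigmoid.
        v = max(-50.0, min(50.0, value_estimate))
        plausibility = 1.0 / (1.0 + math.exp(-v))
        reward = self.beta * self.novelty(state) * plausibility
        self.counts[self._key(state)] += 1  # record the visit after scoring
        return reward


def augmented_reward(extrinsic, state, value_estimate, bonus_fn):
    # Reward an off-policy actor-critic learner would store in its replay buffer.
    return extrinsic + bonus_fn.bonus(state, value_estimate)
```

In an off-policy setting, a function like `augmented_reward` would be called once per transition before the transition is written to the replay buffer, so the actor and critic updates themselves stay unchanged.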

Main Results:

The proposed method demonstrated substantial improvements in sample efficiency.
Achieved an average of 19% improvement in training return across multiple environments and algorithms.
Reduced standard deviation by 30%, indicating more stable training.
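
As a reading aid, percentages of this kind are relative changes against a baseline. The sketch below shows the arithmetic on made-up numbers; the return lists are hypothetical placeholders, not data from the paper, and `relative_change` is a helper name introduced here.

```python
from statistics import mean, stdev


def relative_change(baseline, proposed):
    # Signed change of `proposed` relative to `baseline` (0.19 means +19%).
    return (proposed - baseline) / abs(baseline)


# Hypothetical final returns over four runs; NOT results from the paper.
baseline_returns = [100.0, 120.0, 80.0, 100.0]
proposed_returns = [118.0, 126.0, 112.0, 120.0]

# Improvement in mean training return.
return_gain = relative_change(mean(baseline_returns), mean(proposed_returns))

# Change in run-to-run spread; a negative value means more stable training.
std_change = relative_change(stdev(baseline_returns), stdev(proposed_returns))
```

How the paper aggregates across environments and algorithms is not stated in this summary, so the averaging above is only one plausible convention.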

Conclusions:

Plausible novelty effectively enhances exploration in actor-critic algorithms.
The approach leads to significant gains in sample efficiency and training performance.
This method offers a promising direction for advancing deep reinforcement learning techniques.