Updated: Jun 12, 2025

Kernel-Based Decentralized Policy Evaluation for Reinforcement Learning.

Jiamin Liu, Heng Lian

IEEE Transactions on Neural Networks and Learning Systems

September 17, 2024

Summary

This summary is machine-generated.

This study introduces a decentralized, nonparametric approach for policy evaluation in reinforcement learning (RL). It establishes statistical error bounds for value function estimation in collaborative multi-agent systems.

Area of Science:

Artificial Intelligence
Machine Learning
Optimization Theory

Background:

Decentralized learning is crucial for multi-agent reinforcement learning (RL).
Nonparametric methods offer flexibility but pose computational challenges.
Policy evaluation requires accurate state-value function estimation.
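For orientation, the state-value function that policy evaluation targets can be written in the standard textbook form below. The symbols are assumptions introduced here and do not appear in the summary: a discount factor \gamma, N agents, local rewards r_i, and the common collaborative convention that the team reward is the average of the local rewards.

```latex
\[
V^{\pi}(s) = \mathbb{E}\left[\sum_{t=0}^{\infty} \gamma^{t}\, \bar{r}(s_t, a_t) \,\middle|\, s_0 = s,\ a_t \sim \pi(\cdot \mid s_t)\right],
\qquad
\bar{r}(s, a) = \frac{1}{N} \sum_{i=1}^{N} r_i(s, a)
\]
\[
V^{\pi}(s) = \mathbb{E}\left[\bar{r}(s, a) + \gamma\, V^{\pi}(s') \,\middle|\, s\right]
\]
```

The second line is the Bellman equation. Regression-based and temporal difference methods both estimate V^{\pi} by approximately solving it from sampled transitions.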

Purpose of the Study:

To develop a decentralized nonparametric method for policy evaluation in RL.
To analyze the statistical convergence properties of the proposed method.
To address computational and communication feasibility in multi-agent settings.

Main Methods:

Uses a regression-based multistage iteration technique.
Performs infinite-dimensional gradient descent (GD) in a reproducing kernel Hilbert space (RKHS).
Applies the Nyström approximation as a finite-dimensional projection to make the method computationally feasible.
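The three ingredients above can be combined in a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: the Gaussian kernel, the landmark points, the mixing matrix, the step size, and the function names (`nystrom_features`, `decentralized_policy_evaluation`, `predict`) are all hypothetical choices, and the "local step, then neighbour averaging" update is a generic decentralized gradient scheme swapped in for whatever communication rule the paper uses.

```python
import numpy as np


def rbf_kernel(X, Z, bandwidth=0.5):
    """Gaussian kernel matrix between the rows of X and the rows of Z."""
    sq_dists = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def nystrom_features(X, landmarks, bandwidth=0.5, eps=1e-8):
    """Finite-dimensional features phi(x) = K_mm^{-1/2} k_m(x) (Nystrom projection)."""
    K_mm = rbf_kernel(landmarks, landmarks, bandwidth)
    evals, evecs = np.linalg.eigh(K_mm)
    # Symmetric inverse square root; tiny eigenvalues are clipped for stability.
    inv_sqrt = (evecs / np.sqrt(np.maximum(evals, eps))) @ evecs.T
    return rbf_kernel(X, landmarks, bandwidth) @ inv_sqrt


def decentralized_policy_evaluation(agent_data, mixing, landmarks, gamma=0.9,
                                    n_stages=30, n_gd_steps=50, step_size=1.0):
    """agent_data holds one (states, local_rewards, next_states) tuple per agent.

    Returns an array with one coefficient row per agent.
    """
    phis = [nystrom_features(s, landmarks) for s, _, _ in agent_data]
    phis_next = [nystrom_features(s2, landmarks) for _, _, s2 in agent_data]
    rewards = [r for _, r, _ in agent_data]
    W = np.zeros((len(agent_data), landmarks.shape[0]))
    for _ in range(n_stages):
        # Stage targets r_i + gamma * V_prev(s') stay fixed during the inner GD loop.
        targets = [r + gamma * (pn @ w) for r, pn, w in zip(rewards, phis_next, W)]
        for _ in range(n_gd_steps):
            # Least-squares gradient of each agent on its own local data.
            grads = np.stack([p.T @ (p @ w - y) / len(y)
                              for p, w, y in zip(phis, W, targets)])
            # Local gradient step, then weighted averaging with neighbours.
            W = mixing @ (W - step_size * grads)
    return W


def predict(weights, states, landmarks):
    """Evaluate the estimated value function of one agent at the given states."""
    return nystrom_features(states, landmarks) @ weights
```

Each outer stage is an ordinary regression problem with frozen targets, which is what makes the scheme regression-based rather than a TD update. With a sparse mixing matrix and a constant step size, the agents in this sketch agree only approximately; a fully averaging matrix makes their coefficient rows identical.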

Main Results:

Establishes the first statistical error bounds for value function estimation in a fully decentralized nonparametric framework.
Demonstrates convergence of the proposed method.
Compares the regression-based approach with the kernel temporal difference (TD) method in numerical studies.
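For contrast with the regression-based approach, a kernel TD baseline can be sketched as TD(0) on finite-dimensional kernel features. This is a single-agent illustration under assumptions, not the baseline used in the paper's numerical studies: the Gaussian kernel, the Nyström feature map, the step size, and the names `nystrom_features` and `kernel_td0` are hypothetical.

```python
import numpy as np


def nystrom_features(X, landmarks, bandwidth=0.5, eps=1e-8):
    """Gaussian-kernel Nystrom features phi(x) = K_mm^{-1/2} k_m(x)."""
    def kern(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-sq / (2.0 * bandwidth ** 2))
    evals, evecs = np.linalg.eigh(kern(landmarks, landmarks))
    inv_sqrt = (evecs / np.sqrt(np.maximum(evals, eps))) @ evecs.T
    return kern(X, landmarks) @ inv_sqrt


def kernel_td0(states, rewards, next_states, landmarks, gamma=0.9,
               step_size=0.5, n_epochs=200, seed=0):
    """TD(0) on Nystrom features: one stochastic update per observed transition."""
    rng = np.random.default_rng(seed)
    phi = nystrom_features(states, landmarks)
    phi_next = nystrom_features(next_states, landmarks)
    w = np.zeros(landmarks.shape[0])
    for _ in range(n_epochs):
        # Visit the stored transitions in a fresh random order each epoch.
        for i in rng.permutation(len(rewards)):
            # The TD error bootstraps from the current estimate at the next state.
            td_error = rewards[i] + gamma * phi_next[i] @ w - phi[i] @ w
            w = w + step_size * td_error * phi[i]
    return w
```

The difference from a regression-based scheme is visible in the update: TD moves the target with every step because the bootstrap term uses the current coefficients, whereas a multistage regression freezes the targets for a whole stage.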

Conclusions:

The proposed method provides a statistically sound and computationally feasible solution for decentralized nonparametric policy evaluation.
The established error bounds offer theoretical guarantees for the convergence of value function estimation.
This work advances the understanding and application of RL in complex multi-agent systems.