Updated: Jun 4, 2025

A fully value distributional deep reinforcement learning framework for multi-agent cooperation.

Mingsheng Fu¹, Liwei Huang¹, Fan Li²

¹School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, Sichuan, China.

Neural Networks: the Official Journal of the International Neural Network Society

December 18, 2024

Summary

This summary is machine-generated.

This study introduces a new framework for fully distributional multi-agent reinforcement learning (RL) that guarantees the individual-global-max principle. The proposed Fully Distributional Multi-Agent Cooperation (FDMAC) model significantly improves performance in complex cooperative tasks.

Keywords:

Deep reinforcement learning; Distributional reinforcement learning; Multi-agent cooperation; Neural networks

Area of Science:

Artificial Intelligence
Machine Learning
Multi-Agent Systems

Background:

Distributional reinforcement learning (RL) models the full distribution of returns rather than only its expected value, so it retains information, such as the spread of outcomes, that an expectation discards.
Existing distributional multi-agent systems struggle to satisfy the individual-global-max (IGM) principle when using traditional value-decomposition.
A fully distributional multi-agent system requires both individual and global value functions to be in distributional forms.
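To make the contrast between a return distribution and its expected value concrete, here is a minimal sketch. It assumes the distribution is represented by a handful of equally weighted quantile samples, which is one common choice and not necessarily the representation used in FDMAC. The function names and the numbers are hypothetical.

```python
from statistics import mean


def expected_return(quantiles):
    """Mean of equally weighted quantile samples, an estimate of the expected return."""
    return mean(quantiles)


def lower_tail_mean(quantiles, fraction=0.25):
    """Average of the worst `fraction` of the quantile samples (a simple risk measure)."""
    ordered = sorted(quantiles)
    k = max(1, int(len(ordered) * fraction))
    return mean(ordered[:k])


# Hypothetical quantile estimates of the return for two actions (illustrative only).
safe_action = [4.0, 5.0, 5.0, 6.0]
risky_action = [-10.0, 0.0, 10.0, 20.0]

# Both actions have the same expected return, so a purely
# expectation-based value function cannot tell them apart.
print(expected_return(safe_action), expected_return(risky_action))

# The distributional view still separates them through the lower tail.
print(lower_tail_mean(safe_action), lower_tail_mean(risky_action))
```

The two actions are indistinguishable by expected value but differ sharply in their worst-case outcomes, which is the kind of information a distributional value function keeps.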

Purpose of the Study:

To propose a novel fully value distributional multi-agent framework that guarantees the IGM principle.
To introduce a practical deep reinforcement learning model, Fully Distributional Multi-Agent Cooperation (FDMAC), based on this framework.
To validate the effectiveness of FDMAC in complex multi-agent cooperative scenarios.

Main Methods:

Developed a new value-decomposition framework for fully distributional multi-agent systems.
Proved that the proposed framework ensures the satisfaction of the IGM principle.
Implemented the Fully Distributional Multi-Agent Cooperation (FDMAC) deep reinforcement learning model.
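The IGM principle requires that the joint action maximizing the global value coincides with the tuple of actions each agent would pick greedily from its own value. The sketch below checks that condition by brute force for scalar (expected) values. It is an illustration of the principle only, not the FDMAC decomposition: the additive mixing is a simple stand-in, all names and numbers are hypothetical, and unique maxima are assumed. In a distributional setting the same comparison would typically be made on a statistic of each distribution, such as its mean.

```python
from itertools import product


def greedy_individual(per_agent_q):
    """Each agent independently picks the action with its own highest value."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def greedy_joint(joint_q, per_agent_q):
    """Exhaustively search all joint actions for the highest global value."""
    action_spaces = [range(len(q)) for q in per_agent_q]
    return max(product(*action_spaces), key=joint_q)


def satisfies_igm(joint_q, per_agent_q):
    """True when the global greedy joint action equals the individual greedy actions."""
    return greedy_joint(joint_q, per_agent_q) == greedy_individual(per_agent_q)


# Hypothetical individual action values for two agents (illustrative only).
per_agent_q = [[1.0, 3.0], [2.0, 0.5, 1.0]]


def additive_joint_q(joint_action):
    """Global value as the sum of individual values (a simple stand-in mixer)."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))


def mismatched_joint_q(joint_action):
    """Global value that rewards a joint action no agent prefers individually."""
    return 10.0 if joint_action == (0, 1) else 0.0


print(satisfies_igm(additive_joint_q, per_agent_q))
print(satisfies_igm(mismatched_joint_q, per_agent_q))
```

The additive case passes because maximizing a sum of independent terms is the same as maximizing each term; the mismatched case fails because decentralized greedy choices would miss the globally best joint action.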

Main Results:

The proposed framework guarantees the IGM principle in fully distributional multi-agent systems.
The FDMAC model outperformed the baseline methods on the StarCraft Multi-Agent Challenge (SMAC).
FDMAC achieved an average improvement of 10.47% in median test win rate over the best baseline.
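One plausible way to compute a figure like "average improvement in median test win rate over the best baseline" is sketched below. The summary does not state the exact procedure, nor whether the improvement is absolute (percentage points) or relative, so this sketch assumes an absolute difference averaged over scenarios. The scenario names, baseline names, and win rates are made up for illustration and are not results from the paper.

```python
from statistics import mean, median


def average_gain_over_best_baseline(method_runs, baseline_runs):
    """Average, over scenarios, of (method median - best baseline median)."""
    gains = []
    for scenario, win_rates in method_runs.items():
        method_median = median(win_rates)
        best_baseline = max(
            median(runs[scenario]) for runs in baseline_runs.values()
        )
        gains.append(method_median - best_baseline)
    return mean(gains)


# Made-up test win rates (%) across three seeds per scenario.
method_runs = {"map_a": [80.0, 90.0, 85.0], "map_b": [50.0, 60.0, 70.0]}
baseline_runs = {
    "baseline_x": {"map_a": [70.0, 75.0, 80.0], "map_b": [55.0, 50.0, 45.0]},
    "baseline_y": {"map_a": [60.0, 65.0, 70.0], "map_b": [40.0, 58.0, 52.0]},
}

print(average_gain_over_best_baseline(method_runs, baseline_runs))
```

Using the median across seeds makes the per-scenario score robust to a single unusually good or bad run, and comparing against the best baseline per scenario is a stricter test than comparing against any single fixed baseline.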

Conclusions:

The proposed framework addresses a key limitation of existing distributional multi-agent RL: satisfying the IGM principle when both individual and global value functions are distributional.
FDMAC advances cooperative multi-agent reinforcement learning, outperforming the best baseline on the StarCraft Multi-Agent Challenge.
The results highlight the benefits of fully distributional value functions in complex cooperative tasks.