Graph based multi-agent reinforcement learning with evolutionary population for cooperation.

Kexing Peng¹, Hanwen Qi¹, Tinghuai Ma²

¹School of Computer Science, Nanjing University of Information Science & Technology, Nanjing, 210044, China.

Neural Networks : the Official Journal of the International Neural Network Society

December 12, 2025

Summary

This summary is machine-generated.

This study introduces GDE, a novel Multi-Agent Reinforcement Learning (MARL) framework that enhances coordination in complex tasks. GDE combines Graph-based value Decomposition with staged Evolutionary policy optimization for improved agent performance.

Keywords:

Evolutionary algorithms; Graph neural network; Multi-agent reinforcement learning

Area of Science:

Artificial Intelligence
Robotics
Computer Science

Background:

Existing Multi-Agent Reinforcement Learning (MARL) methods face challenges in scaling to complex coordination tasks due to limited agent observations and dynamic interactions.
Convergence to optimal policies becomes harder as task complexity and the policy space grow, which makes stable policy evaluation difficult.

Purpose of the Study:

To propose GDE, a MARL framework designed to overcome scalability and convergence issues in cooperative multi-agent systems.
To enhance agent coordination and information propagation in dynamic environments without requiring state consensus.

Main Methods:

GDE integrates Graph-based value Decomposition with staged Evolutionary policy optimization.
Evolutionary Algorithms (EAs) are utilized for gradient-free random search to improve policy exploration and convergence.
Graph Neural Networks (GNNs) are employed to extend agent receptive fields and facilitate information propagation, leveraging permutation invariance for stable convergence with dynamic data.
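
The two ingredients named above can be illustrated with a toy sketch. This summary does not specify GDE's actual network architecture or evolutionary operators, so everything below is an assumption made for illustration: `mean_aggregate` stands in for one GNN-style neighbor aggregation step (mean pooling is one common permutation-invariant choice), and `evolve` is a plain elitist evolutionary search with Gaussian mutation on a hypothetical `toy_fitness` function. It is not the authors' method.

```python
import random


def mean_aggregate(own, neighbors):
    """Average an agent's feature vector with its neighbors' vectors.

    Mean pooling does not depend on the order of the neighbors, which is
    the permutation-invariance property mentioned in the text.
    (Illustrative stand-in for a GNN layer, not GDE's architecture.)
    """
    vectors = [own] + list(neighbors)
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(len(own))]


def toy_fitness(params):
    """Hypothetical objective: higher is better, maximum at (0.5, -0.2)."""
    target = (0.5, -0.2)
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve(fitness, dim, pop_size=20, elite=5, sigma=0.3, generations=60, seed=0):
    """Gradient-free search: keep the fittest candidates and mutate them.

    No gradients of `fitness` are used; candidates are only ranked by score.
    """
    rng = random.Random(seed)
    population = [
        [rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Rank candidates and keep the top `elite` unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]
        # Refill the population with Gaussian-perturbed copies of parents.
        children = [
            [w + rng.gauss(0.0, sigma) for w in rng.choice(parents)]
            for _ in range(pop_size - elite)
        ]
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    own = [1.0, 0.0]
    neighbors = [[0.0, 2.0], [2.0, 4.0]]
    print(mean_aggregate(own, neighbors))        # [1.0, 2.0]
    print(mean_aggregate(own, neighbors[::-1]))  # same result, order ignored
    best = evolve(toy_fitness, dim=2)
    print(best, toy_fitness(best))
```

Because the elite candidates are carried over unchanged, the best fitness in the population never decreases from one generation to the next. In a MARL setting the candidate vectors would be policy parameters and the fitness would be an episode return, but that mapping is also an assumption here.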

Main Results:

GDE demonstrates superior performance in complex coordination tasks, including StarCraft II micro-management, MAMuJoCo robot cooperation, and SUMO autonomous driving.
The framework effectively captures complex coordination dynamics through multi-agent team formation and GNNs.
Experimental results validate the effectiveness and necessity of each module within the GDE framework.

Conclusions:

GDE offers a robust solution for enhancing coordination and policy convergence in MARL.
The proposed combination of graph-based decomposition and evolutionary optimization is effective for complex multi-agent systems.
The framework's modular design and adaptability make it suitable for diverse real-world applications.