A Policy Gradient Algorithm to Alleviate the Multi-Agent Value Overestimation Problem in Complex Environments.

Yang Yang¹,², Jiang Li¹,², Jinyong Hou³

¹Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China.

Sensors (Basel, Switzerland)

December 9, 2023

Summary

This summary is machine-generated.

We introduce the empirical clustering layer-based multi-agent dual dueling policy gradient (ECL-MAD3PG) algorithm to improve multi-agent reinforcement learning. The approach improves reliability and stability, raising mission completion by 9.1% over MADDPG in UAV combat simulations.

Keywords:

deep deterministic policy gradient; group decision-making; overestimation of value function; playback of experience

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Multi-agent reinforcement learning (MARL) is crucial for group decision-making in complex, high-dimensional environments.
Existing deep policy gradient methods face challenges with reliability, stability, and convergence due to estimation errors and degraded experience quality.
These limitations hinder performance in demanding applications like autonomous systems.
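The estimation errors mentioned above are what produce the value overestimation named in the title: taking a maximum over noisy value estimates is biased upward even when each estimate is unbiased. The following is a minimal sketch of that effect only, not the authors' experiment. The function name `max_estimate_bias`, the Gaussian noise model, and all numbers are assumptions chosen for illustration.

```python
import random


def max_estimate_bias(true_values, noise_std, trials, seed=0):
    """Average amount by which max(noisy estimates) exceeds max(true values).

    Hypothetical helper for illustration: each true action value is
    perturbed with zero-mean Gaussian noise, standing in for critic
    estimation error.
    """
    rng = random.Random(seed)
    true_max = max(true_values)
    total = 0.0
    for _ in range(trials):
        noisy = [q + rng.gauss(0.0, noise_std) for q in true_values]
        total += max(noisy)
    return total / trials - true_max


# Five actions with identical true value 0.0: every individual estimate is
# unbiased, yet the maximum over the estimates is systematically positive.
bias = max_estimate_bias([0.0] * 5, noise_std=1.0, trials=10_000)
print(f"average overestimation: {bias:.3f}")
```

With zero noise the bias vanishes, which is why reducing estimation error or avoiding a plain maximum in the bootstrap target are the usual remedies.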

Purpose of the Study:

To develop a novel MARL algorithm addressing the limitations of current deep policy gradient methods.
To enhance the reliability, stability, and convergence of decision-making algorithms in complex state-action spaces.
To improve the efficiency of experience sampling and overall algorithm performance.

Main Methods:

Proposing the empirical clustering layer-based multi-agent dual dueling policy gradient (ECL-MAD3PG) algorithm.
Integrating an empirical clustering layer to refine experience quality and sampling efficiency.
Utilizing a dual dueling architecture to improve value estimation accuracy.
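This page does not spell out the dual dueling architecture, so the sketch below shows two standard building blocks that such a design plausibly draws on, under that stated assumption: the dueling decomposition of a value estimate into a state value plus mean-centered advantages, and a bootstrap target that takes the minimum of two critics to damp overestimation. Both function names and all numbers are hypothetical, and the discrete-action form is used only to keep the example small; it is not the ECL-MAD3PG implementation.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q = V + (A - mean(A)).

    Centering the advantages makes the split between V and A identifiable.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def clipped_double_target(reward, gamma, next_q_a, next_q_b, done):
    """TD target that bootstraps from the smaller of two critic estimates.

    Taking the minimum counteracts the upward bias of a single noisy critic.
    """
    bootstrap = 0.0 if done else gamma * min(next_q_a, next_q_b)
    return reward + bootstrap


# Illustrative numbers only.
q = dueling_q_values(state_value=2.0, advantages=[1.0, 2.0, 3.0])
target = clipped_double_target(reward=1.0, gamma=0.9,
                               next_q_a=10.0, next_q_b=8.0, done=False)
print(q, target)
```

The second critic costs extra computation per update, which is the usual trade-off accepted for a less optimistic target.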

Main Results:

The ECL-MAD3PG algorithm demonstrated superior performance across various complex environments.
It improved mission completion by 9.1% compared to the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm.
It showed greater reliability and stability in challenging scenarios, particularly UAV cooperative combat.

Conclusions:

ECL-MAD3PG effectively overcomes the convergence and stability issues of traditional MARL algorithms.
The proposed algorithm offers a robust solution for complex, high-dimensional decision-making problems.
ECL-MAD3PG shows significant promise for applications requiring reliable and adaptive multi-agent coordination.