Enhancing Value Decomposition With Target Transformation in Cooperative Multi-Agent Reinforcement Learning.

Zeyang Liu, Lipeng Wan, Shiguang Sun

    IEEE Transactions on Pattern Analysis and Machine Intelligence
    |April 13, 2026
    Summary
    This summary is machine-generated.

    This study introduces Uncertainty-aware Target Transformation (UT2) to improve cooperative multi-agent reinforcement learning (MARL). UT2 enhances performance and stability in MARL by addressing challenges with non-monotonic and stochastic learning targets.

    Last Updated: Apr 15, 2026

    Area of Science:

    • Artificial Intelligence
    • Machine Learning
    • Robotics

    Background:

    • Cooperative multi-agent reinforcement learning (MARL) is crucial for coordinating intelligent machines.
    • Dominant MARL methods using monotonic value decomposition offer scalability but limit representable joint action-values.
    • Existing solutions for non-monotonic targets can become unstable when returns are stochastic and may end up favoring suboptimal trajectories.
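
The representational limit described above can be seen on a small matrix game. The sketch below is illustrative only: the payoff table is hypothetical, and an additive (sum of per-agent utilities) factorization stands in for monotonic value decomposition in general, since a plain sum is one simple monotonic mixing. It fits the additive form by least squares and compares the resulting decentralized greedy choice with the true best joint action.

```python
from itertools import product

# Hypothetical 2-agent, 3-action payoff table (assumption, not from the paper).
# Rows index agent 1's action, columns index agent 2's action.
PAYOFF = [
    [8.0, -12.0, -12.0],
    [-12.0, 0.0, 0.0],
    [-12.0, 0.0, 0.0],
]


def additive_fit(table):
    """Least-squares fit of table[i][j] ~ q1[i] + q2[j] on a full table.

    For a complete table the optimum is row mean + column mean - grand mean;
    the grand mean is split evenly between the two agents here.
    """
    n_rows, n_cols = len(table), len(table[0])
    row_mean = [sum(row) / n_cols for row in table]
    col_mean = [sum(table[i][j] for i in range(n_rows)) / n_rows
                for j in range(n_cols)]
    grand = sum(row_mean) / n_rows
    q1 = [m - grand / 2 for m in row_mean]
    q2 = [m - grand / 2 for m in col_mean]
    return q1, q2


def argmax(values):
    """Index of the first maximal entry."""
    return max(range(len(values)), key=values.__getitem__)


def best_joint(table):
    """Joint action with the highest true payoff."""
    pairs = product(range(len(table)), range(len(table[0])))
    return max(pairs, key=lambda ij: table[ij[0]][ij[1]])


q1, q2 = additive_fit(PAYOFF)
decentralized = (argmax(q1), argmax(q2))  # each agent maximizes its own utility
optimal = best_joint(PAYOFF)

print("decentralized greedy:", decentralized, "payoff", PAYOFF[decentralized[0]][decentralized[1]])
print("true optimum:        ", optimal, "payoff", PAYOFF[optimal[0]][optimal[1]])
```

In this table the penalty for miscoordination drags down the fitted utility of the action that belongs to the best joint choice, so the decentralized greedy policy settles on a safer but lower-paying joint action.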

    Purpose of the Study:

    • To develop a novel approach for cooperative MARL that overcomes limitations of monotonic value decomposition.
    • To address the fragility of learning targets in stochastic environments.
    • To improve the performance and stability of MARL agents.

    Main Methods:

    • Proposed Target Transformation to map non-monotonic and stochastic learning targets to a monotonic-representable surrogate.
    • Developed Uncertainty-aware Target Transformation (UT2) with value-based and policy-based instantiations.
    • Integrated an uncertainty estimator with a best-individual coordination envelope.
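
The summary does not state the form of the transformation, so the following is only a toy illustration of the property it describes: a surrogate target that a monotonic decomposition can represent and that keeps the optimal joint action. It is not the UT2 method and ignores stochasticity and uncertainty estimation entirely. The payoff table, the function `toy_surrogate`, and the `penalty` parameter are all assumptions made for this sketch.

```python
# Hypothetical non-monotonic payoff table (assumption, not from the paper).
PAYOFF = [
    [8.0, -12.0, -12.0],
    [-12.0, 0.0, 0.0],
    [-12.0, 0.0, 0.0],
]


def toy_surrogate(table, penalty=1.0):
    """Build per-agent utilities whose sum is a surrogate for `table`.

    The surrogate equals the best payoff at the best joint action and
    drops by `penalty` for every agent that deviates from it. A sum of
    per-agent utilities is a monotonic mixing, so the surrogate is
    representable by construction.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    n_rows, n_cols = len(table), len(table[0])
    best_i, best_j = max(
        ((i, j) for i in range(n_rows) for j in range(n_cols)),
        key=lambda ij: table[ij[0]][ij[1]],
    )
    best_value = table[best_i][best_j]
    u1 = [best_value / 2 - (0.0 if i == best_i else penalty) for i in range(n_rows)]
    u2 = [best_value / 2 - (0.0 if j == best_j else penalty) for j in range(n_cols)]
    return u1, u2


def argmax(values):
    """Index of the first maximal entry."""
    return max(range(len(values)), key=values.__getitem__)


u1, u2 = toy_surrogate(PAYOFF)
greedy = (argmax(u1), argmax(u2))  # decentralized greedy on the surrogate
surrogate_at_greedy = u1[greedy[0]] + u2[greedy[1]]

print("greedy joint action on surrogate:", greedy)
print("surrogate value there:", surrogate_at_greedy)
```

This toy surrogate discards all information about non-optimal actions and needs the full payoff table up front, which a learning agent does not have; it only shows that a representable target with the same optimal joint action can exist.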

    Main Results:

    • UT2 demonstrated improved performance and stability across diverse cooperative MARL benchmarks.
    • The method showed significant gains with increasing non-monotonicity and stochasticity in returns.
    • UT2 preserved the optimal joint action while handling non-monotonic and stochastic learning targets.

    Conclusions:

    • UT2 offers a robust solution for cooperative MARL, particularly in environments with non-monotonic and stochastic reward structures.
    • The approach enhances the reliability and effectiveness of multi-agent coordination.
    • This work advances the field of MARL by providing a more stable and performant learning framework.