TIMAR: Transition-informed representation for sample-efficient multi-agent reinforcement learning.

Mingxiao Feng¹, Yaodong Yang², Wengang Zhou¹

¹CAS Key Laboratory of GIPAS, University of Science and Technology of China, Hefei, China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Hefei, China.

Neural Networks : the Official Journal of the International Neural Network Society

January 7, 2025

Summary

This summary is machine-generated.

Enhancing data efficiency in Multi-Agent Reinforcement Learning (MARL) is crucial. The novel Transition-Informed Multi-Agent Representations (TIMAR) framework uses a world model to improve agent coordination and learning efficiency.

Keywords:

Multi-agent reinforcement learning Representation learning Self-supervised learning Transformers

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Multi-Agent Reinforcement Learning (MARL) is expensive to train because it requires a very large number of environment interactions.
Under partial observability, each agent sees the environment only from its own egocentric perspective, which makes it hard to model interactions and coordination among agents and reduces data efficiency.

Purpose of the Study:

To develop a world-model-driven paradigm that enhances data efficiency in MARL by enabling holistic environmental representations.
To introduce the Transition-Informed Multi-Agent Representations (TIMAR) framework for improved agent learning and coordination.

Main Methods:

Leveraging a joint transition model (surrogate world model) to capture multi-agent system dynamics.
Employing a self-supervised learning objective that encourages consistency between predicted and actual future observations.
Incorporating an auxiliary module for predicting future transitions to infer latent states and agent influences.
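
The consistency objective described above can be illustrated with a small sketch. This is not the authors' implementation: the summary does not specify the loss or the architecture, so the cosine-based loss, the linear stand-in for the joint transition model, and the names `consistency_loss` and `predict_next` are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def consistency_loss(predicted, actual):
    """Assumed loss: mean over agents of (1 - cosine similarity) between
    predicted and actual next-step representations."""
    if len(predicted) != len(actual):
        raise ValueError("need one predicted vector per actual vector")
    per_agent = [1.0 - cosine_similarity(p, a) for p, a in zip(predicted, actual)]
    return sum(per_agent) / len(per_agent)


def predict_next(latents, actions, weights):
    """Hypothetical linear joint transition model.

    Every agent's prediction is computed from the concatenation of ALL agents'
    latents and actions, so it can account for the other agents' influence.
    weights[i] is a matrix (list of rows) mapping the joint input to agent i's
    predicted next latent.
    """
    joint = [x for z in latents for x in z] + [float(a) for a in actions]
    return [
        [sum(w * x for w, x in zip(row, joint)) for row in agent_weights]
        for agent_weights in weights
    ]


# Two agents, 2-dimensional latents, one scalar action each (joint input size 6).
latents = [[1.0, 0.0], [0.0, 1.0]]
actions = [0, 1]
weights = [
    [[1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0]],  # agent 0 copies its own latent
    [[0, 0, 1, 0, 0, 0], [0, 0, 0, 1, 0, 0]],  # agent 1 copies its own latent
]
predicted = predict_next(latents, actions, weights)
actual_next = [[1.0, 0.0], [0.0, 1.0]]
loss = consistency_loss(predicted, actual_next)
```

In a real system the linear map would be replaced by a learned model (the keywords suggest a Transformer) and the loss would be minimized by gradient descent. The sketch only shows the shape of the objective: predict every agent's next representation from the joint state and penalize disagreement with what was actually observed.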

Main Results:

TIMAR significantly improves performance and data efficiency in various MARL environments compared to strong baselines (MAPPO, HAPPO, QMIX, MAT, MA2CL).
The framework enables learning semantic representations from high-dimensional observations, boosting the data efficiency of downstream MARL algorithms.
TIMAR enhances the generalization capabilities of Transformer-based MARL algorithms like MAT.

Conclusions:

The TIMAR framework offers a novel approach to address data efficiency limitations in MARL.
By learning effective representations through a world model, TIMAR facilitates better agent interaction and coordination.
This research paves the way for more efficient and capable MARL systems.