Updated: Jun 17, 2025

Masked and Inverse Dynamics Modeling for Data-Efficient Reinforcement Learning.

Young Jae Lee, Jaehoon Kim, Young Joon Park

IEEE Transactions on Neural Networks and Learning Systems

August 14, 2024

Summary

This summary is machine-generated.

Masked and Inverse Dynamics Modeling (MIND) improves data efficiency in deep reinforcement learning by learning agent-controllable representations of changing states. This self-supervised approach improves performance in control environments where the agent has only limited interactions.

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Deep reinforcement learning (DRL) faces challenges in data efficiency, particularly in learning state representations that evolve due to agent interaction.
Existing methods integrating self-supervised learning (SSL) and data augmentation struggle to explicitly capture these changing state dynamics or select appropriate augmentations.

Purpose of the Study:

To explicitly learn the inherent dynamics of changing states influenced by agent actions and environmental interactions.
To improve data efficiency in pixel-based DRL by learning robust and agent-controllable state representations.

Main Methods:

The authors propose Masked and Inverse Dynamics Modeling (MIND), a self-supervised multitask learning framework built on a transformer architecture.
MIND employs masked modeling for static visual representations and inverse dynamics modeling for evolving state representations, utilizing masking augmentation.
The method requires fewer hyperparameters and captures spatiotemporal information from consecutive frames.
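
The data preparation behind the two tasks above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `mask_frames` and `inverse_dynamics_pairs`, the patch size, the masking ratio, and the choice to share one mask across all frames are assumptions made only for illustration, and the transformer itself is omitted. The sketch uses the general definition of inverse dynamics modeling, in which the action taken between two consecutive observations is the prediction target.

```python
import random


def mask_frames(frames, patch, ratio, rng):
    """Zero out a random subset of patch x patch blocks in every frame.

    Hypothetical helper: the same blocks are masked in all frames, which is
    an assumption of this sketch, not a detail taken from the summary.
    `frames` is a list of 2D lists (height x width) of floats.
    """
    height, width = len(frames[0]), len(frames[0][0])
    blocks = [(r, c) for r in range(0, height, patch)
              for c in range(0, width, patch)]
    n_masked = int(len(blocks) * ratio)
    masked_blocks = set(rng.sample(blocks, n_masked))

    masked = []
    for frame in frames:
        copy = [row[:] for row in frame]  # leave the input frames untouched
        for r, c in masked_blocks:
            for i in range(r, min(r + patch, height)):
                for j in range(c, min(c + patch, width)):
                    copy[i][j] = 0.0
        masked.append(copy)
    return masked, masked_blocks


def inverse_dynamics_pairs(frames, actions):
    """Build ((frame_t, frame_t+1), action_t) training examples.

    An inverse dynamics model receives two consecutive observations and is
    trained to predict the action that led from the first to the second.
    """
    if len(actions) != len(frames) - 1:
        raise ValueError("need exactly one action per consecutive frame pair")
    return [((frames[t], frames[t + 1]), actions[t])
            for t in range(len(actions))]


if __name__ == "__main__":
    rng = random.Random(0)
    # Three consecutive 4x4 frames with constant intensity (toy data).
    frames = [[[1.0] * 4 for _ in range(4)] for _ in range(3)]
    masked, blocks = mask_frames(frames, patch=2, ratio=0.5, rng=rng)
    pairs = inverse_dynamics_pairs(masked, actions=[0, 1])
    print(len(blocks), len(pairs))
```

In this sketch the masked frames would feed a reconstruction objective (masked modeling), while the frame pairs and action labels would feed an action-prediction objective (inverse dynamics modeling); how MIND weights or combines the two objectives is not stated in the summary.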

Main Results:

MIND demonstrated superior performance across discrete and continuous control benchmarks with limited interactions.
The approach significantly improved data efficiency compared to previous methods.
MIND successfully learned agent-controllable representations in dynamic environments.

Conclusions:

MIND effectively learns evolving state representations by combining masked and inverse dynamics modeling.
The proposed method offers a more data-efficient and robust approach to DRL in complex environments.
The framework provides a promising direction for advancing DRL research and applications.