MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning.

Mao Hong¹, Zhiyue Zhang¹, Yue Wu¹

¹Department of Applied Mathematics and Statistics, Johns Hopkins University.

Transactions on Machine Learning Research

December 29, 2025

Summary

This summary is machine-generated.

Model-based offline reinforcement learning (RL) methods are sample-efficient and generalize well, but existing approaches are often confined to restricted policy classes. MoMA is a model-based mirror ascent algorithm for offline RL that uses general function approximation and an unrestricted policy class.

Area of Science:

Artificial Intelligence
Machine Learning
Control Theory

Background:

Model-based offline reinforcement learning (RL) offers sample efficiency and generalizability.
Existing methods often use restricted policy spaces, limiting their potential.
There's a need for practical, model-based offline RL with unrestricted policies.

Purpose of the Study:

To develop a model-based offline RL algorithm, MoMA, that utilizes general function approximations and an unrestricted policy class.
To address the limitations of existing approaches in leveraging the full advantages of model-based methods.
To provide theoretical guarantees and a practical implementation for the proposed algorithm.

Main Methods:

Developed MoMA, a model-based mirror ascent algorithm for offline RL.
Employed general function approximations for policy updates, moving beyond restricted parametric classes.
Incorporated conservative value function estimation within a confidence set of transition models.
Established theoretical guarantees on the suboptimality bound of the returned policy.
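
The two ingredients named above, conservative value estimation over a set of transition models and a mirror ascent policy update, can be illustrated with a minimal tabular sketch. This is not the paper's implementation: the finite list `models` stands in for the confidence set, the KL-based exponentiated update is one common form of mirror ascent, and all function names, the toy rewards, and the hyperparameters (`gamma`, `eta`, `steps`) are hypothetical choices made for illustration.

```python
import numpy as np

def q_values(P, R, pi, gamma=0.9, iters=500):
    """Iteratively evaluate Q^pi for transitions P (S x A x S) and rewards R (S x A)."""
    S, A = R.shape
    Q = np.zeros((S, A))
    for _ in range(iters):
        V = (pi * Q).sum(axis=1)      # V(s) = sum_a pi(a|s) Q(s, a)
        Q = R + gamma * (P @ V)       # Bellman expectation backup
    return Q

def conservative_q(models, R, pi, mu0, gamma=0.9):
    """Return Q^pi under the candidate model that gives pi the lowest value.

    `models` is a finite stand-in (an assumption) for a confidence set.
    """
    best_q, best_val = None, np.inf
    for P in models:
        Q = q_values(P, R, pi, gamma)
        val = mu0 @ (pi * Q).sum(axis=1)   # value of pi from initial distribution mu0
        if val < best_val:
            best_q, best_val = Q, val
    return best_q

def mirror_ascent_step(pi, Q, eta=1.0):
    """KL mirror ascent: new pi(a|s) is proportional to pi(a|s) * exp(eta * Q(s, a))."""
    logits = np.log(pi) + eta * Q
    logits -= logits.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum(axis=1, keepdims=True)

def run_demo(steps=20, seed=0):
    """Toy problem: 2 states, 2 actions; action 0 always pays 1, action 1 pays 0."""
    rng = np.random.default_rng(seed)
    S, A = 2, 2
    R = np.tile(np.array([1.0, 0.0]), (S, 1))
    # Three random transition models play the role of the confidence set.
    models = [rng.dirichlet(np.ones(S), size=(S, A)) for _ in range(3)]
    mu0 = np.full(S, 1.0 / S)
    pi = np.full((S, A), 1.0 / A)          # start from the uniform policy
    for _ in range(steps):
        Q = conservative_q(models, R, pi, mu0)
        pi = mirror_ascent_step(pi, Q)
    return pi
```

Calling `run_demo()` returns a policy table whose rows are probability distributions over actions. On this toy problem the update shifts mass toward the rewarded action in every state. Because the update acts directly on the policy table, no parametric policy class is imposed, which is the sense of "unrestricted" this sketch tries to convey.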

Main Results:

MoMA effectively utilizes an unrestricted policy class in model-based offline RL.
In the numerical studies, the algorithm is reported to perform well relative to existing methods.
Theoretical analysis provides an upper bound on the suboptimality of MoMA's policy.
A practical, approximate version of MoMA was developed and validated.
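
For orientation, the suboptimality that such a bound controls is conventionally the gap in expected return between a comparator (for example, optimal) policy and the returned policy. The notation below is assumed for illustration and is not taken from the paper: a discounted infinite-horizon setting is assumed, with $\pi^{*}$ the comparator policy, $\hat{\pi}$ the returned policy, and $\gamma$ the discount factor.

```latex
\mathrm{SubOpt}(\hat{\pi}) \;=\; J(\pi^{*}) - J(\hat{\pi}),
\qquad
J(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right]
```

An upper bound on this quantity limits how much return is lost by deploying $\hat{\pi}$ instead of $\pi^{*}$. The specific form of the bound established for MoMA is not given in this summary.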

Conclusions:

MoMA extends model-based offline RL to an unrestricted policy class.
The algorithm combines a theoretical suboptimality guarantee with a practical, approximate implementation.
Numerical studies support the effectiveness of MoMA.