MoMA: Model-based Mirror Ascent for Offline Reinforcement Learning.

Mao Hong¹, Zhiyue Zhang¹, Yue Wu¹

  • ¹Department of Applied Mathematics and Statistics, Johns Hopkins University.

Transactions on Machine Learning Research | December 29, 2025
Summary
This summary is machine-generated.

Model-based offline reinforcement learning (RL) methods are sample-efficient and generalize well, but existing methods usually restrict the policy class. MoMA is a model-based mirror ascent algorithm for offline RL that works with an unrestricted policy class under general function approximation.

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Control Theory

Background:

  • Model-based offline reinforcement learning (RL) offers sample efficiency and generalizability.
  • Existing methods often use restricted policy spaces, limiting their potential.
  • A practical model-based offline RL method that supports an unrestricted policy class is still needed.

Purpose of the Study:

  • To develop a model-based offline RL algorithm, MoMA, that utilizes general function approximations and an unrestricted policy class.
  • To address the limitations of existing approaches in leveraging the full advantages of model-based methods.
  • To provide theoretical guarantees and a practical implementation for the proposed algorithm.

Main Methods:

  • Developed MoMA, a model-based mirror ascent algorithm for offline RL.
  • Employed general function approximations for policy updates, moving beyond restricted parametric classes.
  • Incorporated conservative value function estimation within a confidence set of transition models.
  • Established theoretical guarantees on the suboptimality bound of the returned policy.
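The two ideas listed above, a mirror ascent policy update and a conservative value estimate taken over a set of candidate transition models, can be illustrated with a small tabular sketch. This is not the authors' implementation: it assumes a finite state and action space, a short explicit list of candidate models standing in for the confidence set, and the KL (softmax) form of mirror ascent. The function names, the step size `eta`, and the number of sweeps are all hypothetical choices made for illustration.

```python
import math

def evaluate_policy(P, R, pi, gamma, sweeps=500):
    """Approximate Q^pi by repeated Bellman backups on a tabular model.

    P[s][a][t] is a transition probability, R[s][a] a reward,
    pi[s][a] the probability of action a in state s.
    """
    n_s, n_a = len(R), len(R[0])
    V = [0.0] * n_s
    for _ in range(sweeps):
        Q = [[R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_s))
              for a in range(n_a)] for s in range(n_s)]
        V = [sum(pi[s][a] * Q[s][a] for a in range(n_a)) for s in range(n_s)]
    return Q

def pessimistic_q(models, R, pi, rho, gamma):
    """Return Q^pi under the candidate model with the lowest start value.

    `models` is an assumed finite stand-in for a confidence set of
    transition models; rho is the initial state distribution.
    """
    def start_value(Q):
        return sum(rho[s] * sum(p * q for p, q in zip(pi[s], Q[s]))
                   for s in range(len(R)))
    candidates = [evaluate_policy(P, R, pi, gamma) for P in models]
    return min(candidates, key=start_value)

def mirror_ascent_step(pi, Q, eta):
    """KL mirror ascent: new_pi(a|s) is proportional to pi(a|s) * exp(eta * Q[s][a])."""
    new_pi = []
    for probs, q_row in zip(pi, Q):
        logits = [math.log(max(p, 1e-300)) + eta * q
                  for p, q in zip(probs, q_row)]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - top) for l in logits]
        total = sum(weights)
        new_pi.append([w / total for w in weights])
    return new_pi

def run_sketch(models, R, rho, gamma=0.9, eta=1.0, n_iters=20):
    """Alternate conservative evaluation and mirror ascent from a uniform policy."""
    n_s, n_a = len(R), len(R[0])
    pi = [[1.0 / n_a] * n_a for _ in range(n_s)]
    for _ in range(n_iters):
        Q = pessimistic_q(models, R, pi, rho, gamma)
        pi = mirror_ascent_step(pi, Q, eta)
    return pi
```

Because the update acts on the policy table directly rather than on the parameters of a fixed parametric family, the sketch loosely mirrors the "unrestricted policy class" idea. How the paper achieves this with general function approximation, and how it constructs its confidence set, is not shown here.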

Main Results:

  • MoMA effectively utilizes an unrestricted policy class in model-based offline RL.
  • The algorithm demonstrates improved decision-making capabilities compared to existing methods.
  • Theoretical analysis provides an upper bound on the suboptimality of MoMA's policy.
  • A practical, approximate version of MoMA was developed and validated.
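For readers unfamiliar with the term, the suboptimality mentioned in the results above is commonly defined in offline RL as the value gap between a comparator policy and the returned policy. The notation below is an assumption for illustration; the paper's exact definition and the form of its bound are not reproduced here.

```latex
% Assumed notation: P^* is the true transition model, \rho the initial state
% distribution, \pi^\dagger a comparator policy, \hat{\pi} the returned policy.
\mathrm{SubOpt}(\hat{\pi})
  \;=\; V^{\pi^\dagger}_{P^*}(\rho) \;-\; V^{\hat{\pi}}_{P^*}(\rho)
```

An upper bound on this quantity limits how much worse the returned policy can be than the comparator when both are evaluated in the true environment.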

Conclusions:

  • MoMA represents a significant advancement in model-based offline RL by enabling unrestricted policies.
  • The algorithm offers a practical and theoretically sound approach for complex decision-making problems.
  • Numerical studies confirm the effectiveness and potential of MoMA in real-world applications.