Incremental model-based reinforcement learning with model constraint.

Zhiyou Yang¹, Mingsheng Fu¹, Hong Qu¹

¹School of Computer Science and Engineering, University of Electronic Science and Technology of China, No. 2006 Xiyuan Ave, Chengdu, 611731, Sichuan, China.

Neural Networks: The Official Journal of the International Neural Network Society

February 11, 2025

Summary

This summary is machine-generated.

This study introduces an incremental model-based reinforcement learning (RL) update scheme, ensuring stable model and policy improvements. The novel Incremental Model-based Policy Optimization (IMPO) algorithm enhances performance and sample efficiency in complex control tasks.

Keywords:

Model constraint; Model-based reinforcement learning; Monotonic performance improvement; Policy optimization

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Model-based reinforcement learning (RL) relies on environment models learned from limited data for policy optimization.
Existing methods are limited because they do not guarantee incremental updates of both the policy and the model estimate.
This gap hinders reliable performance improvement in model-based RL algorithms.
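
As background only, the general model-based RL pattern described above (fit an environment model from a small batch of real transitions, then optimize a policy against that model) can be sketched on a toy problem. This is not the paper's method: the five-state chain environment, the count-based model, and the names `real_step`, `fit_model`, and `plan` are all assumptions made for illustration.

```python
from collections import defaultdict

N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right (hypothetical toy actions)
GAMMA = 0.9


def real_step(state, action):
    """Toy 'real' environment: a deterministic chain.

    Reward 1.0 is given for entering the last state from a different state.
    """
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if (nxt == N_STATES - 1 and state != N_STATES - 1) else 0.0
    return nxt, reward


def fit_model(transitions):
    """Estimate transition probabilities and mean rewards from counts."""
    counts = defaultdict(lambda: defaultdict(int))
    reward_sum = defaultdict(float)
    for s, a, r, s2 in transitions:
        counts[(s, a)][s2] += 1
        reward_sum[(s, a)] += r
    model = {}
    for sa, nxt_counts in counts.items():
        n = sum(nxt_counts.values())
        probs = {s2: c / n for s2, c in nxt_counts.items()}
        model[sa] = (probs, reward_sum[sa] / n)
    return model


def plan(model, iters=200):
    """Value iteration on the learned model, then a greedy policy."""
    values = [0.0] * N_STATES

    def q(s, a):
        # Unseen (state, action) pairs are assumed to self-loop with zero reward.
        probs, r = model.get((s, a), ({s: 1.0}, 0.0))
        return r + GAMMA * sum(p * values[s2] for s2, p in probs.items())

    for _ in range(iters):
        values = [max(q(s, a) for a in ACTIONS) for s in range(N_STATES)]
    policy = [max(ACTIONS, key=lambda a: q(s, a)) for s in range(N_STATES)]
    return policy, values


# Limited real data: one sample per (state, action) pair.
data = []
for s in range(N_STATES):
    for a in ACTIONS:
        s2, r = real_step(s, a)
        data.append((s, a, r, s2))

model = fit_model(data)
policy, values = plan(model)
print(policy)
```

Because the toy environment is deterministic, one sample per state-action pair already recovers it exactly. In realistic settings the learned model is only approximate, which is why the gap between model and real-environment performance matters.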

Purpose of the Study:

To propose a novel incremental update scheme for model-based RL.
To guarantee simultaneous incremental updates for both the environment model and the policy.
To ensure non-decreasing policy performance in the real environment.

Main Methods:

Developed an incremental update scheme for model-based RL that places constraints on both the model update and the policy update.
Established a theoretical performance bound linking the real environment and the learned model.
Introduced the Incremental Model-based Policy Optimization (IMPO) algorithm for practical implementation.
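
The summary does not give the form of the constraints, so the following is only a loose illustration of updating a model and a policy together while bounding how far each may move per round. Simple step-size clipping is used as a stand-in chosen for this sketch; it is not the IMPO algorithm, and `clip_step`, `incremental_round`, the scalar parameters, and the bounds are all hypothetical.

```python
def clip_step(old, proposed, max_change):
    """Move from `old` toward `proposed`, changing by at most `max_change`."""
    delta = proposed - old
    delta = max(-max_change, min(max_change, delta))
    return old + delta


def incremental_round(model_param, policy_param, model_target, policy_target,
                      model_eps=0.5, policy_eps=0.25):
    """One round: update model and policy together, each under its own bound.

    The scalar parameters and targets are placeholders for whatever a real
    learner would fit; only the bounded-step structure is being illustrated.
    """
    new_model = clip_step(model_param, model_target, model_eps)
    new_policy = clip_step(policy_param, policy_target, policy_eps)
    return new_model, new_policy


model_param, policy_param = 0.0, 0.0
history = [(model_param, policy_param)]
for _ in range(6):
    model_param, policy_param = incremental_round(
        model_param, policy_param, model_target=2.0, policy_target=1.0)
    history.append((model_param, policy_param))
print(history)
```

In this sketch neither quantity can jump arbitrarily far in a single round, which is the intuition behind pairing a model constraint with a policy constraint. The paper's actual constraints and its performance bound should be taken from the original article.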

Main Results:

IMPO demonstrates superior performance compared to state-of-the-art model-based RL methods.
The algorithm achieves significant improvements in sample efficiency across various control benchmarks.
Experimental validation confirms the effectiveness of the incremental update scheme.

Conclusions:

The proposed incremental update scheme enhances stability and performance in model-based RL.
IMPO offers a practical and efficient solution for complex control problems.
This work advances the reliability and sample efficiency of model-based RL approaches.