



Enhancing Stability of Probabilistic Model-Based Reinforcement Learning by Adaptive Noise Filtering.

Wenjun Huang, Xinrui Yue, Yidong Chen

    IEEE Transactions on Neural Networks and Learning Systems
    |March 17, 2026
    PubMed
    Summary
    This summary is machine-generated.

    Stabilized Model-Based Policy Optimization (SMBPO) improves model-based reinforcement learning by filtering noise in the learned model's predictions and clipping predicted states and value estimates. According to the authors, this substantially improves learning efficiency and performance on complex control tasks.


    Area of Science:

    • Artificial Intelligence
    • Machine Learning
    • Robotics

    Background:

    • Current probabilistic model-based reinforcement learning (MBRL) methods face challenges with stability and efficiency due to imperfect models.
    • Model bias and prediction noise can negatively impact policy learning and overall performance.

    Purpose of the Study:

    • To introduce Stabilized Model-Based Policy Optimization (SMBPO) for improved stability and efficiency in MBRL.
    • To address noise and bias issues inherent in probabilistic model-based approaches.

    Main Methods:

    • SMBPO adaptively refines dimensions with abnormal prediction distributions to stabilize probabilistic model training.
    • It clips predicted states and estimated value functions to mitigate model bias effects on policy learning.
    • Batch Normalization (BN) is integrated to enhance learning efficiency.
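
The summary does not give the exact rules SMBPO uses, so the following is only a minimal NumPy sketch of what clipping and abnormal-dimension detection could look like. Everything in it is an assumption made for illustration: the function names, the `margin` and `z_thresh` parameters, the use of the observed data range as clipping bounds, the reward-based value bounds, and the median-based outlier test are not taken from the paper.

```python
import numpy as np


def clip_predicted_states(pred_states, real_states, margin=0.1):
    """Clip model-predicted states to the per-dimension range seen in real data.

    Assumption: the range of real transitions (widened by `margin`) is a
    reasonable bound; the paper may define its bounds differently.
    """
    low = real_states.min(axis=0)
    high = real_states.max(axis=0)
    span = high - low
    return np.clip(pred_states, low - margin * span, high + margin * span)


def clip_values(values, r_min, r_max, gamma):
    """Clip value estimates to the bounds implied by bounded rewards.

    With rewards in [r_min, r_max] and discount gamma < 1, the discounted
    return lies in [r_min / (1 - gamma), r_max / (1 - gamma)].
    """
    v_low = r_min / (1.0 - gamma)
    v_high = r_max / (1.0 - gamma)
    return np.clip(values, v_low, v_high)


def abnormal_dimensions(pred_std, z_thresh=3.0):
    """Flag state dimensions whose mean predictive std is an outlier.

    Hypothetical test: distance from the median, scaled by the median
    absolute deviation, compared against `z_thresh`.
    """
    per_dim = pred_std.mean(axis=0)
    median = np.median(per_dim)
    mad = np.median(np.abs(per_dim - median)) + 1e-8  # avoid division by zero
    return np.abs(per_dim - median) / mad > z_thresh
```

In a pipeline of this shape, the flagged dimensions would be the ones whose predictions get refined or down-weighted before model rollouts are used for policy learning, and the two clipping functions would be applied to each imagined transition and its value target.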

    Main Results:

    • Evaluations on MuJoCo control benchmarks and a dexterous hand task demonstrated SMBPO's effectiveness.
    • SMBPO achieved a 90% reduction in training time compared to baselines.
    • The method resulted in 50% more cumulative rewards than state-of-the-art model-free and MBRL approaches.
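
Read literally, the two headline figures translate into the ratios below. The symbols are introduced here for illustration only, and the summary does not say how the percentages were aggregated across tasks or baselines.

```latex
\begin{align*}
T_{\text{SMBPO}} &= (1 - 0.90)\,T_{\text{base}} = 0.10\,T_{\text{base}}
  &&\Rightarrow\quad \frac{T_{\text{base}}}{T_{\text{SMBPO}}} = 10, \\
R_{\text{SMBPO}} &= (1 + 0.50)\,R_{\text{base}} = 1.5\,R_{\text{base}}.
\end{align*}
```

Here $T$ denotes training time and $R$ cumulative reward, so a 90% reduction in training time corresponds to a tenfold speedup.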

    Conclusions:

    • SMBPO offers a stable and efficient solution for model-based reinforcement learning.
    • The technique significantly enhances learning speed and cumulative rewards.
    • SMBPO extends the practical applicability of MBRL in complex robotic control scenarios.