Efficient Reinforcement Learning via Probabilistic Trajectory Optimization.

Yunpeng Pan, George I Boutselis, Evangelos A Theodorou

IEEE Transactions on Neural Networks and Learning Systems

July 12, 2018

Summary

This summary is machine-generated.

We introduce probabilistic differential dynamic programming (PDDP), a reinforcement learning method for control in continuous state and action spaces. PDDP learns control policies faster and from less data than a state-of-the-art GP-based policy search method.

Area of Science:

Robotics
Machine Learning
Control Theory

Background:

Reinforcement learning (RL) in continuous spaces presents challenges for sample efficiency and policy optimization.
Model-based RL methods often rely on policy parameterization, limiting their flexibility.
Existing approaches may struggle with complex dynamics and require extensive data.

Purpose of the Study:

To develop a trajectory optimization approach for reinforcement learning in continuous state and action spaces.
To create a method that learns time-varying control policies without explicit policy parameterization.
To enhance learning speed and data efficiency in complex control tasks.
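To illustrate what a time-varying control policy without explicit policy parameterization can look like, here is a minimal sketch. It is not PDDP: it swaps in a standard finite-horizon LQR Riccati recursion on a hypothetical double-integrator system, with known linear dynamics and made-up cost weights. The point is only that the backward sweep yields one feedback gain per time step rather than a single parameterized policy.

```python
import numpy as np

def backward_sweep(A, B, Q, R, Qf, horizon):
    """Finite-horizon LQR Riccati recursion: returns one gain matrix per time step."""
    P = Qf
    gains = [None] * horizon
    for t in reversed(range(horizon)):
        # K_t = (R + B' P B)^{-1} B' P A, with P the cost-to-go at step t + 1
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        gains[t] = K
        # Cost-to-go matrix at step t under the optimal gain K_t
        P = Q + A.T @ P @ (A - B @ K)
    return gains

def forward_sweep(A, B, Q, R, Qf, gains, x0):
    """Roll the system forward under u_t = -K_t x_t and accumulate the quadratic cost."""
    x, cost = x0, 0.0
    for K in gains:
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return x, cost + x @ Qf @ x

# Hypothetical double integrator (position, velocity) with illustrative weights
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q, R, Qf = 0.1 * np.eye(2), 0.01 * np.eye(1), 10.0 * np.eye(2)
horizon = 30
x0 = np.array([1.0, 0.0])

gains = backward_sweep(A, B, Q, R, Qf, horizon)
x_final, cost = forward_sweep(A, B, Q, R, Qf, gains, x0)

# Baseline for comparison: apply no control at all
zero_gains = [np.zeros((1, 2))] * horizon
_, cost_zero = forward_sweep(A, B, Q, R, Qf, zero_gains, x0)
```

The list `gains` is the policy: its entries differ from step to step, and no global function of the state is ever fitted. In a DDP-style method the same backward and forward sweeps would be repeated around an updated nominal trajectory.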

Main Methods:

Probabilistic Differential Dynamic Programming (PDDP) represents system dynamics using Gaussian processes (GPs).
Iterative local dynamic programming is performed in Gaussian belief spaces around a nominal trajectory.
The method learns policies via successive forward-backward sweeps, incorporating prior knowledge and enabling risk-sensitive learning.
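As a rough illustration of the first point, the sketch below fits a Gaussian process to one-step state changes and returns a predictive mean and variance, the two quantities a Gaussian belief over the next state is built from. Everything here is an assumption for illustration: the toy pendulum-like dynamics, the squared-exponential kernel, the fixed hyperparameters, and the function names. It does not reproduce the PDDP implementation.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of X1 and X2."""
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_fit(X, y, noise_var=1e-4, **kern):
    """Factorize the noisy kernel matrix and precompute alpha = K^{-1} y."""
    K = rbf_kernel(X, X, **kern) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return L, alpha

def gp_predict(X_train, L, alpha, X_new, **kern):
    """Predictive mean and variance of the latent function at X_new."""
    K_s = rbf_kernel(X_train, X_new, **kern)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.diag(rbf_kernel(X_new, X_new, **kern)) - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

# Hypothetical toy dynamics: x_{t+1} = x_t + 0.1 * (-sin(x_t) + u_t).
# The GP is trained on (state, control) -> state change.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(40, 2))   # columns: state, control
dx = 0.1 * (-np.sin(X[:, 0]) + X[:, 1])

L, alpha = gp_fit(X, dx, length_scale=1.0)

near = np.array([[0.5, 0.2]])   # inside the sampled region
far = np.array([[6.0, 6.0]])    # far from any training data
mean_near, var_near = gp_predict(X, L, alpha, near, length_scale=1.0)
mean_far, var_far = gp_predict(X, L, alpha, far, length_scale=1.0)
```

The predictive variance grows away from the training data, so `var_far` is much larger than `var_near`. Carrying that uncertainty along the trajectory, rather than only the mean, is what makes planning in a belief space possible.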

Main Results:

A convergence analysis shows that, under specific conditions, PDDP converges globally to a stationary point.
The framework effectively incorporates prior model knowledge to accelerate learning.
PDDP proved effective and efficient on nontrivial tasks, outperforming a state-of-the-art GP-based policy search method.

Conclusions:

PDDP offers a superior combination of learning speed, data efficiency, and applicability compared to existing methods.
The approach provides a robust framework for reinforcement learning in continuous control problems.
PDDP advances the state-of-the-art in model-based reinforcement learning for complex systems.