Efficient Reinforcement Learning via Probabilistic Trajectory Optimization.

Yunpeng Pan, George I Boutselis, Evangelos A Theodorou

    IEEE Transactions on Neural Networks and Learning Systems | July 12, 2018
    PubMed
    Summary
    This summary is machine-generated.

    We introduce probabilistic differential dynamic programming (PDDP), a model-based reinforcement learning method for continuous state and action spaces. PDDP learns time-varying control policies and compares favorably in learning speed and data efficiency with a state-of-the-art GP-based policy search method.

    Area of Science:

    • Robotics
    • Machine Learning
    • Control Theory

    Background:

    • Reinforcement learning (RL) in continuous spaces presents challenges for sample efficiency and policy optimization.
    • Model-based RL methods often rely on policy parameterization, limiting their flexibility.
    • Existing approaches may struggle with complex dynamics and require extensive data.

    Purpose of the Study:

    • To develop a trajectory optimization approach for reinforcement learning in continuous state and action spaces.
    • To create a method that learns time-varying control policies without explicit policy parameterization.
    • To enhance learning speed and data efficiency in complex control tasks.

    Main Methods:

    • Probabilistic Differential Dynamic Programming (PDDP) represents system dynamics using Gaussian processes (GPs).
    • Iterative local dynamic programming is performed in Gaussian belief spaces around a nominal trajectory.
    • The method learns policies via successive forward-backward sweeps, incorporating prior knowledge and enabling risk-sensitive learning.
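The first method bullet, representing dynamics with a Gaussian process, can be sketched on a toy problem. The code below is a minimal illustration and not the authors' implementation: the scalar dynamics `true_step`, the fixed kernel hyperparameters, and all function names are assumptions made for this example. It fits a GP to observed state changes and then runs a forward sweep along a nominal control sequence, reporting the GP's predictive variance at each step. It does not propagate a full Gaussian belief, which PDDP does.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(d2, 0.0) / length_scale**2)

def gp_fit(Z, y, noise_var=1e-4):
    """Precompute the Cholesky factor and weights for GP regression."""
    K = rbf_kernel(Z, Z) + noise_var * np.eye(len(Z))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return Z, L, alpha

def gp_predict(model, Zs):
    """Posterior mean and variance of the latent function at query rows Zs."""
    Z, L, alpha = model
    Ks = rbf_kernel(Zs, Z)
    mean = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - np.sum(v**2, axis=0)  # prior variance equals signal_var = 1.0
    return mean, np.maximum(var, 0.0)

def true_step(x, u, dt=0.1):
    """Hypothetical toy dynamics used only to generate data (not from the paper)."""
    return x + dt * (-np.sin(x) + u)

# Training data: random (state, control) pairs and the observed state change.
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=40)
U = rng.uniform(-1.0, 1.0, size=40)
model = gp_fit(np.column_stack([X, U]), true_step(X, U) - X)

def rollout(model, x0, controls):
    """Forward sweep: follow the GP mean along a nominal control sequence."""
    xs, variances = [x0], []
    for u in controls:
        mean, var = gp_predict(model, np.array([[xs[-1], u]]))
        xs.append(xs[-1] + mean[0])
        variances.append(var[0])
    return np.array(xs), np.array(variances)

nominal_u = np.zeros(20)
xs, step_var = rollout(model, 1.0, nominal_u)
```

The predictive variance is what distinguishes a GP model from a deterministic fit: it grows for state-control pairs far from the training data, which is the information a belief-space or risk-sensitive planner can exploit.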

    Main Results:

    • A convergence analysis shows that PDDP converges globally to a stationary point under specific conditions.
    • The framework effectively incorporates prior model knowledge to accelerate learning.
    • PDDP was effective and efficient on nontrivial tasks, outperforming a state-of-the-art GP-based policy search method.
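The forward-backward sweep structure and the idea of iterating toward a stationary point can be illustrated with a generic iterative LQR loop on a scalar system. This is a sketch under stated assumptions, not PDDP itself: the dynamics are known rather than learned by a GP, there is no belief-space propagation, and the cost weights, horizon, and backtracking line search are choices made for this example. Each backward sweep yields a time-varying feedforward term `k[t]` and feedback gain `K[t]`; each forward sweep applies them around the current nominal trajectory.

```python
import numpy as np

DT, T, GOAL = 0.1, 30, 1.0
Q, R, QF = 1.0, 0.1, 10.0  # illustrative cost weights (assumed)

def step(x, u):
    """Toy known dynamics (assumed for illustration)."""
    return x + DT * (-np.sin(x) + u)

def step_grads(x, u):
    """Partial derivatives of step with respect to x and u."""
    return 1.0 - DT * np.cos(x), DT

def total_cost(xs, us):
    return (0.5 * Q * np.sum((xs[:-1] - GOAL)**2)
            + 0.5 * R * np.sum(us**2)
            + 0.5 * QF * (xs[-1] - GOAL)**2)

def backward(xs, us):
    """Backward sweep: quadratic value expansion along the nominal trajectory."""
    Vx, Vxx = QF * (xs[-1] - GOAL), QF
    k, K = np.zeros(T), np.zeros(T)
    for t in reversed(range(T)):
        fx, fu = step_grads(xs[t], us[t])
        Qx = Q * (xs[t] - GOAL) + fx * Vx
        Qu = R * us[t] + fu * Vx
        Qxx = Q + fx * Vxx * fx
        Quu = R + fu * Vxx * fu
        Qux = fu * Vxx * fx
        k[t], K[t] = -Qu / Quu, -Qux / Quu
        Vx = Qx + Qux * k[t]
        Vxx = Qxx + Qux * K[t]
    return k, K

def forward(x0, xs_nom, us_nom, k, K, alpha):
    """Forward sweep: apply the time-varying local policy with step size alpha."""
    xs, us = [x0], []
    for t in range(T):
        u = us_nom[t] + alpha * k[t] + K[t] * (xs[-1] - xs_nom[t])
        us.append(u)
        xs.append(step(xs[-1], u))
    return np.array(xs), np.array(us)

x0 = 0.0
zeros = np.zeros(T)
xs, us = forward(x0, np.zeros(T + 1), zeros, zeros, zeros, 0.0)  # initial rollout
costs = [total_cost(xs, us)]
for _ in range(20):
    k, K = backward(xs, us)
    for alpha in (1.0, 0.5, 0.25, 0.125, 0.0625):  # backtracking line search
        xs_new, us_new = forward(x0, xs, us, k, K, alpha)
        if total_cost(xs_new, us_new) < costs[-1]:
            xs, us = xs_new, us_new
            break
    costs.append(total_cost(xs, us))
```

Because a candidate trajectory is accepted only when it lowers the cost, the sequence in `costs` is non-increasing by construction. That property belongs to this toy loop and says nothing about the conditions under which the paper's convergence result holds.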

    Conclusions:

    • PDDP offers a better combination of learning speed, data efficiency, and applicability than the GP-based policy search method it was compared against.
    • The approach provides a robust framework for reinforcement learning in continuous control problems.
    • PDDP advances the state-of-the-art in model-based reinforcement learning for complex systems.