A Parallel Framework of Adaptive Dynamic Programming Algorithm With Off-Policy Learning.

Changyin Sun, Xiaofeng Li, Yuewen Sun

IEEE Transactions on Neural Networks and Learning Systems

August 25, 2020

Summary

This summary is machine-generated.

This study introduces a model-free adaptive dynamic programming (ADP) method for optimal control of nonaffine nonlinear systems. The approach enhances data collection and exploration using parallel agents, ensuring system stability and convergence to the Hamilton-Jacobi-Bellman equation solution.

Area of Science:

Control Theory
Machine Learning
Nonlinear Systems

Background:

Optimal control of nonaffine nonlinear systems presents significant challenges.
Existing model-free methods often struggle with data efficiency and exploration limitations.

Purpose of the Study:

To develop a novel model-free online adaptive dynamic programming (ADP) approach for optimal control.
To enhance data collection and exploration capabilities for improved learning.
To guarantee system stability and convergence for the proposed control laws.

Main Methods:

Combined an off-policy learning mechanism with a parallel paradigm that employs multithreaded agents.
Implemented an actor-critic (AC) structure with two neural networks (NNs) for Q-function and policy approximation.
Employed a policy gradient method for a single-step policy improvement after policy evaluation.
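
The three method steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy plant `plant`, the quadratic feature map `features`, the linear actor gain `k`, the learning rates, and all function names are hypothetical, and linear approximators stand in for the two neural networks to keep the example dependency-free. Several threads collect transitions from different initial states under a random behavior policy (off-policy data), a critic is fitted to the pooled data (policy evaluation), and the actor takes a single gradient step (policy improvement).

```python
import math
import random
import threading

def plant(x, u):
    # Hypothetical nonaffine toy plant: the input enters through tanh.
    return 0.8 * x + 0.2 * math.tanh(u)

def cost(x, u):
    # Assumed quadratic stage cost.
    return x * x + u * u

def features(x, u):
    # Quadratic features for a linear-in-parameters critic (stand-in for an NN).
    return [x * x, x * u, u * u]

def q_value(w, x, u):
    return sum(wi * fi for wi, fi in zip(w, features(x, u)))

def collect(agent_id, buffer, lock, steps):
    # Each agent starts from its own initial state and explores with a
    # random behavior policy, so the pooled data are off-policy.
    rng = random.Random(agent_id)
    x = rng.uniform(-1.0, 1.0)
    for _ in range(steps):
        u = rng.uniform(-1.0, 1.0)
        x_next = plant(x, u)
        with lock:
            buffer.append((x, u, cost(x, u), x_next))
        x = x_next

def gather(n_agents=4, steps=50):
    # Run the agents in parallel threads that share one replay buffer.
    buffer, lock = [], threading.Lock()
    threads = [threading.Thread(target=collect, args=(i, buffer, lock, steps))
               for i in range(n_agents)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return buffer

def evaluate(w, k, buffer, lr=0.05, sweeps=20):
    # Policy evaluation: semi-gradient TD(0) toward c + Q(x', pi(x')),
    # where the target policy is pi(x) = -k * x.
    w = list(w)
    for _ in range(sweeps):
        for x, u, c, x_next in buffer:
            target = c + q_value(w, x_next, -k * x_next)
            err = target - q_value(w, x, u)
            w = [wi + lr * err * fi for wi, fi in zip(w, features(x, u))]
    return w

def improve(w, k, buffer, lr=0.05):
    # Single policy-gradient step on the actor gain k.
    grad = 0.0
    for x, _, _, _ in buffer:
        u = -k * x
        dq_du = w[1] * x + 2.0 * w[2] * u
        grad += dq_du * (-x)  # chain rule: du/dk = -x
    return k - lr * grad / len(buffer)

buffer = gather()
w, k = [0.0, 0.0, 0.0], 0.0
for _ in range(5):
    w = evaluate(w, k, buffer)   # critic update
    k = improve(w, k, buffer)    # one actor step per evaluation
print(len(buffer), w, k)
```

Because the threads interleave nondeterministically, the order of transitions in the buffer (and therefore the exact learned values) can differ between runs; only the buffer size is fixed by `n_agents * steps`. No convergence or stability property of the summarized algorithm is claimed for this toy version.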

Main Results:

Substantially enlarged the set of sampled data through parallel agent interaction from diverse initial states.
Demonstrated guaranteed system stability under iterative control laws.
Provided a convergence analysis proving that the Q-function converges monotonically (non-increasing) to the solution of the Hamilton-Jacobi-Bellman (HJB) equation.
Verified the algorithm's effectiveness through two numerical examples.
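
For orientation, the convergence result listed above can be written schematically as follows. The notation is an assumption made for illustration (a generic discrete-time Q-function iteration with state $x_k$, control $u_k$, stage cost $U$, and iteration index $i$); the summary does not give the paper's exact formulation, which may differ.

```latex
% Policy evaluation for the current control law \pi^{(i)}:
Q^{(i)}(x_k, u_k) = U(x_k, u_k) + Q^{(i)}\bigl(x_{k+1}, \pi^{(i)}(x_{k+1})\bigr)

% Monotonic non-increasing property and limit:
Q^{(i+1)}(x_k, u_k) \le Q^{(i)}(x_k, u_k), \qquad
\lim_{i \to \infty} Q^{(i)}(x_k, u_k) = Q^{*}(x_k, u_k)

% The limit satisfies the optimality (HJB-type) equation:
Q^{*}(x_k, u_k) = U(x_k, u_k) + \min_{u'} Q^{*}(x_{k+1}, u')
```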

Conclusions:

The proposed model-free online ADP approach effectively solves optimal control problems for nonaffine nonlinear systems.
The parallel learning and exploration strategy enhances data efficiency and robustness.
The actor-critic implementation with neural networks provides a practical framework for the developed algorithm.