



MEC--a near-optimal online reinforcement learning algorithm for continuous deterministic systems.

Dongbin Zhao, Yuanheng Zhu

    IEEE Transactions on Neural Networks and Learning Systems
    |December 5, 2014
    PubMed
    Summary
    This summary is machine-generated.

    This study introduces a novel probably approximately correct (PAC) algorithm for continuous deterministic systems. It explores efficiently and reaches near-optimal policies without requiring knowledge of the system dynamics. The algorithm demonstrates superior performance and lower complexity than existing PAC methods.


    Area of Science:

    • Machine Learning
    • Control Theory
    • Reinforcement Learning

    Background:

    • Continuous deterministic systems pose challenges for learning optimal control policies.
    • Existing methods often require system dynamics knowledge or are computationally intensive.
    • Efficient exploration and sample utilization are critical for effective learning in these systems.

    Purpose of the Study:

    • To propose the first probably approximately correct (PAC) algorithm for continuous deterministic systems that does not require prior system dynamics knowledge.
    • To develop an algorithm that efficiently utilizes online observed samples and balances exploration-exploitation.
    • To achieve near-optimal policies within a PAC framework with provable performance bounds.

    Main Methods:

    • State aggregation using a grid to partition the continuous state space.
    • Definition of a near-upper Q operator for generating a near-upper Q function within each state cell.
    • Implementation of a greedy policy that balances exploration and exploitation.
    • Rigorous analysis establishing a polynomial bound on the number of time steps at which non-optimal actions are executed.
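
The first three steps above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the summary does not give the near-upper Q operator, so the update below is a generic optimistic upper-bound rule swapped in as an assumption. The names `cell_index` and `OptimisticGridQ`, the uniform grid, and the parameters `gamma` and `r_max` are all hypothetical.

```python
import numpy as np


def cell_index(state, low, high, bins):
    """Map a continuous state to the index tuple of its grid cell."""
    state = np.asarray(state, dtype=float)
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    ratio = (state - low) / (high - low)
    idx = np.floor(ratio * bins).astype(int)
    # States on the upper boundary fall into the last cell.
    idx = np.clip(idx, 0, bins - 1)
    return tuple(int(i) for i in idx)


class OptimisticGridQ:
    """Illustrative optimistic Q table over grid cells (assumed rule,
    not the paper's near-upper Q operator)."""

    def __init__(self, low, high, bins, n_actions, gamma=0.9, r_max=1.0):
        self.low, self.high, self.bins = low, high, bins
        self.gamma = gamma
        # Every (cell, action) pair starts at the largest possible return,
        # so untried actions look attractive and get explored.
        v_max = r_max / (1.0 - gamma)
        shape = (bins,) * len(low) + (n_actions,)
        self.q = np.full(shape, v_max)

    def act(self, state):
        """Greedy action with respect to the optimistic Q values."""
        cell = cell_index(state, self.low, self.high, self.bins)
        return int(np.argmax(self.q[cell]))

    def update(self, state, action, reward, next_state):
        """Tighten the upper bound using one observed transition."""
        cell = cell_index(state, self.low, self.high, self.bins)
        next_cell = cell_index(next_state, self.low, self.high, self.bins)
        target = reward + self.gamma * self.q[next_cell].max()
        key = cell + (action,)
        # Keep the smallest upper bound seen so far for this pair.
        self.q[key] = min(self.q[key], target)
```

Because the greedy policy acts on optimistic values, exploration and exploitation are handled by a single rule: an action keeps being chosen until observed samples lower its bound below that of an alternative.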

    Main Results:

    • The number of time steps at which the proposed algorithm executes a non-optimal action is bounded by a polynomial.
    • The algorithm converges to a near-optimal policy in a finite number of steps under the PAC framework.
    • The implementation requires no knowledge of the system dynamics and has lower computational complexity than existing PAC methods.
    • Simulation studies indicate superior performance compared to other similar PAC algorithms.
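
For orientation, a polynomial bound of this kind is commonly written in the PAC reinforcement learning literature in the generic form below. This is a sketch of the usual criterion for a deterministic system, not the specific bound proved in the study. All symbols are assumptions introduced here: $\pi_t$ is the policy followed at step $t$, $V^{*}$ the optimal value function, $\varepsilon$ the accuracy, $\gamma$ the discount factor, $N$ the number of grid cells, and $|A|$ the number of actions.

```latex
% Requires amsmath for \operatorname.
\[
\bigl|\{\, t \ge 0 : V^{\pi_t}(s_t) < V^{*}(s_t) - \varepsilon \,\}\bigr|
\;\le\;
\operatorname{poly}\!\left(N,\ |A|,\ \tfrac{1}{\varepsilon},\ \tfrac{1}{1-\gamma}\right)
\]
```

Read this way, the algorithm may act more than $\varepsilon$-suboptimally only on a bounded number of steps, after which the executed policy is near-optimal.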

    Conclusions:

    • The developed PAC algorithm offers an effective and efficient approach for learning control policies in continuous deterministic systems.
    • The method's independence from system dynamics and reduced complexity make it broadly applicable.
    • The algorithm provides a strong theoretical guarantee of convergence to near-optimal solutions.