Discrete-Time Deterministic $Q$-Learning: A Novel Convergence Analysis

Qinglai Wei, Frank L Lewis, Qiuye Sun

IEEE Transactions on Cybernetics

April 20, 2016

Summary

This summary is machine-generated.

A novel discrete-time deterministic Q-learning algorithm updates the Q-function over all states and controls in each iteration, which yields a simpler convergence criterion for optimal control. Simulation results are reported to show its effectiveness relative to existing approaches.

Area of Science:

Artificial Intelligence
Machine Learning
Control Theory

Background:

Traditional Q-learning algorithms update the Q-function for single state-control pairs.
Existing convergence criteria for Q-learning can be complex and computationally intensive.
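
For orientation, the single-pair update mentioned above can be written in its common textbook form. This is an illustrative statement in cost-minimization notation, not a formula quoted from the paper; the symbols $x_k$ (state), $u_k$ (control), $U$ (stage cost), $\alpha$ (learning rate), and $\gamma$ (discount factor) are assumptions introduced here.

```latex
% Textbook single-pair Q-learning update (cost-minimization form).
% Only the entry for the visited pair (x_k, u_k) changes at step k.
Q(x_k, u_k) \leftarrow Q(x_k, u_k)
  + \alpha \Bigl[ U(x_k, u_k) + \gamma \min_{u'} Q(x_{k+1}, u') - Q(x_k, u_k) \Bigr]
```

Because only one table entry changes per step, convergence arguments for this form typically depend on every state-control pair being visited often enough.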

Purpose of the Study:

To develop a novel discrete-time deterministic Q-learning algorithm with enhanced convergence properties.
To simplify the convergence criterion for deterministic Q-learning algorithms.
To facilitate the implementation of the algorithm using neural networks.

Main Methods:

The developed algorithm updates the iterative Q-function for all state and control spaces in each iteration.
A new convergence criterion is established by analyzing the upper and lower bounds of the iterative Q-function.
Convergence properties are analyzed for both undiscounted and discounted cases.
Neural networks are employed for Q-function approximation and control law computation.
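
The idea of updating every state-control pair in each iteration can be sketched on a small tabular problem. The code below is a minimal illustration under stated assumptions, not the authors' implementation: it uses a finite table instead of neural networks, a zero initial Q-function, and the standard deterministic backup `Q_new(x, u) = U(x, u) + gamma * min_u' Q(F(x, u), u')`. The function name `deterministic_q_iteration`, the four-state chain, and its costs are all hypothetical.

```python
import numpy as np


def deterministic_q_iteration(next_state, utility, gamma=1.0, tol=1e-9, max_iters=1000):
    """Sweep the backup over every (state, control) pair until the table settles.

    next_state[x, u]: index of the successor state (deterministic dynamics).
    utility[x, u]:    nonnegative stage cost.
    Returns the final Q-table and the number of sweeps performed.
    """
    q = np.zeros_like(utility, dtype=float)  # assumed initial Q-function
    for i in range(max_iters):
        v = q.min(axis=1)                      # best cost-to-go per state
        q_new = utility + gamma * v[next_state]  # full sweep over all (x, u)
        if np.max(np.abs(q_new - q)) < tol:
            return q_new, i + 1
        q = q_new
    return q, max_iters


# Hypothetical toy problem: states 0..3 on a chain, state 0 is the goal.
# Control 0 = stay, control 1 = move one step toward the goal.
n_states = 4
states = np.arange(n_states)
next_state = np.stack([states, np.maximum(states - 1, 0)], axis=1)
# Stage cost: distance from the goal plus 0.5 for using the "move" control.
utility = np.stack([states.astype(float), states + 0.5], axis=1)

q_star, sweeps = deterministic_q_iteration(next_state, utility, gamma=1.0)
greedy_policy = q_star.argmin(axis=1)  # control with the lowest Q-value per state
print(q_star.min(axis=1), greedy_policy)
```

In this toy case the undiscounted iteration settles because the goal state has a zero-cost control that keeps the system there. Setting `gamma` below 1 gives the discounted variant of the same sweep.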

Main Results:

The novel algorithm guarantees convergence of the iterative Q-function to the optimum.
A simplified convergence criterion is established, improving upon traditional methods.
Simulation results demonstrate the effectiveness and performance of the developed algorithm compared to existing approaches.

Conclusions:

The proposed discrete-time deterministic Q-learning algorithm offers an efficient and effective approach to reinforcement learning.
The simplified convergence criterion and neural network implementation facilitate practical application in complex control systems.