Hierarchical approximate policy iteration with binary-tree state space decomposition.

Xin Xu¹, Chunming Liu, Simon X Yang

¹College of Mechatronics and Automation, National University of Defense Technology, Changsha 410073, China. xinxu@nudt.edu.cn

IEEE Transactions on Neural Networks

October 13, 2011

Summary

This summary is machine-generated.

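This study introduces hierarchical approximate policy iteration (HAPI), a reinforcement learning method for problems with large or continuous state spaces. HAPI decomposes the state space into a binary tree of sub-problems and approximates a local policy for each one; the combined policy outperforms single-level methods such as least-squares policy iteration (LSPI).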

Area of Science:

Artificial Intelligence
Machine Learning
Robotics

Background:

Approximate policy iteration (API) methods like LSPI struggle with large or continuous state spaces in reinforcement learning (RL).
Existing API algorithms face challenges in achieving near-optimal policies for complex Markov decision processes (MDPs).
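
To make the baseline concrete, here is a minimal sketch of LSPI on a toy problem. It is not taken from the paper: the five-state chain, the one-hot features, the discount factor of 0.9, and all function names are assumptions chosen for illustration. The chain has an absorbing goal state and a reward of -1 per step, so the best policy is the one that reaches the goal in the fewest steps.

```python
GAMMA = 0.9                 # assumed discount factor
N_STATES = 5                # states 0..4; state 4 is the absorbing goal
GOAL = N_STATES - 1
ACTIONS = (0, 1)            # 0 = move left, 1 = move right
DIM = GOAL * len(ACTIONS)   # one feature per non-goal (state, action) pair


def step(s, a):
    """Deterministic chain dynamics with a cost of -1 per step."""
    s_next = max(0, s - 1) if a == 0 else s + 1
    return s_next, -1.0


def phi(s, a):
    """One-hot features; the absorbing goal maps to the zero vector."""
    v = [0.0] * DIM
    if s != GOAL:
        v[s * len(ACTIONS) + a] = 1.0
    return v


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def q_value(w, s, a):
    return sum(wi * fi for wi, fi in zip(w, phi(s, a)))


def greedy(w, s):
    return max(ACTIONS, key=lambda a: q_value(w, s, a))


def lstdq(samples, w):
    """Policy evaluation: fit Q for the policy that is greedy w.r.t. w."""
    A = [[0.0] * DIM for _ in range(DIM)]
    b = [0.0] * DIM
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, greedy(w, s_next))
        for i in range(DIM):
            if f[i] == 0.0:
                continue
            b[i] += f[i] * r
            for j in range(DIM):
                A[i][j] += f[i] * (f[j] - GAMMA * f_next[j])
    return solve(A, b)


def lspi(samples, iterations=20, tol=1e-9):
    """Alternate LSTD-Q evaluation and greedy improvement until w settles."""
    w = [0.0] * DIM
    for _ in range(iterations):
        w_new = lstdq(samples, w)
        converged = max(abs(x - y) for x, y in zip(w, w_new)) < tol
        w = w_new
        if converged:
            break
    return w


# One sample for every non-goal (state, action) pair.
SAMPLES = [(s, a, step(s, a)[1], step(s, a)[0])
           for s in range(GOAL) for a in ACTIONS]
w = lspi(SAMPLES)
policy = [greedy(w, s) for s in range(GOAL)]
print(policy)
```

Because this example uses one feature per state-action pair, it is effectively tabular and the fitted values are exact. The difficulty described above arises when a small, fixed feature set has to cover a large or continuous state space.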

Purpose of the Study:

To present a novel hierarchical approximate policy iteration (HAPI) method for RL in absorbing MDPs.
To address the limitations of current API algorithms in handling large and continuous state spaces.
To improve policy approximation accuracy and reduce computational complexity.

Main Methods:

Developed a hierarchical API (HAPI) approach utilizing binary-tree state space decomposition.
Employed a learning-based decomposition strategy for adaptive sample collection and state space partitioning.
Applied API algorithms to sub-MDPs for approximating local optimal policies within the decomposed structure.
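
The decomposition idea can be sketched as follows. This summary does not describe the paper's learning-based splitting rule, so the sketch substitutes a simple assumption: split the widest state dimension at its midpoint until a fixed depth is reached. `Node`, `build_tree`, `find_leaf`, `global_policy`, and the placeholder `toy_local_policy` are hypothetical names. In a real system, each leaf's policy would presumably come from running an API algorithm on samples collected in that leaf's sub-MDP.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

State = Sequence[float]
Bounds = List[Tuple[float, float]]
Policy = Callable[[State], int]


@dataclass
class Node:
    """A region of the state space; leaves carry a local policy."""
    bounds: Bounds
    dim: Optional[int] = None          # split dimension (internal nodes)
    threshold: Optional[float] = None  # split value (internal nodes)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    policy: Optional[Policy] = None    # local policy (leaves only)

    def is_leaf(self) -> bool:
        return self.left is None


def build_tree(bounds: Bounds, depth: int,
               make_local_policy: Callable[[Bounds], Policy]) -> Node:
    """Recursively halve the widest dimension (assumed rule, not the paper's)."""
    node = Node(bounds=list(bounds))
    if depth == 0:
        node.policy = make_local_policy(node.bounds)
        return node
    dim = max(range(len(bounds)), key=lambda d: bounds[d][1] - bounds[d][0])
    lo, hi = bounds[dim]
    mid = (lo + hi) / 2
    left_bounds, right_bounds = list(bounds), list(bounds)
    left_bounds[dim] = (lo, mid)
    right_bounds[dim] = (mid, hi)
    node.dim, node.threshold = dim, mid
    node.left = build_tree(left_bounds, depth - 1, make_local_policy)
    node.right = build_tree(right_bounds, depth - 1, make_local_policy)
    return node


def find_leaf(root: Node, state: State) -> Node:
    """Walk from the root to the leaf whose region contains the state."""
    node = root
    while not node.is_leaf():
        node = node.left if state[node.dim] < node.threshold else node.right
    return node


def global_policy(root: Node, state: State) -> int:
    """Combined policy: delegate to the local policy of the matching leaf."""
    return find_leaf(root, state).policy(state)


def toy_local_policy(bounds: Bounds) -> Policy:
    """Placeholder for a learned policy: steer toward x = 0."""
    center_x = (bounds[0][0] + bounds[0][1]) / 2
    action = 1 if center_x < 0 else 0   # 1 = push right, 0 = push left
    return lambda state: action


root = build_tree([(-1.0, 1.0), (-1.0, 1.0)], depth=2,
                  make_local_policy=toy_local_policy)
print(global_policy(root, (-0.5, 0.5)), find_leaf(root, (0.5, -0.5)).bounds)
```

Routing a state to its leaf takes one comparison per tree level, so the cost of evaluating the combined policy grows with tree depth rather than with the number of leaves.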

Main Results:

The HAPI method successfully decomposed MDPs into a binary-tree structure of absorbing sub-MDPs.
Local near-optimal policies were approximated with reduced complexity and enhanced precision.
The combined global policy from HAPI demonstrated superior performance compared to single API algorithms.
On three learning control problems, including mobile robot path tracking, HAPI produced better results than LSPI and kernel-based LSPI (KLSPI).

Conclusions:

HAPI offers a more effective approach for RL in absorbing MDPs with large state spaces.
The hierarchical decomposition strategy enhances policy quality and computational efficiency.
For time-optimal learning control tasks, HAPI improves on conventional single-level API methods.