Q-Learning Approach to Finite-Horizon H∞ Tracking With Partial Observation | JoVE Visualize

Area of Science:

Control Theory
Reinforcement Learning
Game Theory

Background:

Existing reinforcement learning (RL) methods often require full state information and are limited to infinite-horizon, time-invariant systems.
Finite-horizon control with partial observations and unknown dynamics is harder: the optimal solution is governed by time-varying Riccati equations rather than a single algebraic one.
Model-free approaches are desirable for systems where dynamics are unknown, relying solely on input-output data.
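
To make the time-varying Riccati equations mentioned above concrete, here is a minimal model-based sketch of the backward recursion for a finite-horizon zero-sum (H-infinity) game. It is not the model-free method of the study: it assumes known matrices A, B, D, full state access, and a regulation (not tracking) cost, and every name and numeric value in it is illustrative. It only shows why a finite horizon yields a different value matrix at each time step.

```python
import numpy as np


def finite_horizon_hinf_riccati(A, B, D, Q, R, gamma, P_final, horizon):
    """Backward recursion for a finite-horizon zero-sum (H-infinity) game.

    Assumed model (illustrative, known dynamics, full state):
        x[k+1] = A x[k] + B u[k] + D w[k]
        stage cost = x'Qx + u'Ru - gamma^2 w'w
    Returns value matrices P[0..horizon] and joint gains K[0..horizon-1],
    where the saddle-point inputs at stage k are [u; w] = K[k] @ x.
    """
    m, q = B.shape[1], D.shape[1]
    G = np.hstack([B, D])  # joint input matrix acting on v = [u; w]
    weight = np.block([
        [R, np.zeros((m, q))],
        [np.zeros((q, m)), -gamma**2 * np.eye(q)],
    ])
    P = [None] * (horizon + 1)
    K = [None] * horizon
    P[horizon] = P_final
    for k in range(horizon - 1, -1, -1):
        P_next = P[k + 1]
        # Blocks of the quadratic Q-function kernel at stage k
        H_xx = Q + A.T @ P_next @ A
        H_xv = A.T @ P_next @ G
        H_vv = weight + G.T @ P_next @ G
        # A saddle point needs strict convexity in u and concavity in w
        if np.linalg.eigvalsh(H_vv[:m, :m]).min() <= 0:
            raise ValueError("control block is not positive definite")
        if np.linalg.eigvalsh(H_vv[m:, m:]).max() >= 0:
            raise ValueError("gamma is too small for this horizon")
        K[k] = -np.linalg.solve(H_vv, H_xv.T)
        P_k = H_xx + H_xv @ K[k]  # Schur complement of H_vv
        P[k] = 0.5 * (P_k + P_k.T)  # keep the iterate symmetric
    return P, K


# Hypothetical second-order example; the numbers are not from the study.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
D = np.array([[0.05], [0.0]])
P, K = finite_horizon_hinf_riccati(
    A, B, D, Q=np.eye(2), R=np.eye(1), gamma=5.0,
    P_final=np.eye(2), horizon=20,
)
print(np.round(P[0], 3))
print(np.round(P[19], 3))
```

The two printed matrices differ, which is the point: on a finite horizon the value matrix and the gains depend on the time index, so no single stationary gain is optimal.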

Purpose of the Study:

To investigate the finite-horizon H-infinity tracking control problem for discrete-time linear systems with partial observations and unknown dynamics.
To develop model-free reinforcement learning algorithms that overcome limitations of existing approaches, particularly regarding state information and system horizon.
To provide a framework for solving time-varying control problems that requires neither an initially admissible policy nor a discount factor.

Main Methods:

Reconstruction of system state from historical input-output trajectories to create a data-driven system representation.
Definition of a time-varying Q-function based on input-output data.
Proposal of two minimax Q-learning algorithms designed for model-free, data-driven control.
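
The state-reconstruction step listed above can be illustrated with a short sketch. It rests on the standard fact that, for an observable linear system, the current state is a linear function of the last N inputs and outputs. The sketch assumes a disturbance-free system x[k+1] = A x[k] + B u[k], y[k] = C x[k] and builds the maps from a known model purely to verify the identity. In a model-free setting these maps would stay unknown and be absorbed into the learned Q-function. The function names and the example system are hypothetical.

```python
import numpy as np


def reconstruction_maps(A, B, C, N):
    """Return (M_u, M_y) such that x[k] = M_u @ u_hist + M_y @ y_hist.

    u_hist and y_hist stack u[k-N..k-1] and y[k-N..k-1] in time order.
    Assumes (A, C) is observable within a window of N steps.
    """
    n, m, p = A.shape[0], B.shape[1], C.shape[0]
    powers = [np.linalg.matrix_power(A, i) for i in range(N + 1)]
    # Observability matrix over the window: y_hist = O x[k-N] + T u_hist
    O = np.vstack([C @ powers[i] for i in range(N)])
    if np.linalg.matrix_rank(O) < n:
        raise ValueError("window too short: state is not observable")
    # Strictly lower block-Toeplitz map from window inputs to window outputs
    T = np.zeros((N * p, N * m))
    for i in range(N):
        for j in range(i):
            T[i * p:(i + 1) * p, j * m:(j + 1) * m] = C @ powers[i - 1 - j] @ B
    # Map from window inputs to x[k]: x[k] = A^N x[k-N] + ctrl @ u_hist
    ctrl = np.hstack([powers[N - 1 - j] @ B for j in range(N)])
    M_y = powers[N] @ np.linalg.pinv(O)
    M_u = ctrl - M_y @ T
    return M_u, M_y


def max_reconstruction_error(A, B, C, N, steps=20, seed=0):
    """Simulate with random inputs and compare x[k] to its reconstruction."""
    rng = np.random.default_rng(seed)
    M_u, M_y = reconstruction_maps(A, B, C, N)
    x = rng.standard_normal(A.shape[0])
    xs, us, ys = [x], [], []
    for _ in range(steps):
        u = rng.standard_normal(B.shape[1])
        ys.append(C @ x)
        us.append(u)
        x = A @ x + B @ u
        xs.append(x)
    worst = 0.0
    for k in range(N, steps + 1):
        u_hist = np.concatenate(us[k - N:k])
        y_hist = np.concatenate(ys[k - N:k])
        x_hat = M_u @ u_hist + M_y @ y_hist
        worst = max(worst, float(np.max(np.abs(x_hat - xs[k]))))
    return worst


# Hypothetical system with one measured output and two states.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
print(max_reconstruction_error(A, B, C, N=2))
```

Because the state is linear in the stacked history, a Q-function that is quadratic in the state and inputs is also quadratic in the input-output history, which is what allows a Q-function to be defined on input-output data alone.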

Main Results:

The system state is reconstructed from input-output history, and the Q-function is expressed entirely in terms of input-output data.
The minimax Q-learning algorithms require neither an initially admissible policy nor a discount factor, which enhances stability guarantees.
The framework extends to both infinite-horizon and time-varying systems without structural changes.

Conclusions:

The proposed data-driven, model-free reinforcement learning algorithms effectively address the finite-horizon H-infinity tracking control problem for discrete-time systems with partial observations.
Theoretical convergence is proven, and simulation results validate the algorithms' effectiveness.
This work advances reinforcement learning for control, particularly for systems with unknown dynamics and partial state information.