Model-Free Algorithms for Cooperative Output Regulation of Discrete-Time Multiagent Systems via Q-Learning Method
Summary
This summary is machine-generated. A novel model-free Q-learning algorithm enables cooperative output regulation for multiagent systems with unknown parameters. This data-driven approach ensures policy stability and does not require a system model.
Area Of Science
- Control Systems Engineering
- Artificial Intelligence
- Robotics
Background
- Cooperative output regulation is crucial for multiagent systems.
- Unknown system parameters pose a significant challenge in practical applications.
- Existing methods often require complete system models, limiting their applicability.
Purpose Of The Study
- To develop a model-free Q-learning algorithm for cooperative output regulation in discrete-time multiagent systems.
- To address the challenge of unknown system parameters.
- To ensure policy stability and convergence in learning algorithms.
Main Methods
- A model-free Q-learning algorithm is proposed, operating independently of system parameters.
- An immediate cost formulation eliminates the need for solving regulator equations.
- A data-driven algorithm is introduced to compute initial stable gains for unstable initial policies.
- The stability of each algorithm iteration and a condition for the uniqueness of the Q-function matrix are formally derived.
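The general idea behind the methods above can be sketched on a much simpler problem. The following is a minimal illustration, not the paper's algorithm: it applies standard Q-learning policy iteration to a single-agent discrete-time linear-quadratic regulator, with no distributed observer and no output regulation. The matrices `A`, `B`, `Qc`, `Rc`, the function names, and the zero initial gain are all assumptions chosen for the example. The learner only sees measured states, inputs, and costs; `A` and `B` appear only inside the simulator `step`. The initial gain is stabilizing here because the example `A` is already stable, so the source's data-driven initial-gain step is not reproduced.

```python
import numpy as np

def quad_features(z):
    # Upper-triangular monomials of z z^T; off-diagonal terms are doubled so
    # that quad_features(z) @ theta == z^T H z for a symmetric H.
    outer = np.outer(z, z)
    iu = np.triu_indices(len(z))
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)
    return outer[iu] * scale

def theta_to_H(theta, size):
    # Rebuild the symmetric Q-function matrix from its upper-triangular entries.
    H = np.zeros((size, size))
    H[np.triu_indices(size)] = theta
    return H + H.T - np.diag(np.diag(H))

def q_learning_lqr(step, n, m, Qc, Rc, K0, iters=10, samples=200,
                   noise=0.5, seed=0):
    """Policy iteration on the Q-function using measured data only."""
    rng = np.random.default_rng(seed)
    K = K0.copy()
    H = None
    for _ in range(iters):
        Phi, y = [], []
        x = rng.standard_normal(n)
        for _ in range(samples):
            # Behaviour input = current policy + excitation noise.
            u = -K @ x + noise * rng.standard_normal(m)
            x_next = step(x, u)
            z = np.concatenate([x, u])
            # The successor action follows the policy being evaluated.
            z_next = np.concatenate([x_next, -K @ x_next])
            Phi.append(quad_features(z) - quad_features(z_next))
            y.append(x @ Qc @ x + u @ Rc @ u)  # immediate cost
            x = x_next
        # Policy evaluation: least-squares fit of the Bellman equation.
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = theta_to_H(theta, n + m)
        # Policy improvement: u = -K x with K = H_uu^{-1} H_ux.
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

# Hypothetical example system (unknown to the learner, used only to simulate).
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Qc = np.eye(2)
Rc = np.eye(1)

def step(x, u):
    return A @ x + B @ u

K_learned, H_learned = q_learning_lqr(step, n=2, m=1, Qc=Qc, Rc=Rc,
                                      K0=np.zeros((1, 2)))
print("learned gain:", K_learned)
```

In this toy setting the learned gain can be checked against the gain obtained from the Riccati equation, which does use `A` and `B`; the two should agree to numerical precision.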
Main Results
- The proposed Q-learning algorithm has a streamlined structure that determines the optimal policy directly.
- Formal stability analysis confirms the convergence of each algorithm iteration.
- The data-driven approach yields a stabilizing gain even when the initial policy is unstable.
- Distributed observers and excitation noise are shown not to introduce bias.
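As background on the excitation-noise result, the standard single-system argument is sketched below. The summary gives no equations, so every symbol here is an assumption: state $x_k$, input $u_k$, immediate cost $r$, evaluated gain $K$, excitation noise $e_k$, and Q-function matrix $H_K$. The distributed-observer part of the claim is not covered by this sketch.

```latex
% Q-function of the policy u = -K x, quadratic in z_k = [x_k; u_k]:
Q_K(x_k, u_k) = z_k^{\top} H_K z_k, \qquad
z_k = \begin{bmatrix} x_k \\ u_k \end{bmatrix}

% Bellman equation: valid for ANY applied input u_k,
% because only the successor action is fixed to the policy.
Q_K(x_k, u_k) = r(x_k, u_k) + Q_K\!\left(x_{k+1}, -K x_{k+1}\right)

% With the excited input u_k = -K x_k + e_k, the noise enters both sides
% only through the measured u_k and x_{k+1}, so the identity still holds
% exactly and the least-squares estimate of H_K is not biased by e_k.
```

By contrast, a state-value Bellman equation $V_K(x_k) = r(x_k, -K x_k) + V_K(x_{k+1})$ assumes $u_k = -K x_k$, so data collected with $e_k \neq 0$ would violate it.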
Conclusions
- The model-free Q-learning approach offers an effective solution for cooperative output regulation in multiagent systems with unknown parameters.
- The developed algorithm ensures stability and convergence, enhancing practical applicability.
- Simulation examples validate the efficacy and robustness of the proposed method.

