A Generalization Error for Q-Learning.

Susan A Murphy¹

¹Department of Statistics, University of Michigan, Ann Arbor, MI 48109-1107, USA.

Journal of Machine Learning Research : JMLR

June 10, 2006

Summary

This summary is machine-generated.

This study analyzes Q-learning with function approximation for learning a policy from a single training set of finite-horizon trajectories. It derives an upper bound on the generalization error of the resulting policy.

Area of Science:

Machine Learning
Reinforcement Learning
Computational Social Science
Medical Informatics

Background:

Planning problems that require learning a policy from a single training set of finite-horizon trajectories arise in social science and medicine.
Q-learning with function approximation is a common approach to such problems.
A bound on generalization error indicates how far the value of the learned policy can fall short of the optimal value.

Purpose of the Study:

To derive an upper bound on the generalization error for Q-learning with function approximation.
To analyze the factors influencing generalization error in single-dataset policy learning.
To provide theoretical insights into the performance of Q-learning in complex planning problems.
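
As a point of reference, one common way to formalize the quantity being bounded is sketched below. The notation is an assumption made here for illustration and does not appear in the summary: \(\pi\) is a policy, \(r_t\) is the reward at time \(t\), \(T\) is the horizon, \(\pi^{*}\) is an optimal policy, and \(\hat{\pi}\) is the policy learned from the training set.

```latex
\[
V(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} r_t\right],
\qquad
\text{generalization error}(\hat{\pi}) = V(\pi^{*}) - V(\hat{\pi}) \;\ge\; 0 .
\]
```

Under this reading, an upper bound on the generalization error limits how much expected total reward is lost by following the learned policy instead of an optimal one.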

Main Methods:

Used Q-learning with function approximation.
Derived a novel upper bound on generalization error.
Analyzed the bound in terms of algorithm-specific quantities, approximation space complexity, and model mismatch.
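
To make the setting concrete, the following is a minimal sketch of batch Q-learning with linear function approximation on a single set of finite-horizon trajectories. It is an illustration under stated assumptions, not the paper's exact procedure: the feature map `features`, the synthetic data generator `simulate`, the binary action set, and the helper names are all hypothetical. One weight vector is fitted per time step by least squares, working backward from the final step.

```python
import numpy as np


def features(state, action):
    """Hypothetical feature map for a scalar state and a binary action."""
    return np.array([1.0, state, action, state * action])


def fit_batch_q(trajectories, n_actions=2):
    """Fit one linear Q-function per time step by backward least squares.

    trajectories: list of episodes, each a list of (state, action, reward)
    tuples, all of the same length (the horizon).
    Returns a list with one weight vector per time step.
    """
    horizon = len(trajectories[0])
    weights = [None] * horizon
    for t in reversed(range(horizon)):
        X, y = [], []
        for episode in trajectories:
            state, action, reward = episode[t]
            target = reward
            if t + 1 < horizon:
                # Add the estimated best achievable value from the next step.
                next_state = episode[t + 1][0]
                target += max(
                    features(next_state, a) @ weights[t + 1]
                    for a in range(n_actions)
                )
            X.append(features(state, action))
            y.append(target)
        weights[t], *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)
    return weights


def greedy_action(weights_t, state, n_actions=2):
    """Action that maximizes the fitted Q-function at one time step."""
    return max(range(n_actions), key=lambda a: features(state, a) @ weights_t)


def simulate(n_episodes, horizon=2, seed=0):
    """Synthetic data (an assumption): action 1 pays off when state > 0."""
    rng = np.random.default_rng(seed)
    data = []
    for _ in range(n_episodes):
        episode = []
        state = rng.normal()
        for _ in range(horizon):
            action = int(rng.integers(2))  # actions chosen uniformly at random
            reward = state * (2 * action - 1) + 0.1 * rng.normal()
            episode.append((state, action, reward))
            state = rng.normal()
        data.append(episode)
    return data
```

A typical use would be `weights = fit_batch_q(simulate(500))` followed by `greedy_action(weights[0], state)` to read off the learned policy at the first time step. In this toy example the approximation space contains the true Q-function, so the mismatch discussed in the summary does not arise; a poorer feature map would be needed to see that effect.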

Main Results:

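The paper derives an upper bound on the generalization error of Q-learning with function approximation.
The bound depends on the complexity of the approximation space and on a mismatch term, which arises because Q-learning does not directly maximize the value of the learned policy.
Together with the quantities the algorithm itself minimizes, these terms identify the main contributors to generalization error in this setting.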

Conclusions:

The derived upper bound offers a theoretical understanding of Q-learning's generalization performance.
This work contributes to the development of more robust and accurate policy learning algorithms.
Findings are applicable to planning problems in diverse fields like social science and medicine.