A Generalization Error for Q-Learning.

Susan A Murphy¹

  • ¹Department of Statistics, University of Michigan, Ann Arbor, MI 48109-1107, USA.

Journal of Machine Learning Research (JMLR), June 10, 2006
Summary
This summary is machine-generated.

This study analyzes Q-learning with function approximation for learning a policy from a single data set of finite horizon trajectories. It derives an upper bound on the generalization error of the learned policy.

Area of Science:

  • Machine Learning
  • Reinforcement Learning
  • Computational Social Science
  • Medical Informatics

Background:

  • Policy learning from finite horizon trajectories is vital in social science and medicine.
  • Q-learning with function approximation is a common approach for such problems.
  • Understanding generalization error is key to reliable policy learning.

Purpose of the Study:

  • To derive an upper bound on the generalization error for Q-learning with function approximation.
  • To analyze the factors influencing generalization error in single-dataset policy learning.
  • To provide theoretical insights into the performance of Q-learning in complex planning problems.

Main Methods:

  • Utilized Q-learning with function approximation.

  • Derived a novel upper bound on generalization error.
  • Analyzed the bound in terms of algorithm-specific quantities, approximation space complexity, and model mismatch.
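
To make the setting concrete, the following is a minimal sketch of batch Q-learning with linear function approximation on a single data set of finite horizon trajectories. It is an illustration under stated assumptions, not the paper's own algorithm or notation: the two-stage horizon, the binary actions, the feature map `features`, the backward least-squares fitting, and the simulated data in `simulate` are all hypothetical choices made here.

```python
import random

def features(s, a):
    # Hypothetical feature map: Q(s, a; theta) = theta . features(s, a)
    return [1.0, s, float(a), s * a]

def q_value(theta, s, a):
    return sum(t * f for t, f in zip(theta, features(s, a)))

def solve(A, b):
    # Gaussian elimination with partial pivoting for a small square system
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def least_squares(X, y, ridge=1e-8):
    # Normal equations with a tiny ridge term for numerical stability
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(A, b)

def greedy_action(theta, s, actions=(0, 1)):
    # The learned policy picks the action with the largest fitted Q-value
    return max(actions, key=lambda a: q_value(theta, s, a))

def batch_q_learning(data, actions=(0, 1)):
    # data: list of trajectories (s1, a1, r1, s2, a2, r2)
    # Last stage: regress the final reward on the stage-2 features
    X2 = [features(s2, a2) for (s1, a1, r1, s2, a2, r2) in data]
    y2 = [r2 for (s1, a1, r1, s2, a2, r2) in data]
    theta2 = least_squares(X2, y2)
    # First stage: regress r1 + max_a Q2(s2, a) on the stage-1 features
    X1 = [features(s1, a1) for (s1, a1, r1, s2, a2, r2) in data]
    y1 = [r1 + max(q_value(theta2, s2, a) for a in actions)
          for (s1, a1, r1, s2, a2, r2) in data]
    theta1 = least_squares(X1, y1)
    return theta1, theta2

def simulate(n, seed=0):
    # Hypothetical data-generating process with uniformly random actions
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        s1 = rng.uniform(-1, 1)
        a1 = rng.choice((0, 1))
        s2 = s1 if a1 == 1 else -s1
        a2 = rng.choice((0, 1))
        r2 = a2 * s2  # reward is earned only at the last stage
        data.append((s1, a1, 0.0, s2, a2, r2))
    return data

theta1, theta2 = batch_q_learning(simulate(500))
```

In this toy setup the stage-2 Q-function lies inside the linear approximation space, while the stage-1 target `max(0, s2)` does not, so the first-stage fit is an instance of the kind of model mismatch the bound is said to account for.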

Main Results:

  • An upper bound on generalization error was successfully derived.
  • The bound quantifies the impact of approximation space complexity and Q-learning's inherent limitations.
  • Identified key factors contributing to generalization error in this setting.

Conclusions:

  • The derived upper bound offers a theoretical understanding of Q-learning's generalization performance.
  • This work contributes to the development of more robust and accurate policy learning algorithms.
  • Findings are applicable to planning problems in diverse fields like social science and medicine.