
Updated: Jan 30, 2026

Semi-Supervised Off-Policy Reinforcement Learning and Value Estimation for Dynamic Treatment Regimes.

Aaron Sonabend-W¹, Nilanjana Laha², Ashwin N Ananthakrishnan³

¹Department of Biostatistics, Harvard University, Boston, USA.

Journal of Machine Learning Research : JMLR

January 29, 2026

Summary

This summary is machine-generated.

This study introduces a semi-supervised learning (SSL) method to improve reinforcement learning (RL) for dynamic treatment regimes. It combines a small labeled dataset, in which outcomes have been extracted from clinical notes, with a large unlabeled dataset that contains only outcome surrogates.

Keywords:

Q-learning; doubly robust value function; dynamical treatment regime; off-policy learning; reinforcement-learning; semi-supervised learning

Area of Science:

* Computational statistics and machine learning applied to healthcare.
* Development of advanced algorithms for personalized medicine.

Background:

* Reinforcement learning (RL) shows potential for dynamic treatment regimes, but relies on accurate health outcome data.
* Clinical notes often contain outcome information, but manual extraction is resource-intensive, leading to small labeled datasets.
* Existing methods struggle with limited labeled data for training effective RL models.

Purpose of the Study:

* To develop a semi-supervised learning (SSL) approach for reinforcement learning (RL) in dynamic treatment regimes.
* To leverage both small labeled datasets with observed outcomes and large unlabeled datasets with outcome surrogates.
* To address challenges in generalizing SSL to dynamic treatment regimes, including unknown feature distributions and non-informative surrogate variables.

Main Methods:

* Proposed a semi-supervised, efficient approach to Q-learning and doubly robust off-policy value estimation.
* Developed a modified SSL framework to handle outcome surrogates that are predictive of the outcome but not informative about the policy.
* Provided theoretical analysis of Q-function and value-function estimators to quantify SSL efficiency gains.
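
For readers unfamiliar with Q-learning in this setting, the following is a minimal sketch of ordinary (fully supervised) backward-induction Q-learning for a two-stage treatment regime. It is not the authors' semi-supervised estimator. The linear working model, the binary actions, the two-stage horizon, the simulated data, and every function name are assumptions made for illustration.

```python
import numpy as np

def fit_linear(features, target):
    """Ordinary least squares; returns the coefficient vector."""
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coef

def q_features(state, action):
    """Assumed working model: intercept, state, action, state-action interaction."""
    return np.column_stack([np.ones_like(state), state, action, state * action])

def two_stage_q_learning(s1, a1, y1, s2, a2, y2):
    """Backward induction over two decision points with binary actions {0, 1}."""
    # Stage 2: regress the final outcome on the stage-2 state and action.
    theta2 = fit_linear(q_features(s2, a2), y2)
    q2_if_0 = q_features(s2, np.zeros_like(s2)) @ theta2
    q2_if_1 = q_features(s2, np.ones_like(s2)) @ theta2
    # Pseudo-outcome: stage-1 reward plus the best predicted stage-2 value.
    pseudo = y1 + np.maximum(q2_if_0, q2_if_1)
    # Stage 1: regress the pseudo-outcome on the stage-1 state and action.
    theta1 = fit_linear(q_features(s1, a1), pseudo)
    return theta1, theta2

def greedy_action(theta, state):
    """Pick the action with the larger fitted Q-value."""
    q0 = q_features(state, np.zeros_like(state)) @ theta
    q1 = q_features(state, np.ones_like(state)) @ theta
    return (q1 > q0).astype(int)

# Simulated data (hypothetical): treating at stage 2 helps only when s2 > 0.
rng = np.random.default_rng(0)
n = 5000
s1 = rng.normal(size=n)
a1 = rng.integers(0, 2, size=n).astype(float)
y1 = np.zeros(n)  # no intermediate reward in this toy example
s2 = 0.5 * s1 + rng.normal(size=n)
a2 = rng.integers(0, 2, size=n).astype(float)
y2 = s2 * a2 + rng.normal(scale=0.5, size=n)

theta1, theta2 = two_stage_q_learning(s1, a1, y1, s2, a2, y2)
print(greedy_action(theta2, np.array([2.0, -2.0])))
```

In a semi-supervised variant, the outcomes `y1` and `y2` would be observed only for the labeled records, and the unlabeled records would contribute through outcome surrogates; that step is not shown here.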

Main Results:

* The proposed SSL method is at least as efficient as the purely supervised approach.
* The method is robust to potential bias arising from mis-specified imputation models.
* Theoretical results quantify the efficiency gains achieved by incorporating unlabeled data via SSL.
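
One standard way to obtain robustness to a mis-specified imputation model is to add a residual correction computed on the labeled data. The sketch below illustrates that idea for the simplest possible target, a population mean. It is a generic illustration, not the paper's estimator for Q-functions or value functions. It assumes the labeled records are a random subsample of all records; the surrogate `w`, the deliberately wrong imputation rule, and the simulated data are hypothetical.

```python
import numpy as np

def ssl_mean(y_labeled, w_labeled, w_all, impute):
    """Semi-supervised estimate of E[Y].

    Averages the imputed outcome over all records, then adds the mean
    labeled residual so that errors in `impute` cancel in expectation.
    """
    imputed_mean = impute(w_all).mean()
    residual_mean = (y_labeled - impute(w_labeled)).mean()
    return imputed_mean + residual_mean

# Simulated data (hypothetical): the true mean outcome is 1 + E[w^2] = 3.
rng = np.random.default_rng(0)
w_all = rng.normal(loc=1.0, size=20000)   # surrogate seen for every record
w_labeled = w_all[:2000]                  # the labeled subset
y_labeled = 1.0 + w_labeled**2 + rng.normal(size=2000)

def wrong_impute(w):
    """Deliberately mis-specified imputation model (assumption for the demo)."""
    return 2.0 * w

naive_estimate = wrong_impute(w_all).mean()  # imputation only, biased
ssl_estimate = ssl_mean(y_labeled, w_labeled, w_all, wrong_impute)
print(naive_estimate, ssl_estimate)
```

The imputation-only average inherits the bias of the wrong model, while the corrected estimate stays close to the true mean because the labeled residuals measure and remove that bias.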

Conclusions:

* SSL offers an effective strategy to enhance RL for dynamic treatment regimes using limited labeled clinical data.
* The proposed methods provide efficient and robust value estimation in challenging healthcare data scenarios.
* This work advances the application of machine learning in personalized medicine by improving data utilization.
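
As background for the doubly robust value estimation mentioned above, here is a minimal single-stage sketch of a textbook doubly robust off-policy value estimate. It combines an outcome model with an inverse-propensity correction and remains consistent if either one is correct. This is the fully supervised, single-decision version, not the authors' semi-supervised multi-stage estimator; the function name, the target policy, and the simulated data are assumptions.

```python
import numpy as np

def doubly_robust_value(y, a, target_a, propensity, q_target, q_observed):
    """Doubly robust estimate of the value of a target policy.

    y          : observed outcomes
    a          : actions actually taken (behavior policy)
    target_a   : actions the target policy would take
    propensity : probability of the observed action under the behavior policy
    q_target   : outcome-model prediction for the target action
    q_observed : outcome-model prediction for the observed action
    """
    match = (a == target_a).astype(float)
    # Inverse-propensity weighted residual corrects the outcome model.
    correction = match / propensity * (y - q_observed)
    return np.mean(q_target + correction)

# Simulated data (hypothetical): treatment helps only when s > 0.
rng = np.random.default_rng(0)
n = 20000
s = rng.normal(size=n)
a = rng.integers(0, 2, size=n)              # randomized with probability 0.5
y = s * a + rng.normal(scale=0.5, size=n)
target_a = (s > 0).astype(int)              # target policy: treat when s > 0
propensity = np.full(n, 0.5)

# A deliberately useless outcome model (all zeros); the known propensity
# keeps the estimate consistent.
q_zero = np.zeros(n)
value = doubly_robust_value(y, a, target_a, propensity, q_zero, q_zero)
true_value = 1.0 / np.sqrt(2.0 * np.pi)     # E[s * 1{s > 0}] for s ~ N(0, 1)
print(value, true_value)
```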