Updated: Jan 8, 2026

Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation.

Shengpu Tang¹, Jenna Wiens¹

¹Computer Science & Engineering, University of Michigan, Ann Arbor, MI, USA.

Advances in Neural Information Processing Systems

December 15, 2025

Summary

This summary is machine-generated.

This study introduces a semi-offline evaluation framework for reinforcement learning (RL) in high-stakes domains. It uses human annotations of counterfactual trajectories to improve policy evaluation, addressing the distribution-shift limits of purely offline evaluation and the safety risks of online evaluation.

Area of Science:

Artificial Intelligence
Machine Learning
Computational Statistics

Background:

Off-policy evaluation (OPE) using observational data is limited by distribution shifts.
Online evaluation is often infeasible in high-stakes domains due to safety concerns.
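
As a rough illustration of why distribution shift matters for OPE, the textbook importance sampling identity can be written as below. This is standard background, not the estimator proposed in this work, and the notation is assumed for illustration: $\pi_b$ is the behavior policy that generated the observational data, $\pi_e$ is the policy being evaluated, $G(\tau)$ is the return of trajectory $\tau$, and $\rho(\tau)$ is the cumulative importance ratio.

```latex
v(\pi_e)
  = \mathbb{E}_{\tau \sim \pi_e}\!\left[ G(\tau) \right]
  = \mathbb{E}_{\tau \sim \pi_b}\!\left[ \rho(\tau)\, G(\tau) \right],
\qquad
\rho(\tau) = \prod_{t=0}^{T-1} \frac{\pi_e(a_t \mid s_t)}{\pi_b(a_t \mid s_t)}
```

The identity holds only if $\pi_b(a \mid s) > 0$ wherever $\pi_e(a \mid s) > 0$. When the two policies differ substantially, a few ratios can dominate the average while most are near zero, which is one way distribution shift degrades purely offline estimates.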

Purpose of the Study:

To propose a semi-offline evaluation framework for reinforcement learning (RL).
To incorporate human annotations of counterfactual trajectories to improve OPE.
To develop novel OPE estimators that mitigate bias and variance.

Main Methods:

Developed a semi-offline evaluation framework combining offline data with human annotations.
Designed a new family of OPE estimators using importance sampling (IS) and a novel weighting scheme.
Analyzed theoretical properties and conducted proof-of-concept experiments.
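
For orientation, the baseline that the new estimators build on can be sketched as ordinary per-trajectory importance sampling. This is a minimal sketch of the standard IS estimator only; it does not implement the counterfactual-augmented weighting scheme from the paper, whose details are not given in this summary. The function name `trajectory_is_estimate`, the tabular policy arrays, and the toy data are all hypothetical.

```python
import numpy as np

def trajectory_is_estimate(trajectories, pi_e, pi_b, gamma=1.0):
    """Ordinary per-trajectory importance sampling estimate of pi_e's value.

    trajectories: list of episodes, each a list of (state, action, reward)
    pi_e, pi_b:   arrays of shape [n_states, n_actions] holding action
                  probabilities for the evaluation and behavior policies
    gamma:        discount factor
    """
    estimates = []
    for episode in trajectories:
        weight = 1.0  # cumulative importance ratio for this episode
        ret = 0.0     # discounted return for this episode
        for t, (s, a, r) in enumerate(episode):
            weight *= pi_e[s, a] / pi_b[s, a]
            ret += (gamma ** t) * r
        estimates.append(weight * ret)
    return float(np.mean(estimates))

# Toy one-state, two-action problem (hypothetical data):
# the behavior policy is uniform, the evaluation policy always takes action 0,
# and action 0 yields reward 1 while action 1 yields reward 0.
pi_b = np.array([[0.5, 0.5]])
pi_e = np.array([[1.0, 0.0]])
episodes = [[(0, 0, 1.0)], [(0, 1, 0.0)]]

value = trajectory_is_estimate(episodes, pi_e, pi_b)
print(value)
```

In this toy case the episode that took action 0 receives weight 2 and the other receives weight 0, so the average recovers the evaluation policy's true value of 1. With longer horizons the product of ratios grows or shrinks multiplicatively, which is the variance problem that motivates alternatives to plain IS.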

Main Results:

The proposed method incorporates counterfactual annotations without introducing bias.
The approach demonstrated potential for reducing both bias and variance compared to standard IS estimators.
Experiments showed superior performance over purely offline IS estimators, even with imperfect annotations.

Conclusions:

The semi-offline framework enables safer and more reliable RL policy evaluation in critical applications.
Human-centered annotation design is crucial for effective implementation.
This work facilitates RL adoption in high-stakes domains by addressing evaluation challenges.