Counterfactual-Augmented Importance Sampling for Semi-Offline Policy Evaluation.

Shengpu Tang¹, Jenna Wiens¹

  • ¹ Computer Science & Engineering, University of Michigan, Ann Arbor, MI, USA.

Advances in Neural Information Processing Systems | December 15, 2025
Summary
This summary is machine-generated.

This study introduces a semi-offline evaluation framework for reinforcement learning (RL) in high-stakes domains. It uses human annotations to improve policy evaluation, overcoming limitations of purely offline or unsafe online methods.

Area of Science:

  • Artificial Intelligence
  • Machine Learning
  • Computational Statistics

Background:

  • Off-policy evaluation (OPE) using observational data is limited by distribution shifts.
  • Online evaluation is often infeasible in high-stakes domains due to safety concerns.
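To make the distribution-shift limitation concrete, here is a minimal sketch of ordinary importance sampling (IS) in a single-step (bandit) setting. The policies, rewards, and function names are illustrative assumptions, not taken from the paper. When the evaluation policy favors an action the behavior policy rarely takes, a few samples carry nearly all the weight, and the effective sample size collapses.

```python
def is_estimate(actions, rewards, pi_b, pi_e):
    """Ordinary IS estimate of the evaluation policy's value.

    actions, rewards: logged data collected under the behavior policy pi_b.
    pi_b, pi_e: dicts mapping action -> probability (hypothetical policies).
    """
    ratios = [pi_e[a] / pi_b[a] for a in actions]
    value = sum(w * r for w, r in zip(ratios, rewards)) / len(actions)
    return value, ratios


def effective_sample_size(ratios):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(ratios)
    return total * total / sum(w * w for w in ratios)


# Illustrative data: the behavior policy mostly picks action 0,
# while the evaluation policy mostly picks action 1.
pi_b = {0: 0.9, 1: 0.1}
pi_e = {0: 0.1, 1: 0.9}
actions = [0] * 9 + [1]
rewards = [0.0] * 9 + [1.0]

value, ratios = is_estimate(actions, rewards, pi_b, pi_e)
ess = effective_sample_size(ratios)
print(value, ess)
```

In this toy dataset, ten logged samples behave like roughly one effective sample, because the single sample with action 1 dominates the estimate.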

Purpose of the Study:

  • To propose a semi-offline evaluation framework for reinforcement learning (RL).
  • To incorporate human annotations of counterfactual trajectories to improve OPE.
  • To develop novel OPE estimators that mitigate bias and variance.

Main Methods:

  • Developed a semi-offline evaluation framework combining offline data with human annotations.
  • Designed a new family of OPE estimators using importance sampling (IS) and a novel weighting scheme.
  • Analyzed theoretical properties and conducted proof-of-concept experiments.
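The idea of combining logged data with counterfactual annotations under a weighting scheme can be sketched as follows. This is a simplified illustration, not necessarily the authors' exact estimator. It assumes a single-step (bandit) setting, an annotation for every counterfactual action on every sample, and unbiased annotations. The weight table `W` and the helper names are hypothetical. Each sample splits unit weight across its factual action and its annotated counterfactual actions, and the importance ratio is taken against the resulting augmented behavior distribution. Under these assumptions the estimate is unbiased for the evaluation policy's value.

```python
def augmented_behavior(pi_b, W):
    """Distribution over actions after spreading each factual action's
    weight across actions according to W[a][a_prime] (rows sum to 1)."""
    return {ap: sum(pi_b[a] * W[a][ap] for a in pi_b) for ap in pi_b}


def augmented_is_estimate(samples, pi_b, pi_e, W):
    """IS estimate using factual rewards plus counterfactual annotations.

    samples: list of (action, reward, annotations), where annotations maps
             each counterfactual action to an annotated reward.
    W:       hypothetical weight table; W[a][a_prime] is the weight given to
             action a_prime when the factual action is a.
    """
    pi_b_plus = augmented_behavior(pi_b, W)
    total = 0.0
    for a, r, annotations in samples:
        for ap in pi_b:
            if W[a][ap] == 0.0:
                continue  # this action receives no weight for this sample
            y = r if ap == a else annotations[ap]
            total += W[a][ap] * pi_e[ap] / pi_b_plus[ap] * y
    return total / len(samples)


pi_b = {0: 0.9, 1: 0.1}
pi_e = {0: 0.1, 1: 0.9}

# Logged data that never contains action 1, with annotations for it.
samples = [(0, 0.0, {1: 1.0})] * 3

# All weight on the factual action recovers ordinary IS.
W_factual_only = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 1.0}}
# Equal split between the factual and the annotated counterfactual action.
W_equal = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}

est_is = augmented_is_estimate(samples, pi_b, pi_e, W_factual_only)
est_aug = augmented_is_estimate(samples, pi_b, pi_e, W_equal)
print(est_is, est_aug)
```

Here the true value of the evaluation policy is 0.9 (reward 1 for action 1, 0 for action 0). Ordinary IS returns 0 because action 1 never appears in the log, while the annotated estimate recovers 0.9. With noisy or biased annotations, the choice of weights would trade off the variance reduction against annotation error.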

Main Results:

  • The proposed method incorporates counterfactual annotations without introducing bias.
  • The approach demonstrated potential for reducing both bias and variance compared to standard IS estimators.
  • Experiments showed superior performance over purely offline IS estimators, even with imperfect annotations.

Conclusions:

  • The semi-offline framework enables safer and more reliable RL policy evaluation in critical applications.
  • Human-centered annotation design is crucial for effective implementation.
  • This work facilitates RL adoption in high-stakes domains by addressing evaluation challenges.