Inference for Batched Bandits.

Kelly W Zhang (1), Lucas Janson (2), Susan A Murphy (3)

  • (1) Department of Computer Science, Harvard University.

Advances in Neural Information Processing Systems | January 10, 2022
Summary
This summary is machine-generated.

This study addresses how to draw valid statistical conclusions from data collected by bandit algorithms, which are widely used in science and industry. The authors show that standard tools such as the ordinary least squares (OLS) estimator are not asymptotically normal on adaptively collected data when there is no unique optimal arm, so the usual confidence intervals and tests become unreliable. To solve this, they introduce the Batched OLS estimator, which is asymptotically normal and therefore supports valid confidence intervals and Type-1 error control.

Keywords:
Adaptive Sampling, Linear Regression, Confidence Intervals, Decision Theory

Area of Science:

  • Statistical inference within Batched Bandits research
  • Computational learning theory and decision science

Background:

Bandit algorithms collect data adaptively: the action chosen at each step depends on earlier observations. Standard estimation methods assume independent and identically distributed samples, and adaptive collection violates that independence. Researchers nevertheless often fit simple linear models to such data. It was unresolved how standard estimators behave on sequentially collected observations, which raised concerns about the validity of the resulting error estimates. The existing literature also lacked a robust framework for inference on batch-collected bandit data. This gap motivated a closer examination of how adaptive sampling affects the distribution of estimators.
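
The dependence described above can be made concrete with a small simulation. The sketch below is illustrative only: the epsilon-greedy rule, the unit-variance Gaussian rewards, and the name `simulate_batched_eps_greedy` are assumptions introduced here, not details taken from the paper. It shows that the share of arm 1 in each later batch is determined by the rewards seen in earlier batches, so the pooled sample is not i.i.d.

```python
import numpy as np

def simulate_batched_eps_greedy(n, T, means, eps, rng):
    """Simulate a two-arm batched bandit (illustrative assumption, not the paper's setup).

    Returns integer actions and float rewards, each of shape (T, n).
    """
    means = np.asarray(means, dtype=float)
    actions = np.zeros((T, n), dtype=int)
    rewards = np.zeros((T, n))
    p_arm1 = 0.5  # the first batch is assigned uniformly at random
    for t in range(T):
        actions[t] = rng.random(n) < p_arm1
        rewards[t] = rng.normal(means[actions[t]], 1.0)
        # The next batch's sampling probability is a function of all data so far,
        # so observations in different batches are not independent.
        a, r = actions[: t + 1].ravel(), rewards[: t + 1].ravel()
        arm1_leads = r[a == 1].mean() > r[a == 0].mean()
        p_arm1 = 1 - eps / 2 if arm1_leads else eps / 2
    return actions, rewards

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    actions, _ = simulate_batched_eps_greedy(
        n=1000, T=3, means=[0.0, 0.0], eps=0.2, rng=rng
    )
    print("share of arm 1 per batch:", actions.mean(axis=1))
```

With equal arm means (no unique optimal arm), which arm is favored after the first batch is decided by noise alone, which is the situation the paper identifies as problematic for OLS.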

Purpose Of The Study:

The work aims to develop reliable inference methods for data collected with bandit algorithms, motivated by the growing need for statistical validity in industrial and scientific applications. The authors examine where standard estimation techniques break down on adaptively collected data, in particular the asymptotic non-normality of the OLS estimator, which inflates error rates when normality is naively assumed. They seek a method that works in both multi-arm and contextual bandit settings and that remains valid under non-stationarity in the baseline reward, giving a foundation for more accurate decision analysis in sequential environments.

Main Methods:

The approach is a formal mathematical analysis of estimator behavior under adaptive sampling. The authors first study the asymptotic properties of the OLS estimator on data from sequential decision algorithms. They then construct a new estimator, Batched OLS, designed specifically for data collected in batches, and prove that it is asymptotically normal for both multi-arm and contextual bandits. They compare the proposed method against standard OLS-based inference and show that it is robust to non-stationarity in the baseline reward.
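
A minimal sketch of a batch-wise statistic in the spirit of Batched OLS, for the two-arm case, may help fix ideas. It computes the difference in arm means within each batch, standardizes each batch separately, and then averages the standardized values. The specific weighting `sqrt(N0 * N1 / n)`, the known noise scale `sigma`, and the function name `bols_z_statistic` are assumptions made for this illustration; the summary above does not spell out the estimator's formula.

```python
import numpy as np

def bols_z_statistic(actions, rewards, delta0=0.0, sigma=1.0):
    """Aggregate per-batch Z-statistics for H0: (mean of arm 1) - (mean of arm 0) = delta0.

    actions, rewards: arrays of shape (T, n), one row per batch.
    sigma: noise standard deviation, assumed known here for simplicity.
    """
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)
    T, n = actions.shape
    z = np.empty(T)
    for t in range(T):
        a, r = actions[t], rewards[t]
        n1 = int(a.sum())
        n0 = n - n1
        if n0 == 0 or n1 == 0:
            raise ValueError(f"batch {t} does not contain both arms")
        # Difference in means using only this batch's data.
        delta_t = r[a == 1].mean() - r[a == 0].mean()
        # Standardize with this batch's own arm counts.
        z[t] = np.sqrt(n0 * n1 / n) * (delta_t - delta0) / sigma
    # The average of T standardized terms is rescaled to unit variance.
    return z.sum() / np.sqrt(T)
```

The design choice worth noting is that each batch is normalized by its own arm counts, which are fixed once the batch starts, rather than by pooled counts that depend on the whole adaptive history.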

Main Results:

The OLS estimator is not asymptotically normal on bandit data when no unique optimal arm exists. Treating it as normal in that case inflates Type-1 error and yields unreliable confidence intervals. The Batched OLS estimator, by contrast, is asymptotically normal for both multi-arm and contextual bandit data, and this holds even when the baseline reward is non-stationary. The proposed method gives better coverage probabilities than the standard approach. Together, these results quantify the risk of naive statistical assumptions in sequential decision-making and establish a formal framework for reliable inference on bandit data.
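
A reader who wants to probe these claims empirically could run a small Monte Carlo experiment under the null of equal arm means. The sketch below is not the paper's experiment: the epsilon-greedy rule, Gaussian rewards with known unit variance, the batch-wise statistic, and all function names are assumptions chosen for illustration. It returns the fraction of simulated trials in which a nominal 5% two-sided test rejects, once using a pooled difference in means treated as normal and once using the batch-wise statistic.

```python
import numpy as np

def simulate(n, T, eps, rng):
    """Two-arm batched epsilon-greedy with equal arm means (the null hypothesis)."""
    actions = np.zeros((T, n), dtype=int)
    rewards = np.zeros((T, n))
    p1 = 0.5
    for t in range(T):
        actions[t] = rng.random(n) < p1
        rewards[t] = rng.normal(0.0, 1.0, size=n)
        a, r = actions[: t + 1].ravel(), rewards[: t + 1].ravel()
        p1 = 1 - eps / 2 if r[a == 1].mean() > r[a == 0].mean() else eps / 2
    return actions, rewards

def pooled_z(actions, rewards):
    """Pooled difference in means, standardized as if the data were i.i.d."""
    a, r = actions.ravel(), rewards.ravel()
    n1 = a.sum()
    n0 = a.size - n1
    return (r[a == 1].mean() - r[a == 0].mean()) / np.sqrt(1 / n0 + 1 / n1)

def batched_z(actions, rewards):
    """Per-batch standardized differences, averaged across batches."""
    T, n = actions.shape
    total = 0.0
    for a, r in zip(actions, rewards):
        n1 = a.sum()
        n0 = n - n1
        total += np.sqrt(n0 * n1 / n) * (r[a == 1].mean() - r[a == 0].mean())
    return total / np.sqrt(T)

def rejection_rates(reps=2000, n=100, T=3, eps=0.4, seed=0, crit=1.96):
    """Return (pooled, batched) rejection rates of a nominal 5% two-sided test."""
    rng = np.random.default_rng(seed)
    pooled = batched = 0
    for _ in range(reps):
        a, r = simulate(n, T, eps, rng)
        pooled += abs(pooled_z(a, r)) > crit
        batched += abs(batched_z(a, r)) > crit
    return pooled / reps, batched / reps

if __name__ == "__main__":
    print("rejection rates (pooled, batched):", rejection_rates())
```

In this simplified Gaussian setting with known variance, each per-batch term is exactly standard normal given the history, so the batched rate should sit near 0.05. How far the pooled rate departs from 0.05 depends on the algorithm and the settings chosen.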

Conclusions:

The authors demonstrate that standard linear estimators need not be asymptotically normal under adaptive sampling. As a consequence, naive statistical approaches can produce misleading confidence intervals in bandit settings. They propose the Batched OLS estimator as a reliable alternative for multi-arm and contextual bandits. The new method remains asymptotically normal even when the baseline reward changes over time. The findings suggest that practitioners should adopt such adjusted estimators to avoid inflated error rates, and they clarify the limits of applying classical regression techniques to sequentially collected data. Valid inference is achievable through estimation procedures designed around the batch structure.
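
For practitioners, a test statistic is often most useful once inverted into a confidence interval. The sketch below inverts a batch-wise Z-statistic for the two-arm case. It assumes the difference between arm means is the same in every batch (the baseline may still drift), that the noise scale `sigma` is known, and that the per-batch weight is `sqrt(N0 * N1 / n)`. These assumptions, the algebra, and the name `bols_confidence_interval` are introduced here for illustration and are not taken from the paper.

```python
import numpy as np
from statistics import NormalDist

def bols_confidence_interval(actions, rewards, sigma=1.0, level=0.95):
    """Interval for the arm-mean difference from per-batch differences.

    Solves |sum_t w_t * (delta_t - delta) / sigma| / sqrt(T) <= z for delta,
    where w_t = sqrt(N0 * N1 / n) and delta_t is the batch-t difference in means.
    """
    actions = np.asarray(actions)
    rewards = np.asarray(rewards, dtype=float)
    T, n = actions.shape
    weights = np.empty(T)
    deltas = np.empty(T)
    for t in range(T):
        a, r = actions[t], rewards[t]
        n1 = int(a.sum())
        n0 = n - n1
        if n0 == 0 or n1 == 0:
            raise ValueError(f"batch {t} does not contain both arms")
        weights[t] = np.sqrt(n0 * n1 / n)
        deltas[t] = r[a == 1].mean() - r[a == 0].mean()
    # Weighted average of the per-batch differences is the interval's center.
    center = np.sum(weights * deltas) / np.sum(weights)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    half_width = z * sigma * np.sqrt(T) / np.sum(weights)
    return center - half_width, center + half_width
```

Because each batch contributes only a within-batch difference, any baseline shift common to both arms in a batch cancels, which is one way to see why a batch-wise approach can tolerate a drifting baseline.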

The researchers propose the Batched OLS estimator, which maintains asymptotic normality. In contrast, the standard ordinary least squares estimator exhibits asymptotic non-normality when no unique optimal arm exists, leading to inaccurate confidence intervals.

The authors utilize the Batched OLS estimator to handle data collected from multi-arm and contextual bandits. This tool specifically addresses the non-normality issues encountered when using traditional linear regression on adaptively sampled information.

The authors prove that the standard ordinary least squares estimator is not asymptotically normal when there is no unique optimal arm. The failure arises because adaptive bandit algorithms violate the independence assumptions that classical normality results require.

The researchers exploit the batched structure of the data to ensure statistical validity. Working batch by batch stabilizes the estimation process, allowing the Batched OLS estimator to remain robust even when baseline rewards exhibit non-stationarity.

The authors analyze whether each estimator is asymptotically normal. They find that while the standard approach suffers from Type-1 error inflation, the Batched OLS method provides better coverage probabilities in multi-arm and contextual bandit scenarios.

The researchers claim that their new estimator is robust to non-stationarity in baseline rewards. They imply that this property makes the method suitable for real-world applications where reward distributions may shift over time.