Inference from Nonrandom Samples Using Bayesian Machine Learning.

Yutao Liu1, Andrew Gelman2, Qixuan Chen3

  • 1 Yutao Liu is a Senior Biostatistician II at Vertex Pharmaceuticals, Boston, USA, and was a PhD student in the Department of Biostatistics at Columbia University, New York, NY, USA.

Journal of Survey Statistics and Methodology | April 11, 2023
Summary
This summary is machine-generated.

This study introduces a regularized prediction method for valid inference from nonrandom samples using high-dimensional auxiliary data. The approach ensures accurate population mean estimation and reliable uncertainty quantification in data-rich settings.

Keywords:
Bayesian machine learning; High-dimensional auxiliary variables; Nonrandom samples; Probability and nonprobability surveys; Propensity score; Soft Bayesian additive regression trees

Area of Science:

  • Statistics
  • Machine Learning
  • Survey Methodology

Background:

  • Inference from nonrandom samples is challenging, especially with high-dimensional auxiliary data.
  • Traditional methods may struggle with complex data structures and potential biases.
  • Survey inference is a key area where nonrandom sampling is common.

Purpose of the Study:

  • To develop a robust statistical method for valid inference from nonrandom samples.
  • To leverage high-dimensional auxiliary information for improved population estimation.
  • To provide a framework for quantifying uncertainty in these settings.

Main Methods:

  • Proposed a regularized prediction approach using machine learning models.
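  • Incorporated a large number of auxiliary variables so that ignorability is a reasonable assumption, that is, so that selection into the sample can be treated as independent of the outcome once the auxiliary variables are accounted for.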
  • Extended the method by including propensity scores as predictors in Bayesian additive regression trees.
  • Utilized a Bayesian framework for uncertainty quantification.
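
The prediction idea behind these methods can be illustrated with a minimal sketch. It is not the authors' implementation: a one-variable ridge regression stands in for soft Bayesian additive regression trees, no propensity score is included as a predictor, and the synthetic population, the function names, and the penalty value are all assumptions made for illustration. The sketch fits an outcome model on the nonrandom sample, predicts the outcome for every unsampled unit from its auxiliary variable, and combines observed and predicted values into a population mean estimate.

```python
import math
import random


def simulate_population(n=20000, seed=0):
    """Return (x, y, selected) triples for a synthetic finite population."""
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                     # auxiliary variable, known for every unit
        y = 2.0 + 1.5 * x + rng.gauss(0.0, 1.0)     # outcome, observed only for sampled units
        p_select = 1.0 / (1.0 + math.exp(2.0 - x))  # selection is more likely when x is large
        population.append((x, y, rng.random() < p_select))
    return population


def fit_ridge(xs, ys, penalty=1.0):
    """One-variable ridge regression (a stand-in for BART); returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / (sxx + penalty)  # the penalty shrinks the slope toward zero
    return mean_y - slope * mean_x, slope


def naive_sample_mean(population):
    """Mean of the outcome among selected units, ignoring the selection mechanism."""
    ys = [y for _, y, selected in population if selected]
    return sum(ys) / len(ys)


def prediction_estimate(population, penalty=1.0):
    """Observed outcomes for sampled units plus model predictions for the rest."""
    sample = [(x, y) for x, y, selected in population if selected]
    intercept, slope = fit_ridge([x for x, _ in sample], [y for _, y in sample], penalty)
    observed_total = sum(y for _, y in sample)
    predicted_total = sum(
        intercept + slope * x for x, _, selected in population if not selected
    )
    return (observed_total + predicted_total) / len(population)


if __name__ == "__main__":
    pop = simulate_population()
    print("true mean:       ", sum(y for _, y, _ in pop) / len(pop))
    print("naive mean:      ", naive_sample_mean(pop))
    print("prediction-based:", prediction_estimate(pop))
```

In this toy setting selection depends only on the auxiliary variable, so ignorability holds by construction and the prediction-based estimate should land much closer to the true mean than the naive sample mean. With real data that assumption cannot be verified, which is the motivation for using many auxiliary variables.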

Main Results:

  • Simulation studies demonstrated valid inference for population means.
  • Achieved coverage rates close to nominal levels in simulations.
  • The regularized prediction approach using soft Bayesian additive regression trees proved effective.
  • Successfully applied the methods to real-world survey and epidemiologic data.
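
As a rough illustration of what "coverage close to nominal" means, the sketch below repeats a simple experiment many times and counts how often a 95% interval contains the true mean. It assumes simple random samples from a normal distribution and a normal-approximation interval, not the posterior intervals or the nonrandom-sample designs evaluated in the study; the function names and settings are hypothetical.

```python
import math
import random
import statistics


def interval_covers(sample, truth, z=1.96):
    """Check whether the normal-approximation interval for the mean contains truth."""
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean - z * std_error <= truth <= mean + z * std_error


def coverage_rate(n_reps=1000, n=100, truth=5.0, seed=0):
    """Fraction of simulated data sets whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        sample = [rng.gauss(truth, 2.0) for _ in range(n)]
        hits += interval_covers(sample, truth)
    return hits / n_reps


if __name__ == "__main__":
    print("empirical coverage of nominal 95% intervals:", coverage_rate())
```

A method with valid uncertainty quantification should give an empirical rate near the nominal 0.95. Rates well below that indicate intervals that are too narrow or estimates that are biased.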

Conclusions:

  • The proposed regularized prediction method offers a powerful tool for inference from nonrandom samples.
  • Effective utilization of high-dimensional auxiliary data enhances the accuracy and reliability of population estimates.
  • The approach is applicable to diverse fields, including survey research and epidemiology.