Variable selection in modelling clustered data via within-cluster resampling.

Shangyuan Ye¹, Tingting Yu², Daniel A Caroff³

¹Biostatistics Shared Resource, Knight Cancer Institute, Oregon Health & Science University, Oregon, U.S.A.

The Canadian Journal of Statistics = Revue Canadienne De Statistique

March 5, 2025

Summary

This summary is machine-generated.

This study introduces a variable selection method for high-dimensional clustered data, a setting that arises when building biomedical risk-adjustment models. The approach is designed to identify important risk factors and interactions in complex datasets.

Keywords:

Clustered data; stability selection; variable selection; within-cluster resampling

Area of Science:

Biostatistics
Health Services Research
Data Science

Background:

Risk-adjustment models are essential in biomedical applications but face challenges with clustered, high-dimensional data.
Existing variable selection methods are inadequate for discrete clustered data with numerous variables and large clusters.

Purpose of the Study:

To develop and evaluate a new variable selection approach for high-dimensional clustered data.
To address the lack of suitable methods for selecting variables in complex biomedical datasets.

Main Methods:

A novel approach combining within-cluster resampling with penalized likelihood methods was developed.
Theoretical properties, including an upper bound on false selections, were derived.
Extensive simulations were used to assess finite sample performance.
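
The general idea of pairing within-cluster resampling with a penalized likelihood can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each resample draws one observation per cluster, that the penalty is a plain lasso fitted by proximal gradient descent, and that selections are aggregated into frequencies in the spirit of stability selection. The function names, the simulated data, the penalty value `lam=0.1`, and the 0.6 frequency threshold are all hypothetical choices made for illustration.

```python
import math
import random


def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrink z toward zero by t."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_logistic(X, y, lam, n_iter=150, step=1.0):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is left unpenalized. Returns the slope coefficients.
    """
    n, p = len(X), len(X[0])
    beta, b0 = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad, g0 = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            eta = b0 + sum(b * v for b, v in zip(beta, xi))
            eta = max(-30.0, min(30.0, eta))  # guard against overflow in exp
            resid = 1.0 / (1.0 + math.exp(-eta)) - yi
            g0 += resid / n
            for j in range(p):
                grad[j] += resid * xi[j] / n
        b0 -= step * g0
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


def wcr_selection_frequency(clusters, lam, n_resamples=30, seed=0):
    """Fraction of within-cluster resamples in which each variable is selected.

    `clusters` is a list of clusters; each cluster is a list of (x, y) pairs.
    """
    rng = random.Random(seed)
    p = len(clusters[0][0][0])
    counts = [0] * p
    for _ in range(n_resamples):
        # Assumed resampling scheme: one observation drawn from every cluster,
        # so the resampled observations come from distinct clusters.
        draws = [rng.choice(cluster) for cluster in clusters]
        X = [x for x, _ in draws]
        y = [label for _, label in draws]
        for j, b in enumerate(lasso_logistic(X, y, lam)):
            if b != 0.0:
                counts[j] += 1
    return [c / n_resamples for c in counts]


def simulate(n_clusters=120, p=6, seed=1):
    """Hypothetical clustered binary data: only the first two variables matter."""
    rng = random.Random(seed)
    true_beta = [2.0, -2.0] + [0.0] * (p - 2)
    clusters = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, 0.5)  # shared random intercept correlates the cluster
        cluster = []
        for _ in range(rng.randint(2, 8)):
            x = [rng.gauss(0.0, 1.0) for _ in range(p)]
            eta = u + sum(b * v for b, v in zip(true_beta, x))
            label = 1 if rng.random() < 1.0 / (1.0 + math.exp(-eta)) else 0
            cluster.append((x, label))
        clusters.append(cluster)
    return clusters


clusters = simulate()
freq = wcr_selection_frequency(clusters, lam=0.1)
selected = [j for j, f in enumerate(freq) if f >= 0.6]  # illustrative threshold
print("selection frequencies:", [round(f, 2) for f in freq])
print("selected variables:", selected)
```

Because each resample contains a single observation per cluster, an ordinary penalized fit for independent data can be applied to it, and the within-cluster dependence is handled by averaging over resamples rather than by modelling it directly.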

Main Results:

The proposed method is shown to have oracle properties: asymptotically, it performs as well as if the truly relevant variables were known in advance.
Simulations supported the method's finite-sample performance.
The approach was successfully applied to a large colon surgical site infection dataset.

Conclusions:

The new variable selection technique is effective for high-dimensional clustered data in biomedical research.
This method enhances the development of accurate risk-adjustment models by accounting for complex data structures and interactions.