Fast leave-one-cluster-out cross-validation using clustered network information criterion.

Jiaxing Qiu¹,², Douglas E Lake², Pavel Chernyavskiy²

¹School of Data Science, School of Medicine, University of Virginia, Charlottesville, VA, USA.

Statistical Methods in Medical Research

June 19, 2025

Summary

This summary is machine-generated.

A new clustered estimator of the network information criterion (CNIC) accurately assesses prediction model generalizability for clustered data. CNIC is a faster, more reliable alternative to cluster-based cross-validation, especially with strong clustering.

Keywords:

Fisher information matrix; predictive modeling; cluster-based cross-validation; clustered data; network information criterion

Area of Science:

Statistics
Machine Learning
Biostatistics

Background:

Prediction models fitted to clustered data require cluster-based validation to assess how well they generalize to new clusters.
Existing criteria such as the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) may not adequately account for cluster heterogeneity.
Leave-one-cluster-out cross-validation is a robust but computationally intensive validation technique.
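
To make the computational cost concrete, the following is a minimal sketch of leave-one-cluster-out deviance for an ordinary least-squares model with a Gaussian response. It is an illustration only, not the authors' implementation: the function name `loco_deviance`, the use of NumPy, and the choice to plug in the training-set variance estimate are all assumptions made for this example. Note that the model is refitted once per cluster, which is what makes the procedure expensive when there are many clusters.

```python
import numpy as np


def loco_deviance(X, y, clusters):
    """Leave-one-cluster-out Gaussian deviance for a least-squares fit.

    Hypothetical helper for illustration. For each cluster, the model is
    refitted on all other clusters and the held-out cluster is scored with
    -2 * Gaussian log-likelihood, using the training-set variance estimate.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)

    total = 0.0
    for label in np.unique(clusters):
        test = clusters == label
        train = ~test

        # Refit without the held-out cluster (one fit per cluster).
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        resid_train = y[train] - X[train] @ beta
        sigma2 = np.mean(resid_train ** 2)  # maximum-likelihood variance

        # Deviance contribution of the held-out cluster.
        resid_test = y[test] - X[test] @ beta
        total += np.sum(np.log(2 * np.pi * sigma2) + resid_test ** 2 / sigma2)
    return float(total)


if __name__ == "__main__":
    # Simulated data (an assumption for the demo): 5 clusters of 10 points.
    rng = np.random.default_rng(0)
    cluster_ids = np.repeat(np.arange(5), 10)
    x = rng.normal(size=50)
    design = np.column_stack([np.ones(50), x])
    response = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=50)
    print(loco_deviance(design, response, cluster_ids))
```

Lower values indicate better out-of-cluster prediction, so candidate models can be compared by calling the function on each design matrix.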

Purpose of the Study:

Introduce a clustered estimator of the network information criterion (CNIC) as a fast approximation to leave-one-cluster-out deviance.
Develop a method to assess model generalizability for prediction models with clustered data.
Provide a more accurate model selection criterion for clustered data compared to AIC and BIC.
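
For reference, the quantities being compared can be written in standard textbook form. The notation here is an assumption of this note rather than the paper's own: \ell is the log-likelihood, \hat\theta the maximum-likelihood estimate, p the number of parameters, n the number of observations, G the number of clusters, and \hat\theta_{(-g)} the estimate obtained with cluster g held out.

```latex
\begin{align*}
\mathrm{AIC} &= -2\,\ell(\hat\theta) + 2p, \\
\mathrm{BIC} &= -2\,\ell(\hat\theta) + p \log n, \\
D_{\mathrm{LOCO}} &= -2 \sum_{g=1}^{G} \sum_{i \in g}
  \log f\!\left(y_i \mid x_i, \hat\theta_{(-g)}\right).
\end{align*}
```

AIC and BIC penalize only the parameter count (and, for BIC, the sample size), whereas the leave-one-cluster-out deviance requires G separate refits; the criterion summarized here is described as a fast approximation to the latter.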

Main Methods:

Derived a clustered network information criterion by modifying the standard network information criterion with a clustering-adjusted Fisher information matrix.
Applied the CNIC to standard regression models with Gaussian or binomial responses for clustered data.
Evaluated CNIC performance using simulation studies and an empirical example, comparing it to cluster-based cross-validation, AIC, and BIC.
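
As a rough illustration of the idea, the sketch below computes a network-information-criterion style score, deviance plus 2·tr(J⁻¹K), for a Gaussian linear model, with an optional clustered variant in which the score contributions are summed within each cluster before forming K. This is one plausible reading of a "clustering-adjusted" information matrix, not the paper's exact estimator: the function name `nic_gaussian`, the sandwich-style construction, and the simplification of treating only the regression coefficients as parameters (with the variance plugged in) are all assumptions.

```python
import numpy as np


def nic_gaussian(X, y, clusters=None):
    """NIC-style criterion for a Gaussian linear model (illustrative sketch).

    Returns (criterion, penalty) where penalty = 2 * trace(J^{-1} K).
    J is the observed information for the coefficients and K is the sum of
    outer products of score contributions. If `clusters` is given, scores
    are first summed within each cluster (an assumed clustering adjustment).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = np.mean(resid ** 2)  # plug-in ML variance (simplification)

    # -2 * log-likelihood evaluated at the ML estimates.
    deviance = n * (np.log(2 * np.pi * sigma2) + 1.0)

    # Per-observation score with respect to the coefficients.
    scores = X * (resid / sigma2)[:, None]
    if clusters is not None:
        clusters = np.asarray(clusters)
        scores = np.vstack(
            [scores[clusters == label].sum(axis=0)
             for label in np.unique(clusters)]
        )

    J = X.T @ X / sigma2
    K = scores.T @ scores
    penalty = 2.0 * np.trace(np.linalg.solve(J, K))
    return float(deviance + penalty), float(penalty)


if __name__ == "__main__":
    # Simulated clustered data (an assumption for the demo):
    # 20 clusters of 10 observations with strong random intercepts.
    rng = np.random.default_rng(1)
    ids = np.repeat(np.arange(20), 10)
    x = rng.normal(size=200)
    u = rng.normal(scale=2.0, size=20)[ids]
    y = 1.0 + 0.5 * x + u + rng.normal(scale=0.5, size=200)
    design = np.column_stack([np.ones(200), x])
    print("independent:", nic_gaussian(design, y))
    print("clustered:  ", nic_gaussian(design, y, ids))
```

In this sketch the clustered penalty grows when residuals within a cluster move together, because their score contributions no longer cancel when summed; only a single model fit is needed, in contrast to refitting once per cluster.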

Main Results:

The clustered network information criterion (CNIC) provides a more accurate approximation to leave-one-cluster-out deviance than AIC and BIC.
CNIC yields more accurate selection of model size and variables, particularly when the data exhibit strong clustering.
CNIC imposes a greater penalty for stronger clustering, effectively preventing over-parameterization.

Conclusions:

CNIC is a computationally efficient and accurate tool for model selection and validation in prediction models with clustered data.
CNIC offers superior performance over traditional criteria like AIC and BIC when dealing with cluster heterogeneity.
The proposed method enhances the reliability of prediction models developed on clustered datasets.