Consensus features nested cross-validation.

Saeid Parvandeh¹,², Hung-Wen Yeh³, Martin P Paulus⁴

¹Tandy School of Computer Science, University of Tulsa, Tulsa, OK, USA.

Bioinformatics (Oxford, England)

January 28, 2020

Summary

This summary is machine-generated.

Consensus nested cross-validation (cnCV) selects features that are stable across inner cross-validation folds instead of ranking them by inner-fold classification accuracy. The method achieves accuracy comparable to standard nested cross-validation, with shorter run times and fewer false-positive features.

Area of Science:

Machine learning
Computational biology
Bioinformatics

Background:

Feature selection can improve the accuracy of machine learning models, but it risks overfitting.
Nested cross-validation (nCV) and differential privacy are existing methods to mitigate overfitting.
nCV selects features based on inner-fold classification accuracy, while differential privacy adds noise to promote stable feature selection.
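
To make the accuracy-driven selection in standard nCV concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple stand-in feature scorer (absolute difference of class means) and a nearest-centroid classifier; the function names, the toy data generator, and the parameters `outer_k`, `inner_k`, and `top_k` are hypothetical and chosen only for illustration.

```python
import random
from statistics import mean

def make_toy_data(n_per_class=30, n_features=10, seed=1):
    """Toy two-class data (assumption): only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_features)]
            row[0] += 10 * label
            row[1] += 10 * label
            X.append(row)
            y.append(label)
    return X, y

def k_folds(rows, k, rng):
    """Shuffle row indices and split them into k roughly equal folds."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def top_features(X, y, rows, top_k):
    """Stand-in scorer: rank features by |mean(class 1) - mean(class 0)|."""
    scores = []
    for j in range(len(X[0])):
        pos = [X[i][j] for i in rows if y[i] == 1]
        neg = [X[i][j] for i in rows if y[i] == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:top_k]

def centroid_accuracy(X, y, train, test, features):
    """Accuracy of a nearest-centroid classifier restricted to `features`."""
    centroids = {}
    for label in (0, 1):
        members = [i for i in train if y[i] == label]
        centroids[label] = [mean(X[i][j] for i in members) for j in features]
    correct = 0
    for i in test:
        dist = {
            label: sum((X[i][j] - c) ** 2 for j, c in zip(features, centroids[label]))
            for label in (0, 1)
        }
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(test)

def nested_cv(X, y, outer_k=3, inner_k=3, top_k=4, seed=0):
    """For each outer fold, keep the inner-fold feature set with the best
    inner-test accuracy, then score that set on the outer test fold."""
    rng = random.Random(seed)
    rows = list(range(len(X)))
    results = []
    for outer_test in k_folds(rows, outer_k, rng):
        held = set(outer_test)
        outer_train = [i for i in rows if i not in held]
        best_acc, best_features = -1.0, None
        for inner_test in k_folds(outer_train, inner_k, rng):
            inner_held = set(inner_test)
            inner_train = [i for i in outer_train if i not in inner_held]
            feats = top_features(X, y, inner_train, top_k)
            acc = centroid_accuracy(X, y, inner_train, inner_test, feats)
            if acc > best_acc:
                best_acc, best_features = acc, feats
        outer_acc = centroid_accuracy(X, y, outer_train, outer_test, best_features)
        results.append((best_features, outer_acc))
    return results
```

Note that every inner fold trains and evaluates a classifier, which is the step a stability-based criterion can avoid.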

Purpose of the Study:

Introduce consensus nested cross-validation (cnCV), a novel method combining feature stability and nCV.
Evaluate cnCV's performance against standard nCV, Elastic Net, differential privacy, and private evaporative cooling (pEC).
Assess cnCV using simulated data and real RNA-seq data for major depressive disorder.

Main Methods:

Developed cnCV, which selects features by their consensus across inner folds (a stability criterion) rather than by inner-fold classification accuracy.
Compared cnCV with nCV, Elastic Net, differential privacy, and pEC on simulated and real datasets.
Analyzed classification accuracy, feature selection performance, and computational efficiency.
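
The consensus idea described in the methods above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published implementation: the feature scorer (absolute difference of class means) is a stand-in for whatever importance method is used in practice, consensus is taken here as a plain set intersection across inner folds and then across outer folds, and all function names and parameters (`outer_k`, `inner_k`, `top_k`) are hypothetical.

```python
import random
from statistics import mean

def make_toy_data(n_per_class=30, n_features=10, seed=1):
    """Toy two-class data (assumption): only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_features)]
            row[0] += 10 * label
            row[1] += 10 * label
            X.append(row)
            y.append(label)
    return X, y

def k_folds(rows, k, rng):
    """Shuffle row indices and split them into k roughly equal folds."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def top_features(X, y, rows, top_k):
    """Stand-in scorer: the top_k features by |mean(class 1) - mean(class 0)|."""
    scores = []
    for j in range(len(X[0])):
        pos = [X[i][j] for i in rows if y[i] == 1]
        neg = [X[i][j] for i in rows if y[i] == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:top_k])

def consensus_nested_cv(X, y, outer_k=3, inner_k=3, top_k=4, seed=0):
    """Return the features selected in every inner fold of every outer fold.
    No classifier is trained in the inner loop; only feature selection runs."""
    rng = random.Random(seed)
    rows = list(range(len(X)))
    outer_consensus = []
    for outer_test in k_folds(rows, outer_k, rng):
        held = set(outer_test)
        outer_train = [i for i in rows if i not in held]
        inner_sets = []
        for inner_test in k_folds(outer_train, inner_k, rng):
            inner_held = set(inner_test)
            inner_train = [i for i in outer_train if i not in inner_held]
            inner_sets.append(top_features(X, y, inner_train, top_k))
        # Consensus for this outer fold: features chosen in all inner folds.
        outer_consensus.append(set.intersection(*inner_sets))
    # Final consensus: features shared by all outer folds.
    return set.intersection(*outer_consensus)
```

Calling `consensus_nested_cv(*make_toy_data())` returns a set that contains the two informative features and at most `top_k` features in total. Because the inner loop skips classifier training, a sketch like this does less work per fold than an accuracy-driven inner loop, which is in line with the reduced run times reported for cnCV.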

Main Results:

cnCV achieved similar training and validation accuracy to nCV but with significantly reduced run times.
cnCV identified a more parsimonious set of features with fewer false positives compared to nCV.
cnCV demonstrated comparable accuracy to pEC and effectively selected stable features without requiring a privacy threshold.
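
On simulated data, where the truly informative features are known, claims about false positives can be checked with simple set arithmetic. The helper below is a hypothetical illustration, not taken from the study; its name and return format are assumptions.

```python
def selection_metrics(selected, truly_informative):
    """Compare a selected feature set against the known informative features."""
    selected = set(selected)
    truth = set(truly_informative)
    true_pos = len(selected & truth)    # informative features that were selected
    false_pos = len(selected - truth)   # selected features that carry no signal
    false_neg = len(truth - selected)   # informative features that were missed
    precision = true_pos / (true_pos + false_pos) if selected else 0.0
    recall = true_pos / (true_pos + false_neg) if truth else 0.0
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        "precision": precision,
        "recall": recall,
    }
```

For example, selecting features {0, 1, 5} when {0, 1, 2, 3} are informative gives two true positives, one false positive, a precision of 2/3, and a recall of 0.5. A more parsimonious selection with fewer false positives shows up as higher precision.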

Conclusions:

cnCV is an effective and efficient approach for integrating feature selection with classification.
The method offers a balance of accuracy, efficiency, and parsimonious feature selection.
cnCV provides a robust alternative for feature selection in machine learning, particularly in bioinformatics.