Prediction error estimation: a comparison of resampling methods.

Annette M Molinaro¹, Richard Simon, Ruth M Pfeiffer

  • ¹ Biostatistics Branch, Division of Cancer Epidemiology and Genetics, NCI, NIH, Rockville, MD 20852, USA. annette.molinaro@yale.edu

Bioinformatics (Oxford, England) | May 21, 2005

Summary
This summary is machine-generated.

Estimating prediction error in genomic studies with feature selection is challenging. Leave-one-out cross-validation (LOOCV) and 10-fold cross-validation (CV) offer the least biased estimates for small sample sizes.

Area of Science:

  • Genomics
  • Biostatistics
  • Machine Learning

Background:

  • Genomic studies generate thousands of features from limited samples.
  • Classifiers are built to predict outcomes from these features.
  • Accurate prediction error estimation is crucial, especially with feature selection.

Purpose of the Study:

  • To compare methods for estimating prediction error in genomic studies.
  • To evaluate the bias of different resampling techniques in the presence of feature selection.
  • To identify optimal methods for prediction assessment in small sample sizes.

Main Methods:

  • Comparison of prediction error estimation methods.
  • Evaluation of resubstitution, split-sample, leave-one-out cross-validation (LOOCV), k-fold cross-validation (CV), and .632+ bootstrap.
  • Analysis of bias and mean square error across different resampling techniques.
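
To make the comparison above concrete, the following is a minimal sketch, not the study's actual simulation. It contrasts the resubstitution estimate with 10-fold CV and LOOCV when feature selection is repeated inside every training fold. The pure-noise data, the t-like selection score, the nearest-centroid classifier, and all function names are assumptions chosen for illustration. Because the labels are unrelated to the features, the true error of any classifier here is 0.5.

```python
import numpy as np

def select_features(X, y, k):
    """Return indices of the k features with the largest absolute t-like score."""
    a, b = X[y == 0], X[y == 1]
    diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.argsort(np.abs(diff) / (se + 1e-12))[-k:]

def fit_predict(X_train, y_train, X_test, k):
    """Select features on the training data only, then classify by nearest centroid."""
    idx = select_features(X_train, y_train, k)
    c0 = X_train[y_train == 0][:, idx].mean(axis=0)
    c1 = X_train[y_train == 1][:, idx].mean(axis=0)
    d0 = ((X_test[:, idx] - c0) ** 2).sum(axis=1)
    d1 = ((X_test[:, idx] - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def resubstitution_error(X, y, k):
    """Train and test on the same samples (optimistically biased)."""
    return float(np.mean(fit_predict(X, y, X, k) != y))

def kfold_cv_error(X, y, k, n_folds, rng):
    """k-fold CV with feature selection redone in each fold; n_folds == len(y) gives LOOCV."""
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    wrong = 0
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        pred = fit_predict(X[train_mask], y[train_mask], X[test_idx], k)
        wrong += int(np.sum(pred != y[test_idx]))
    return wrong / len(y)

# Hypothetical small "genomic" study: 40 samples, 1000 noise features, 10 selected.
rng = np.random.default_rng(0)
n, p, k = 40, 1000, 10
X = rng.standard_normal((n, p))
y = np.repeat([0, 1], n // 2)

resub = resubstitution_error(X, y, k)
cv10 = kfold_cv_error(X, y, k, 10, rng)
loocv = kfold_cv_error(X, y, k, n, rng)
print(f"resubstitution: {resub:.2f}  10-fold CV: {cv10:.2f}  LOOCV: {loocv:.2f}")
```

In this sketch the resubstitution estimate should fall well below the true error of 0.5, while the cross-validated estimates should stay much closer to it. The key design choice is that `select_features` only ever sees the training portion of each fold.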

Main Results:

  • Resubstitution and simple split-sample estimates are biased in small genomic studies.
  • LOOCV, 10-fold CV, and .632+ bootstrap show the smallest bias for certain models.
  • LOOCV, 5- and 10-fold CV, and .632+ bootstrap yield the lowest mean square error.
  • The .632+ bootstrap is biased in small samples with high signal-to-noise ratios.
  • Method performance differences decrease with increasing sample size.

Conclusions:

  • LOOCV and k-fold CV are recommended for accurate prediction error estimation in small genomic studies.
  • The choice of method impacts the reliability of prediction error estimates.
  • Increasing sample size reduces the differences in performance among resampling methods.
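
For readers unfamiliar with the .632+ bootstrap named above, the estimator is usually written as follows, in the formulation introduced by Efron and Tibshirani. The notation is an assumption made here for illustration and is not taken from this summary: \overline{\mathrm{err}} is the resubstitution error, \widehat{\mathrm{Err}}^{(1)} is the leave-one-out bootstrap error (the average error on samples left out of each bootstrap draw), \hat{\gamma} is the no-information error rate, and \hat{R} is the relative overfitting rate.

```latex
\begin{align*}
\widehat{\mathrm{Err}}^{(.632+)} &= (1 - \hat{w})\,\overline{\mathrm{err}} + \hat{w}\,\widehat{\mathrm{Err}}^{(1)}, \\
\hat{w} &= \frac{0.632}{1 - 0.368\,\hat{R}}, \\
\hat{R} &= \frac{\widehat{\mathrm{Err}}^{(1)} - \overline{\mathrm{err}}}{\hat{\gamma} - \overline{\mathrm{err}}}.
\end{align*}
```

When \hat{R} = 0 (no overfitting) the weight is \hat{w} = 0.632 and the expression reduces to the ordinary .632 estimator. When \hat{R} = 1 the weight is \hat{w} = 1 and the estimate equals \widehat{\mathrm{Err}}^{(1)}. The truncation rules that keep \hat{R} within [0, 1] are omitted from this sketch.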