
Adapting prediction error estimates for biased complexity selection in high-dimensional bootstrap samples.

Harald Binder¹, Martin Schumacher

  • ¹Institute of Medical Biometry and Medical Informatics, University Medical Center Freiburg. binderh@fdm.uni-freiburg.de

Statistical Applications in Genetics and Molecular Biology | April 4, 2008
Summary
This summary is machine-generated.

Bootstrap estimates of prediction performance can be biased for high-dimensional data, because model complexity is selected within each bootstrap sample. Drawing the samples without replacement reduces this bias in many settings and improves the resulting prediction error estimates.

Area of Science:

  • Statistics
  • Bioinformatics
  • Machine Learning

Background:

  • The bootstrap is a resampling technique for evaluating prediction performance without data splitting.
  • High-dimensional data, such as those from microarrays, often come with a limited number of observations, which makes setting data aside for validation costly.
  • Accurate prediction performance evaluation is crucial for reliable statistical modeling.

Purpose of the Study:

  • To investigate the bias in complexity selection within conventional bootstrap samples for high-dimensional data.
  • To evaluate remedies for complexity selection bias, focusing on sampling without replacement.
  • To assess prediction error estimates for high-dimensional binary and time-to-event data.

Main Methods:

  • Simulation studies to assess bias in complexity selection using gradient boosting algorithms.

  • Comparison of conventional bootstrap (sampling with replacement) against sampling without replacement.
  • Application of a modified bootstrap procedure (.632+) to microarray data for performance evaluation.
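The two resampling schemes compared above can be illustrated with a minimal sketch. The function names are hypothetical, and the subsample fraction of 0.632 is an assumption of this sketch (it matches the expected share of distinct observations in a conventional bootstrap sample), not a detail stated in this summary.

```python
import random


def draw_with_replacement(n, rng):
    """Conventional bootstrap: n indices drawn with replacement."""
    return [rng.randrange(n) for _ in range(n)]


def draw_without_replacement(n, rng, fraction=0.632):
    """Subsample: round(fraction * n) distinct indices (fraction is assumed)."""
    m = round(fraction * n)
    return rng.sample(range(n), m)


def out_of_bag(n, in_bag):
    """Indices that do not appear in the resample; usable as test observations."""
    chosen = set(in_bag)
    return [i for i in range(n) if i not in chosen]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 200

boot = draw_with_replacement(n, rng)
sub = draw_without_replacement(n, rng)

# A bootstrap sample has n entries but contains duplicates,
# whereas the subsample contains each observation at most once.
print("bootstrap: size", len(boot), "distinct", len(set(boot)))
print("subsample: size", len(sub), "distinct", len(set(sub)))
print("out-of-bag for subsample:", len(out_of_bag(n, sub)))
```

In either scheme, the model (including its complexity selection) would be fitted on the drawn indices and evaluated on the out-of-bag indices. The only difference is whether the training sample contains duplicated observations.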

Main Results:

  • Complexity selection in conventional bootstrap samples is severely biased in many high-dimensional scenarios.
  • The resulting bias in prediction error estimates often understates how much information can be extracted from the data.
  • Sampling without replacement effectively mitigates complexity selection bias in various settings.

Conclusions:

  • Complexity selection bias in bootstrap samples significantly affects prediction error estimation for high-dimensional data.
  • Sampling without replacement is a viable and effective strategy to correct this bias.
  • The modified bootstrap procedure demonstrates utility in real-world high-dimensional data analysis, such as in cancer genomics.
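
For readers unfamiliar with the .632+ estimate mentioned under Main Methods, the following is a minimal sketch of the standard .632+ weighting, which combines the apparent (training) error with the out-of-bag error. The function name and argument names are hypothetical, and the clipping of edge cases is a simplifying assumption of this sketch, not necessarily what the authors implemented.

```python
def err_632_plus(err_apparent, err_oob, err_noinfo):
    """Combine apparent and out-of-bag error with the .632+ weighting.

    err_apparent: error of the model on its own training data
    err_oob:      average error on observations left out of the resamples
    err_noinfo:   error expected if predictors carried no information
    """
    # Assumption of this sketch: cap the out-of-bag error at the
    # no-information error so the overfitting rate stays in [0, 1].
    err_oob = min(err_oob, err_noinfo)

    # Relative overfitting rate: 0 means no overfitting, 1 means the
    # out-of-bag error is as bad as the no-information error.
    if err_oob > err_apparent and err_noinfo > err_apparent:
        r = (err_oob - err_apparent) / (err_noinfo - err_apparent)
    else:
        r = 0.0

    # The weight grows from 0.632 (no overfitting) towards 1 (full overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_apparent + w * err_oob


# Illustrative inputs only; these numbers are not taken from the study.
estimate = err_632_plus(err_apparent=0.1, err_oob=0.2, err_noinfo=0.3)
print(round(estimate, 4))
```

Because the out-of-bag error depends on the complexity selected in each resample, a biased complexity selection feeds directly into this estimate, which is the mechanism the summary describes.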