Is cross-validation valid for small-sample microarray classification?

Ulisses M Braga-Neto¹, Edward R Dougherty

¹Section of Clinical Cancer Genetics, University of Texas MD Anderson Cancer Center, Houston, TX, USA.

Bioinformatics (Oxford, England)

February 13, 2004

Summary

This summary is machine-generated.

For small samples in microarray classification, cross-validation error estimation is less biased than resubstitution but highly variable. Bootstrap methods offer lower variance but can increase bias and computational cost.

Area of Science:

Bioinformatics
Computational Biology
Statistical Learning

Background:

Microarray classification studies often rely on small samples for both classifier design and error estimation.
Cross-validation is the predominant method for error estimation in these studies.
Understanding cross-validation's behavior with limited data is crucial.

Purpose of the Study:

To compare the performance of cross-validation, resubstitution, and bootstrap error estimation methods.
To evaluate these methods under conditions of very small sample sizes.
To assess bias and precision of different error estimation techniques in microarray classification.

Main Methods:

An extensive simulation study was conducted.
The classification rules compared were linear discriminant analysis, 3-nearest-neighbor, and decision trees (CART).
Both synthetic data and real breast-cancer patient data were used.
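
To make the three families of error estimators concrete, here is a minimal Python sketch that applies resubstitution, leave-one-out cross-validation, and a bootstrap estimator to one small synthetic sample with a 3-nearest-neighbor rule. It is an illustration under stated assumptions, not the study's code: the summary does not say which cross-validation or bootstrap variants were used, so leave-one-out and the zero (out-of-bag) bootstrap are stand-ins, and the Gaussian data model, sample size, and all function names are hypothetical.

```python
import random

def knn_predict(train, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def resubstitution(sample, k=3):
    """Error of the classifier on the same points it was designed from."""
    return sum(knn_predict(sample, x, k) != y for x, y in sample) / len(sample)

def leave_one_out(sample, k=3):
    """Each point is classified by a rule designed from the remaining n - 1 points."""
    errors = 0
    for i, (x, y) in enumerate(sample):
        rest = sample[:i] + sample[i + 1:]
        errors += knn_predict(rest, x, k) != y
    return errors / len(sample)

def zero_bootstrap(sample, rng, replicates=50, k=3):
    """Design on a resample drawn with replacement, test on the points left out of it."""
    n = len(sample)
    errors = tested = 0
    for _ in range(replicates):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        train = [sample[i] for i in idx]
        for i in range(n):
            if i not in chosen:
                x, y = sample[i]
                errors += knn_predict(train, x, k) != y
                tested += 1
    return errors / tested if tested else float("nan")

def make_sample(n_per_class, rng, dim=2, shift=1.0):
    """Hypothetical data model: two unit-variance Gaussian classes whose means differ by `shift`."""
    return [([rng.gauss(label * shift, 1.0) for _ in range(dim)], label)
            for label in (0, 1) for _ in range(n_per_class)]

rng = random.Random(0)
sample = make_sample(10, rng)  # 20 points in total, a deliberately small sample
estimates = {
    "resubstitution": resubstitution(sample),
    "leave-one-out": leave_one_out(sample),
    "zero bootstrap": zero_bootstrap(sample, rng),
}
print(estimates)
```

Running this on different seeds gives a feel for how much each estimate moves from one small sample to the next, which is the behavior the study quantifies.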

Main Results:

Cross-validation shows lower bias than resubstitution but exhibits high variance, leading to unreliable estimates for small samples.
Bootstrap methods reduce variance but can increase bias and computational expense.
Resubstitution demonstrates the highest bias among the methods.
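
The bias and variance referred to above are properties of the deviation between an error estimate and the true error of the designed classifier, taken over many small samples. The sketch below shows one way to measure them by Monte Carlo simulation. It assumes a deliberately simple setting that is not taken from the study: one feature, two unit-variance Gaussian classes with equal priors, and a nearest-centroid rule, chosen because its true error has a closed form. All names and parameter values are hypothetical.

```python
import math
import random
import statistics

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def fit(xs0, xs1):
    """Nearest-centroid rule in 1D: predict class 1 when sign * (x - threshold) > 0."""
    m0, m1 = statistics.fmean(xs0), statistics.fmean(xs1)
    return (m0 + m1) / 2.0, (1.0 if m1 >= m0 else -1.0)

def predict(rule, x):
    threshold, sign = rule
    return 1 if sign * (x - threshold) > 0 else 0

def true_error(rule, shift):
    """Exact error when class 0 ~ N(0, 1) and class 1 ~ N(shift, 1), equal priors."""
    t, sign = rule
    miss0 = 1.0 - phi(t) if sign > 0 else phi(t)                   # class 0 labeled 1
    miss1 = phi(t - shift) if sign > 0 else 1.0 - phi(t - shift)   # class 1 labeled 0
    return 0.5 * (miss0 + miss1)

def resub(xs0, xs1):
    rule = fit(xs0, xs1)
    wrong = sum(predict(rule, x) != 0 for x in xs0) + sum(predict(rule, x) != 1 for x in xs1)
    return wrong / (len(xs0) + len(xs1))

def loo(xs0, xs1):
    wrong = 0
    for i, x in enumerate(xs0):
        wrong += predict(fit(xs0[:i] + xs0[i + 1:], xs1), x) != 0
    for i, x in enumerate(xs1):
        wrong += predict(fit(xs0, xs1[:i] + xs1[i + 1:]), x) != 1
    return wrong / (len(xs0) + len(xs1))

def deviation_stats(n_per_class=5, shift=1.0, trials=2000, seed=0):
    """Return {estimator: (bias, variance)} of (estimate - true error) over many samples."""
    rng = random.Random(seed)
    dev = {"resubstitution": [], "leave-one-out": []}
    for _ in range(trials):
        xs0 = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
        xs1 = [rng.gauss(shift, 1.0) for _ in range(n_per_class)]
        truth = true_error(fit(xs0, xs1), shift)
        dev["resubstitution"].append(resub(xs0, xs1) - truth)
        dev["leave-one-out"].append(loo(xs0, xs1) - truth)
    return {name: (statistics.fmean(d), statistics.pvariance(d)) for name, d in dev.items()}

for name, (bias, variance) in deviation_stats().items():
    print(f"{name}: bias={bias:+.4f} variance={variance:.4f}")
```

A negative bias means the estimator is optimistic on average. The variance column is the quantity the study emphasizes: an estimator can be nearly unbiased and still be unreliable on any single small sample. `n_per_class` must be at least 2 so that leave-one-out never empties a class.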

Conclusions:

Cross-validation's high variance limits its reliability for small sample sizes in microarray classification.
Bootstrap methods present a trade-off between variance reduction and potential bias increase.
Careful consideration of error estimation methods is necessary when dealing with limited data in bioinformatics.