Which is better: holdout or full-sample classifier design?

Marcel Brun¹, Qian Xu, Edward R Dougherty

¹Computational Biology Division, Translational Genomics Research Institute, Phoenix, AZ 85004, USA.

EURASIP Journal on Bioinformatics & Systems Biology

May 17, 2008

Summary

This summary is machine-generated.

Designing a classifier on the full sample, and estimating its error from that same sample, consistently outperforms designing on a training subset and estimating the error on a holdout test subset. Full-sample design yields the better classifier and the smaller expected error bound.

Area of Science:

Machine Learning
Statistical Learning Theory
Data Science

Background:

Classifier design and error estimation are critical in machine learning.
The choice between using a full sample or a holdout test subset impacts both classifier performance and error estimation accuracy.

Purpose of the Study:

To compare the performance of full-sample classifier design versus holdout design for error estimation.
To determine which design strategy offers a smaller expected error bound under a conservative criterion.

Main Methods:

The study employed covariance models and a patient-data model for analysis.
A criterion based on achieving a classifier error below a given bound was used for comparison.
The expected bound was decomposed into the sum of the expected true error and the expected conditional standard deviation of the true error.
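
The comparison described above can be illustrated with a small Monte Carlo sketch. Everything concrete here is an assumption chosen for illustration and is not taken from the study: a two-class Gaussian model with identity covariance stands in for the covariance models, a nearest-mean rule stands in for the classifier, leave-one-out stands in for a full-sample error estimator, and the conditional standard deviation is approximated by grouping replicates that share the same error estimate. The function names (`simulate`, `expected_bound_terms`, and so on) are hypothetical.

```python
import math
import numpy as np

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def design(X, y):
    """Nearest-mean rule: predict class 1 when w.x + b > 0."""
    m0, m1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
    w = m1 - m0
    return w, -w @ (m0 + m1) / 2.0

def true_error(w, b, mu0, mu1):
    """Exact error of the linear rule for N(mu, I) classes with equal priors."""
    norm = np.linalg.norm(w)
    if norm == 0.0:
        return 0.5
    return 0.5 * phi((w @ mu0 + b) / norm) + 0.5 * phi(-(w @ mu1 + b) / norm)

def error_rate(w, b, X, y):
    """Fraction of the points in (X, y) that the rule misclassifies."""
    return float(np.mean((X @ w + b > 0).astype(int) != y))

def loo_estimate(X, y):
    """Leave-one-out error estimate (assumed full-sample estimator)."""
    n, wrong = len(y), 0
    for i in range(n):
        keep = np.arange(n) != i
        w, b = design(X[keep], y[keep])
        wrong += int((X[i] @ w + b > 0) != (y[i] == 1))
    return wrong / n

def expected_bound_terms(true_errs, estimates):
    """Return (mean true error, mean SD of true error given the estimate)."""
    true_errs = np.asarray(true_errs)
    estimates = np.round(np.asarray(estimates), 6)
    cond_sd = 0.0
    for value in np.unique(estimates):
        hit = estimates == value
        # Weight each group's SD by how often that estimate value occurs.
        cond_sd += hit.mean() * true_errs[hit].std()
    return float(true_errs.mean()), float(cond_sd)

def simulate(reps=500, per_class=10, test_per_class=5, dim=5, shift=0.6, seed=0):
    """Compare both designs on the same simulated samples."""
    rng = np.random.default_rng(seed)
    mu0, mu1 = np.zeros(dim), np.full(dim, shift)
    y = np.repeat([0, 1], per_class)
    # Hold out the last `test_per_class` points of each class.
    test = np.tile(np.arange(per_class) >= per_class - test_per_class, 2)
    full, hold = ([], []), ([], [])
    for _ in range(reps):
        X = rng.normal(size=(2 * per_class, dim)) + np.where(y[:, None] == 1, mu1, mu0)
        w, b = design(X, y)                      # full-sample design
        full[0].append(true_error(w, b, mu0, mu1))
        full[1].append(loo_estimate(X, y))
        w, b = design(X[~test], y[~test])        # holdout design
        hold[0].append(true_error(w, b, mu0, mu1))
        hold[1].append(error_rate(w, b, X[test], y[test]))
    return {"full_sample": expected_bound_terms(*full),
            "holdout": expected_bound_terms(*hold)}

if __name__ == "__main__":
    for name, (mean_err, cond_sd) in simulate().items():
        print(f"{name}: expected true error={mean_err:.3f}, "
              f"expected conditional SD={cond_sd:.3f}, sum={mean_err + cond_sd:.3f}")
```

Adding the two returned terms gives a rough stand-in for the expected bound of each design under these toy assumptions. The sketch reuses the same simulated sample for both designs, so any difference between them comes from how the data are used and not from sampling noise between runs.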

Main Results:

Full-sample design consistently outperformed holdout design in terms of the expected error bound.
Decomposing the expected bound into the expected true error and the expected conditional standard deviation of the true error showed how the full-sample and holdout designs relate.

Conclusions:

Full-sample classifier design is superior to holdout design for achieving better classifiers and more reliable error bounds.
The findings provide a theoretical basis for preferring full-sample design in machine learning applications.