Which is better: holdout or full-sample classifier design?

Marcel Brun¹, Qian Xu, Edward R Dougherty

  • ¹ Computational Biology Division, Translational Genomics Research Institute, Phoenix, AZ 85004, USA.

EURASIP Journal on Bioinformatics & Systems Biology | May 17, 2008
Summary
This summary is machine-generated.

Designing a classifier and estimating its error on the full sample consistently outperforms designing it on a training subset and estimating its error on a holdout test subset. Full-sample design yields both the better classifier and the smaller expected error bound.

Area of Science:

  • Machine Learning
  • Statistical Learning Theory
  • Data Science

Background:

  • Classifier design and error estimation are critical in machine learning.
  • The choice between using a full sample or a holdout test subset impacts both classifier performance and error estimation accuracy.

Purpose of the Study:

  • To compare the performance of full-sample classifier design versus holdout design for error estimation.
  • To determine which design strategy offers a smaller expected error bound under a conservative criterion.
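
The two strategies being compared can be illustrated with a small simulation. The sketch below is not the study's experiment: it assumes a hypothetical one-dimensional Gaussian model (class 0 ~ N(0, 1), class 1 ~ N(delta, 1), equal priors) and a nearest-mean rule, chosen only because the true error then has a closed form. The names `design`, `true_error`, and `simulate`, and all parameter values, are illustrative assumptions.

```python
import math
import random


def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def design(x0, x1):
    """Nearest-mean rule in 1-D.

    Returns (threshold, sign); a point x is labeled class 1
    when sign * (x - threshold) > 0.
    """
    m0 = sum(x0) / len(x0)
    m1 = sum(x1) / len(x1)
    return (m0 + m1) / 2.0, (1.0 if m1 >= m0 else -1.0)


def true_error(threshold, sign, delta):
    """Exact error under the assumed model with equal class priors."""
    p_above_0 = 1.0 - phi(threshold)          # P(x > t | class 0)
    p_above_1 = 1.0 - phi(threshold - delta)  # P(x > t | class 1)
    if sign > 0:  # class 1 is predicted above the threshold
        return 0.5 * p_above_0 + 0.5 * (1.0 - p_above_1)
    return 0.5 * (1.0 - p_above_0) + 0.5 * p_above_1


def simulate(n_per_class=10, n_train=5, delta=1.0, trials=2000, seed=0):
    """Average true error of full-sample and holdout designs over many samples."""
    rng = random.Random(seed)
    full_sum = hold_sum = est_sum = 0.0
    for _ in range(trials):
        x0 = [rng.gauss(0.0, 1.0) for _ in range(n_per_class)]
        x1 = [rng.gauss(delta, 1.0) for _ in range(n_per_class)]

        # Full-sample design: every point is used to build the classifier.
        t, s = design(x0, x1)
        full_sum += true_error(t, s, delta)

        # Holdout design: build on a training subset, test on the rest.
        t, s = design(x0[:n_train], x1[:n_train])
        hold_sum += true_error(t, s, delta)
        test0, test1 = x0[n_train:], x1[n_train:]
        wrong = sum(1 for x in test0 if s * (x - t) > 0)
        wrong += sum(1 for x in test1 if s * (x - t) <= 0)
        est_sum += wrong / (len(test0) + len(test1))

    return full_sum / trials, hold_sum / trials, est_sum / trials


if __name__ == "__main__":
    full, hold, est = simulate()
    print(f"mean true error, full-sample design: {full:.4f}")
    print(f"mean true error, holdout design:     {hold:.4f}")
    print(f"mean holdout error estimate:         {est:.4f}")
```

Under these assumptions the holdout classifier is built from fewer points, so its average true error is higher; what holdout offers in return is an error estimate computed on data the classifier never saw. The sketch does not evaluate any full-sample error estimator or the expected bound itself.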

Main Methods:

  • The study employed covariance models and a patient-data model for analysis.
  • A criterion based on achieving a classifier error below a given bound was used for comparison.
  • The expected bound was decomposed into expected true error and expected conditional standard deviation of true error.
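
The decomposition in the last point can be written out under one plausible reading. The summary does not give the exact form of the bound, so the definition below is an assumption: for an observed error estimate, the bound is taken to be the conditional mean of the true error plus one conditional standard deviation. The symbols (true error, its estimate, and the bound B) are likewise introduced here for illustration.

```latex
% Assumed form of the bound for a given error estimate \hat{\varepsilon}:
\begin{align*}
B(\hat{\varepsilon})
  &= E[\varepsilon \mid \hat{\varepsilon}]
   + \sigma[\varepsilon \mid \hat{\varepsilon}] \\
% Taking expectations over \hat{\varepsilon} and using
% E\big[E[\varepsilon \mid \hat{\varepsilon}]\big] = E[\varepsilon]:
E[B(\hat{\varepsilon})]
  &= E[\varepsilon]
   + E\big[\sigma[\varepsilon \mid \hat{\varepsilon}]\big]
\end{align*}
```

Read this way, the first term reflects how good the designed classifier is, and the second reflects how much uncertainty about the true error remains once the estimate is known.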

Main Results:

  • Full-sample design consistently outperformed holdout design in terms of the expected error bound.
  • Decomposing the expected bound into the expected true error and the expected conditional standard deviation of the true error revealed how the full-sample and holdout designs relate.

Conclusions:

  • Full-sample classifier design is superior to holdout design for achieving better classifiers and more reliable error bounds.
  • The findings support preferring full-sample design in machine learning applications, at least under the covariance and patient-data models studied.