Molecular diagnosis. Classification, model selection and performance evaluation.

F Markowetz¹, R Spang

  • ¹ Max Planck Institute for Molecular Genetics, Computational Diagnostics Group, Ihnestrasse 63-73, 14195 Berlin, Germany. florian.markowetz@molgen.mpg.de

Methods of Information in Medicine | August 23, 2005
Summary
This summary is machine-generated.

Supervised classification for medical diagnosis using gene expression profiles requires adaptive model selection to prevent overfitting. Rigorous assessment of predictive power is crucial for reliable results in high-dimensional data.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Medical Informatics

Background:

  • Gene expression profiling generates high-dimensional data, posing challenges for accurate medical diagnosis.
  • Supervised classification techniques are increasingly applied to analyze gene expression data for disease identification.
  • Overfitting is a significant risk in high-dimensional spaces, potentially leading to unreliable diagnostic models.

Purpose of the Study:

  • To explore supervised classification techniques for medical diagnosis using gene expression profiles.
  • To focus on adaptive model selection strategies to mitigate overfitting in high-dimensional data.
  • To discuss methods for rigorous and unbiased assessment of predictive model performance.

Main Methods:

  • Introduced likelihood-based methods, classification trees, support vector machines, and regularized binary regression.
  • Described feature selection methods including filtering, shrinkage, and wrapper approaches for dimension reduction.
  • Addressed data re-use strategies for small sample sizes and discussed cross-validation issues like in-loop vs. out-of-loop feature selection and nested-loop parameter estimation.
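
The difference between in-loop and out-of-loop feature selection can be illustrated with a small simulation. The sketch below is not taken from the paper: the pure-noise data, the mean-difference filter, the nearest-centroid classifier, and all function names are assumptions chosen only to keep the example short and dependency-free. Because the labels carry no signal, any accuracy well above 50% reflects selection bias rather than predictive power.

```python
import random


def mean(values):
    return sum(values) / len(values)


def make_noise_data(n_samples, n_features, rng):
    """Pure-noise data: N(0, 1) features and balanced random labels (no real signal)."""
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]
    rng.shuffle(y)
    return X, y


def top_k_features(X, y, rows, k):
    """Filter step: rank features by absolute difference of class means on `rows`."""
    scores = []
    for j in range(len(X[0])):
        a = [X[i][j] for i in rows if y[i] == 0]
        b = [X[i][j] for i in rows if y[i] == 1]
        scores.append((abs(mean(a) - mean(b)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def nearest_centroid_predict(X, y, train, test, feats):
    """Assign each test sample to the class whose training centroid is closer."""
    cents = {}
    for c in (0, 1):
        members = [i for i in train if y[i] == c]
        cents[c] = [mean([X[i][j] for i in members]) for j in feats]
    preds = []
    for i in test:
        d = {c: sum((X[i][j] - m) ** 2 for j, m in zip(feats, cents[c])) for c in (0, 1)}
        preds.append(0 if d[0] <= d[1] else 1)
    return preds


def cv_accuracy(X, y, k, n_folds, select_inside_loop):
    n = len(X)
    folds = [[i for i in range(n) if i % n_folds == f] for f in range(n_folds)]
    # Out-of-loop selection sees the labels of ALL samples, including future test samples.
    global_feats = top_k_features(X, y, list(range(n)), k)
    correct = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n) if i not in held_out]
        # In-loop selection repeats the filter on the training part of each fold.
        feats = top_k_features(X, y, train, k) if select_inside_loop else global_feats
        preds = nearest_centroid_predict(X, y, train, test, feats)
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / n


rng = random.Random(0)
X, y = make_noise_data(40, 1000, rng)
biased = cv_accuracy(X, y, k=10, n_folds=5, select_inside_loop=False)
honest = cv_accuracy(X, y, k=10, n_folds=5, select_inside_loop=True)
print(f"out-of-loop selection: {biased:.2f}")
print(f"in-loop selection:     {honest:.2f}")
```

With many more features than samples, the out-of-loop estimate should come out clearly higher than the in-loop one, even though neither model can generalize on noise.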

Main Results:

  • Tuning parameters are key to achieving adaptive model selection.
  • Feature selection alone does not reduce model dimensionality.
  • Feature selection bias is a common pitfall in performance evaluation; nested-loop cross-validation can combine model selection and performance evaluation.
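
One way to picture how nested-loop cross-validation combines model selection with performance evaluation is the minimal sketch below. It rests on several assumptions that do not come from the paper: the tuning parameter is the number of selected features `k`, the classifier is a nearest-centroid rule, the data are synthetic, and every function name is hypothetical. The inner loop tunes `k` on the outer-training samples only; the outer loop scores the tuned model on samples that played no part in tuning.

```python
import random


def mean(values):
    return sum(values) / len(values)


def make_data(n_samples, n_features, n_informative, shift, rng):
    """Synthetic data: the first n_informative features are shifted in class 1."""
    y = [i % 2 for i in range(n_samples)]
    rng.shuffle(y)
    X = [[rng.gauss(shift * y[i] if j < n_informative else 0.0, 1.0)
          for j in range(n_features)] for i in range(n_samples)]
    return X, y


def fit_predict(X, y, train, test, k):
    """Select the k top-ranked features on `train` only, then classify `test`."""
    n_features = len(X[0])
    groups = {c: [i for i in train if y[i] == c] for c in (0, 1)}
    cents = {c: [mean([X[i][j] for i in groups[c]]) for j in range(n_features)]
             for c in (0, 1)}
    ranked = sorted(range(n_features), key=lambda j: -abs(cents[0][j] - cents[1][j]))
    feats = ranked[:k]
    preds = []
    for i in test:
        d0 = sum((X[i][j] - cents[0][j]) ** 2 for j in feats)
        d1 = sum((X[i][j] - cents[1][j]) ** 2 for j in feats)
        preds.append(0 if d0 <= d1 else 1)
    return preds


def split_folds(rows, n_folds):
    return [rows[f::n_folds] for f in range(n_folds)]


def cv_accuracy(X, y, rows, k, n_folds):
    """Plain cross-validation restricted to the samples listed in `rows`."""
    correct = 0
    for test in split_folds(rows, n_folds):
        held_out = set(test)
        train = [i for i in rows if i not in held_out]
        preds = fit_predict(X, y, train, test, k)
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(rows)


def nested_cv(X, y, k_grid, outer_folds, inner_folds):
    rows = list(range(len(X)))
    correct, chosen = 0, []
    for test in split_folds(rows, outer_folds):
        held_out = set(test)
        train = [i for i in rows if i not in held_out]
        # Inner loop: choose k using the outer-training samples only
        # (ties go to the first, i.e. smallest, candidate in k_grid).
        best_k = max(k_grid, key=lambda k: cv_accuracy(X, y, train, k, inner_folds))
        chosen.append(best_k)
        # Outer loop: score the tuned model on samples never used for tuning.
        preds = fit_predict(X, y, train, test, best_k)
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(rows), chosen


rng = random.Random(1)
X, y = make_data(n_samples=60, n_features=200, n_informative=5, shift=1.5, rng=rng)
accuracy, chosen_k = nested_cv(X, y, k_grid=[1, 5, 20, 100], outer_folds=5, inner_folds=4)
print(f"nested-CV accuracy: {accuracy:.2f}, k chosen per outer fold: {chosen_k}")
```

The chosen `k` may differ between outer folds; the reported accuracy describes the whole tuning-plus-fitting procedure rather than any single fixed model.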

Conclusions:

  • Classification of microarrays is highly susceptible to overfitting.
  • A robust and unbiased evaluation of model predictive power is essential for clinical application.
  • Careful model selection and validation are critical for trustworthy gene expression-based diagnostics.