A classification model for the Leiden proteomics competition.

Huub C J Hoefsloot¹, Suzanne Smit, Age K Smilde

¹University of Amsterdam. h.c.j.hoefsloot@uva.nl

Statistical Applications in Genetics and Molecular Biology

March 4, 2008

Summary

This summary is machine-generated.

A new strategy builds discrimination models for undersampled proteomics data by combining cross-validation with rank products variable selection, and it identifies putative serum biomarkers for breast cancer detection.

Area of Science:

Proteomics and Bioinformatics
Biomarker Discovery
Statistical Modeling

Background:

Proteomics studies often face a low samples-to-variables ratio, posing challenges for building accurate discrimination models.
Effective variable selection and robust validation are crucial for identifying reliable biomarkers in complex biological data.

Purpose of the Study:

To develop and validate a robust strategy for building discrimination models in proteomics, particularly for undersampled datasets.
To identify potential serum biomarkers for breast cancer detection using a novel modeling approach.

Main Methods:

A discrimination model was built using a combination of cross-validation and the rank products variable selection method.
Principal Component Discriminant Analysis (PCDA) was employed as the classification method; the strategy can be adapted to other classifiers.
Final classification used a majority voting scheme over an ensemble of classifiers.
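
The two generic ingredients named above, rank products and majority voting, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-fold variable scores are assumed to come from some discrimination statistic computed in each cross-validation round, rank 1 is given to the highest score, and the function names are hypothetical. The geometric mean of the ranks is used, which orders variables the same way as the raw product.

```python
import math
from collections import Counter


def rank_products(scores_per_fold):
    """Geometric mean of each variable's rank across folds.

    scores_per_fold: one list of variable scores per cross-validation
    round (assumed input; higher score = more discriminating).
    A small rank product means the variable ranked high consistently.
    """
    n_vars = len(scores_per_fold[0])
    log_sums = [0.0] * n_vars
    for scores in scores_per_fold:
        # Order variable indices from highest to lowest score in this fold.
        order = sorted(range(n_vars), key=lambda j: -scores[j])
        for rank, j in enumerate(order, start=1):
            log_sums[j] += math.log(rank)
    n_folds = len(scores_per_fold)
    return [math.exp(s / n_folds) for s in log_sums]


def select_variables(scores_per_fold, n_keep):
    """Indices of the n_keep variables with the smallest rank product."""
    rp = rank_products(scores_per_fold)
    return sorted(range(len(rp)), key=lambda j: rp[j])[:n_keep]


def majority_vote(predictions):
    """Combine ensemble predictions sample by sample.

    predictions: one list of predicted labels per ensemble member,
    all of the same length. The most frequent label wins for each sample.
    """
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]
```

With two classes, an odd number of ensemble members avoids tied votes; with an even number, this sketch resolves a tie in favor of the label that appears first among the members.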

Main Results:

The developed strategy demonstrated high performance in a dataset of serum samples from breast cancer patients and healthy controls.
Double cross-validation yielded a model sensitivity of 82% and a specificity of 86%.
The variable selection method identified putative biomarkers associated with breast cancer.
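
To make the reported figures concrete, the following is a minimal sketch of how double (nested) cross-validation produces one held-out prediction per sample, from which sensitivity and specificity are then computed. Several things here are assumptions rather than details from the study: the `fit_predict(X_train, y_train, X_test, setting)` interface is hypothetical, `settings` stands for whatever is tuned in the inner loop (for example the number of selected variables), the fold counts are arbitrary, and the folds are interleaved rather than stratified.

```python
def k_folds(indices, k):
    """Split a list of indices into k interleaved folds."""
    return [indices[i::k] for i in range(k)]


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity); assumes both classes are present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def double_cross_validation(X, y, settings, fit_predict, k_outer=5, k_inner=4):
    """Return one prediction per sample, each made without using that sample.

    fit_predict is a hypothetical callable that trains on the given
    training data with one candidate setting and returns predicted
    labels for X_test.
    """
    n = len(y)
    y_pred = [None] * n
    for test_idx in k_folds(list(range(n)), k_outer):
        test_set = set(test_idx)
        rest = [i for i in range(n) if i not in test_set]

        # Inner loop: pick the setting using only the non-test samples.
        best_setting, best_errors = None, None
        for setting in settings:
            errors = 0
            for val_idx in k_folds(rest, k_inner):
                val_set = set(val_idx)
                train = [i for i in rest if i not in val_set]
                preds = fit_predict(
                    [X[i] for i in train], [y[i] for i in train],
                    [X[i] for i in val_idx], setting,
                )
                errors += sum(p != y[i] for p, i in zip(preds, val_idx))
            if best_errors is None or errors < best_errors:
                best_setting, best_errors = setting, errors

        # Outer loop: refit on all non-test samples, predict the test fold.
        preds = fit_predict(
            [X[i] for i in rest], [y[i] for i in rest],
            [X[i] for i in test_idx], best_setting,
        )
        for i, p in zip(test_idx, preds):
            y_pred[i] = p
    return y_pred
```

Because the test fold never takes part in choosing the setting, the sensitivity and specificity computed from `y_pred` are not biased by the tuning step, which is the point of the double loop.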

Conclusions:

The presented strategy offers a powerful approach for biomarker discovery in proteomics, especially in challenging undersampled scenarios.
The method provides a sensitive and specific model for discriminating between breast cancer patients and healthy controls.
The identified potential biomarkers warrant further investigation for clinical application in breast cancer diagnostics.