Developing optimal prediction models for cancer classification using gene expression data.

Mat Soukup¹, Jae K Lee

  • ¹Department of Statistics, University of Virginia, Halsey Hall, Charlottesville, VA 22904-4135, USA. mjs5b@virginia.edu

Journal of Bioinformatics and Computational Biology | August 4, 2004
Summary
This summary is machine-generated.

We developed a robust model for classifying cancer sub-types from gene expression data. The model uses fewer genes than existing methods yet performs as well as or better than them, especially on independent samples.

Area of Science:

  • Genomics and Bioinformatics
  • Cancer Research
  • Biostatistics

Background:

  • Microarrays provide genome-wide expression data crucial for understanding cancer sub-types and patient prognosis.
  • Existing gene expression-based classification methods for cancer sub-types lack robustness and often depend heavily on training data.
  • This dependence on specific training samples poses challenges for accurate classification and treatment of future patients.

Purpose of the Study:

  • To construct an optimal and robust prediction model for classifying cancer sub-types using gene expression data.
  • To develop a reliable model for future patient data by validating models with independent patient samples at each step.
  • To improve the accuracy and reliability of cancer sub-type classification compared to existing methods.

Main Methods:

  • A step-wise construction approach for the prediction model.
  • Implementation of cross-validated quadratic discriminant analysis (QDA) at each step.
  • Validation of all identified models using an independent sample of patients to ensure robustness.
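
The three steps above can be illustrated with a minimal sketch. It is not the authors' implementation (which the summary says was written in R and S-PLUS). It assumes greedy forward selection of genes scored by leave-one-out cross-validated QDA accuracy, followed by a check on a held-out sample. The selection criterion, the fold scheme, the small `ridge` term, and all function names (`stepwise_select`, `loo_cv_accuracy`, `holdout_accuracy`) are assumptions made for illustration; the summary does not specify them.

```python
import math

def cholesky(a):
    """Lower-triangular L with L L^T = a (a symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def fit_class(rows, ridge=1e-6):
    """Per-class mean and Cholesky factor of the class covariance.
    `ridge` is an assumed stabiliser, not something the source describes."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
            + (ridge if i == j else 0.0) for j in range(d)] for i in range(d)]
    return mean, cholesky(cov)

def qda_score(x, mean, L, prior):
    """Quadratic discriminant: log(prior) - 0.5*log|S| - 0.5*Mahalanobis^2."""
    z = []
    for i in range(len(x)):  # forward substitution: solve L z = x - mean
        z.append((x[i] - mean[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    logdet = 2.0 * sum(math.log(L[i][i]) for i in range(len(x)))
    return math.log(prior) - 0.5 * logdet - 0.5 * sum(v * v for v in z)

def qda_predict(train_X, train_y, x):
    """Fit one Gaussian per class and return the label with the highest score."""
    best_score, best_label = None, None
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        mean, L = fit_class(rows)
        score = qda_score(x, mean, L, len(rows) / len(train_y))
        if best_score is None or score > best_score:
            best_score, best_label = score, label
    return best_label

def loo_cv_accuracy(X, y, genes):
    """Leave-one-out accuracy of QDA restricted to the given gene columns."""
    sub = [[r[g] for g in genes] for r in X]
    hits = 0
    for i in range(len(sub)):
        pred = qda_predict(sub[:i] + sub[i + 1:], y[:i] + y[i + 1:], sub[i])
        hits += pred == y[i]
    return hits / len(sub)

def holdout_accuracy(X_train, y_train, X_test, y_test, genes):
    """Accuracy on an independent sample, using only the selected genes."""
    tr = [[r[g] for g in genes] for r in X_train]
    te = [[r[g] for g in genes] for r in X_test]
    hits = sum(qda_predict(tr, y_train, x) == y for x, y in zip(te, y_test))
    return hits / len(te)

def stepwise_select(X, y, max_genes=2):
    """Greedy forward selection: add the gene that most improves LOO accuracy,
    and stop when no candidate improves it."""
    chosen, best_acc = [], 0.0
    for _ in range(max_genes):
        cands = [(loo_cv_accuracy(X, y, chosen + [g]), g)
                 for g in range(len(X[0])) if g not in chosen]
        acc, g = max(cands, key=lambda t: (t[0], -t[1]))
        if acc <= best_acc:
            break
        chosen.append(g)
        best_acc = acc
    return chosen, best_acc
```

Here `X` is a list of samples (each a list of expression values, one per gene) and `y` the matching sub-type labels. Each class needs at least three training samples so that a covariance can still be estimated after one sample is left out. Calling `stepwise_select(X_train, y_train)` and then `holdout_accuracy(X_train, y_train, X_test, y_test, genes)` mirrors the select-then-validate pattern described in the list.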

Main Results:

  • The proposed method successfully classified cancer sub-types using two independent microarray datasets (acute leukemia and colon cancer).
  • The optimal prediction models were low-dimensional, using only one or two gene factors.
  • These models performed as well as or better than methods using more than 50 gene factors, particularly on independent samples.

Conclusions:

  • The developed method provides an optimal and robust approach for cancer sub-type classification using gene expression data.
  • The model's high performance with minimal gene factors suggests increased efficiency and interpretability.
  • The methodology, implemented in R and S-PLUS, offers a reliable tool for clinical application and future patient stratification.