Clustering cancer gene expression data: a comparative study.

Marcilio C P de Souto [1], Ivan G Costa, Daniel S A de Araujo

  • [1] Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany. marcilio@dimap.ufrn.br

BMC Bioinformatics | November 29, 2008
Summary
This summary is machine-generated.

Finite mixture of Gaussians and k-means clustering methods best identify cancer subtypes from gene expression data. Hierarchical methods, commonly used by clinicians, performed poorly in this large-scale analysis.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Clustering methods are crucial for discovering cancer subtypes using gene expression data.
  • A gap exists in large-scale evaluations comparing bioinformatician-proposed and clinician-preferred clustering techniques.
  • The medical community often favors traditional clustering approaches over novel computational methods.

Purpose of the Study:

  • To conduct the first large-scale evaluation of seven clustering methods and four proximity measures on 35 cancer gene expression datasets.
  • To identify the most effective clustering approaches for accurate cancer subtype discovery.
  • To establish a benchmark dataset for future comparisons of clustering algorithms.

Main Methods:

  • Analysis of 35 cancer gene expression datasets.

  • Evaluation of seven distinct clustering algorithms.
  • Assessment using four different proximity measures.
  • Validation of cluster number accuracy against true data structure.

Main Results:

  • Finite mixture of Gaussians and k-means demonstrated superior performance in recovering true data structures.
  • These top-performing methods showed the smallest discrepancies between actual and identified cluster numbers.
  • Hierarchical clustering methods, frequently used in clinical settings, exhibited lower recovery performance compared to other evaluated methods.
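
Recovery of the true data structure can be scored by comparing a clustering against the known class labels of the samples. The sketch below assumes the adjusted (chance-corrected) Rand index as that score; this summary does not name the measure used in the study, and the function name and toy labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(true_labels, pred_labels):
    """Chance-corrected agreement between two partitions of the same samples."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    # Contingency counts: samples shared by each (true class, cluster) pair.
    pair_counts = Counter(zip(true_labels, pred_labels))
    true_counts = Counter(true_labels)
    pred_counts = Counter(pred_labels)

    # Number of sample pairs placed together in both, or in each, partition.
    same_both = sum(comb(c, 2) for c in pair_counts.values())
    same_true = sum(comb(c, 2) for c in true_counts.values())
    same_pred = sum(comb(c, 2) for c in pred_counts.values())
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0

    expected = same_true * same_pred / total_pairs
    maximum = (same_true + same_pred) / 2
    if maximum == expected:
        # Degenerate case, e.g. both partitions put every sample in one group.
        return 1.0
    return (same_both - expected) / (maximum - expected)


# Hypothetical toy example: six tumor samples from two known subtypes.
known_subtypes = ["A", "A", "A", "B", "B", "B"]
perfect = [1, 1, 1, 0, 0, 0]    # cluster ids differ, grouping is identical
one_error = [1, 1, 0, 0, 0, 0]  # one subtype-A sample is misplaced

print(adjusted_rand_index(known_subtypes, perfect))
print(adjusted_rand_index(known_subtypes, one_error))
```

The score depends only on which samples are grouped together, so a clustering whose cluster ids are permuted relative to the true labels still scores 1.0, while a misplaced sample lowers the score.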

Conclusions:

  • Finite mixture of Gaussians and k-means are recommended for cancer subtype discovery from gene expression data.
  • The study provides a valuable benchmark dataset for the objective assessment and comparison of clustering methods.
  • Findings suggest a need to reconsider the reliance on traditional hierarchical methods in favor of more computationally robust approaches.
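
The choice of proximity measure, evaluated alongside the clustering algorithms under Main Methods, changes which samples count as similar. This summary does not list the four measures that were assessed, so the sketch below uses Euclidean distance and a Pearson correlation distance purely as illustrative assumptions; the function names and expression values are hypothetical.

```python
from math import sqrt


def euclidean_distance(x, y):
    """Straight-line distance; sensitive to overall expression magnitude."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_distance(x, y):
    """1 - Pearson correlation; 0 means the profiles have the same shape."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return 1.0 - cov / sqrt(var_x * var_y)


# Hypothetical expression profiles over four genes.
sample_a = [1.0, 2.0, 3.0, 4.0]
sample_b = [10.0, 20.0, 30.0, 40.0]  # same pattern, ten times the magnitude
sample_c = [4.0, 3.0, 2.0, 1.0]      # reversed pattern, similar magnitude

print(euclidean_distance(sample_a, sample_b), pearson_distance(sample_a, sample_b))
print(euclidean_distance(sample_a, sample_c), pearson_distance(sample_a, sample_c))
```

In this toy case Euclidean distance places sample_c closer to sample_a, whereas the correlation-based distance places sample_b closer, which is one reason a clustering method can give different results under different proximity measures.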