
Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes.

Susmita Datta (1), Somnath Datta

  • (1) Department of Bioinformatics and Biostatistics, University of Louisville, Louisville, KY 40202, USA. susmita.datta@louisville.edu

BMC Bioinformatics
|September 2, 2006
Summary
This summary is machine-generated.

This study introduces two new metrics, biological homogeneity index (BHI) and biological stability index (BSI), to evaluate gene clustering algorithms. These indices help identify the best clustering methods for gene expression data analysis.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Cluster analysis is a common first step in analyzing gene expression profiles.
  • Post hoc analysis often assesses functional correlation of clustered genes, but results can be misleading.
  • Because a cluster can contain functionally unrelated genes, unsupervised clustering results need systematic evaluation.

Purpose of the Study:

  • To quantify the performance of unsupervised clustering algorithms in producing biologically meaningful gene clusters.
  • To introduce novel performance measures for evaluating clustering algorithms.
  • To compare the effectiveness of different clustering algorithms on gene expression data.

Main Methods:

  • Introduced two performance measures: Biological Homogeneity Index (BHI) and Biological Stability Index (BSI).

  • Evaluated ten clustering algorithms on two gene expression datasets (breast cancer SAGE profiles and yeast sporulation).
  • Utilized functional classes from prior knowledge and gene ontology databases for evaluation.

Main Results:

  • BHI measures the biological homogeneity within clusters.
  • BSI measures the consistency of producing biologically meaningful clusters across similar datasets.
  • Identified optimal clustering algorithms for both evaluated datasets based on BHI and BSI.

Conclusions:

  • Functional information from gene ontology databases can systematically evaluate unsupervised clustering results.
  • This evaluation aids in selecting the most appropriate clustering algorithm for specific gene expression datasets.
  • The developed indices provide a robust framework for assessing the biological relevance of gene clusters.
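
Illustrative sketch:

To make the idea of "biological homogeneity within clusters" concrete, the following Python sketch computes a BHI-style score. It is one plausible formalization and not necessarily the exact definition used in the paper. It assumes that each gene has a cluster label and a set of functional classes, that two genes count as functionally similar when they share at least one class, and that clusters with fewer than two annotated genes contribute zero. The function name, the gene names, and the class names are hypothetical.

```python
from itertools import combinations


def biological_homogeneity_index(clusters, annotations):
    """BHI-style score: average, over clusters, of the fraction of
    annotated gene pairs in a cluster that share a functional class.

    clusters:    dict mapping gene -> cluster label
    annotations: dict mapping gene -> set of functional classes
    (Both input shapes are assumptions made for this sketch.)
    """
    # Group the annotated genes by cluster; unannotated genes are skipped.
    annotated_by_cluster = {label: [] for label in set(clusters.values())}
    for gene, label in clusters.items():
        if annotations.get(gene):
            annotated_by_cluster[label].append(gene)

    scores = []
    for genes in annotated_by_cluster.values():
        pairs = list(combinations(genes, 2))
        if not pairs:
            # Assumption: a cluster with fewer than two annotated genes scores 0.
            scores.append(0.0)
            continue
        shared = sum(1 for a, b in pairs if annotations[a] & annotations[b])
        scores.append(shared / len(pairs))

    return sum(scores) / len(scores) if scores else 0.0


# Toy data (hypothetical genes and classes), not taken from the study.
example_clusters = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1}
example_annotations = {
    "g1": {"ribosome"},
    "g2": {"ribosome"},
    "g3": {"sporulation"},
    "g4": {"sporulation"},
    "g5": {"sporulation", "meiosis"},
}
example_bhi = biological_homogeneity_index(example_clusters, example_annotations)
print(round(example_bhi, 4))  # cluster 0 scores 1/3, cluster 1 scores 1
```

In this toy case the score is the mean of 1/3 and 1. A score closer to 1 would suggest that genes grouped together tend to share functional annotations. Comparing such scores across clustering algorithms on the same dataset is the kind of evaluation the summary describes.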