Comparisons and validation of statistical clustering techniques for microarray gene expression data.

Susmita Datta¹, Somnath Datta

  • ¹Department of Mathematics and Statistics and Department of Biology, Georgia State University, Atlanta, GA 30303, USA. sdatta@mathstat.gsu.edu

Bioinformatics (Oxford, England) | March 4, 2003
Summary
This summary is machine-generated.

This study evaluates six clustering algorithms on microarray gene expression data. The Diana algorithm generally performed well, outperforming hierarchical clustering (UPGMA) and model-based clustering in grouping genes with similar temporal patterns.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Systems Biology

Background:

  • Microarray technology generates large datasets of gene expression levels over time.
  • Grouping genes by temporal expression patterns is crucial for understanding biological processes.
  • Clear guidelines for choosing a gene clustering algorithm are lacking.

Purpose of the Study:

  • To evaluate the performance of six different clustering algorithms for gene expression data.
  • To develop and apply validation strategies for assessing clustering algorithm performance.
  • To identify robust clustering methods for analyzing temporal gene expression profiles.

Main Methods:

  • Evaluation of six clustering algorithms including hierarchical clustering (UPGMA) and Diana.

  • Application to a public microarray dataset of yeast sporulation and two simulated datasets.
  • Development and use of three validation strategies for temporal data.
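
The methods above can be illustrated in code. The following is a minimal Python sketch, not the authors' implementation: it runs average-linkage hierarchical clustering (UPGMA) on a small simulated temporal dataset, then scores stability by removing one time point at a time and measuring how much each gene's cluster membership changes. The summary does not spell out the three validation strategies, so this leave-one-time-point-out measure is an assumption, as are all function names (`simulate`, `upgma`, `non_overlap`) and the simulated rising and falling profiles. Diana is not implemented here.

```python
import math
import random


def simulate(n_per_group=10, n_times=7, noise=0.1, seed=0):
    """Hypothetical data: noisy copies of a rising and a falling time profile."""
    rng = random.Random(seed)
    rising = [t / (n_times - 1) for t in range(n_times)]
    falling = [1.0 - v for v in rising]
    data = []
    for pattern in (rising, falling):
        for _ in range(n_per_group):
            data.append([v + rng.gauss(0, noise) for v in pattern])
    return data


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def upgma(profiles, k):
    """Average-linkage agglomerative clustering; returns one label per profile."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # UPGMA distance: mean of all pairwise distances between members
                d = sum(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    labels = [0] * len(profiles)
    for c, members in enumerate(clusters):
        for g in members:
            labels[g] = c
    return labels


def members_of(labels, g):
    """Set of genes that share a cluster with gene g."""
    return {h for h, lab in enumerate(labels) if lab == labels[g]}


def non_overlap(profiles, k):
    """Assumed stability score: average proportion of a gene's cluster-mates
    (from the full data) that are lost when one time point is removed."""
    full = upgma(profiles, k)
    n_genes, n_times = len(profiles), len(profiles[0])
    total = 0.0
    for t in range(n_times):
        reduced = [p[:t] + p[t + 1:] for p in profiles]
        part = upgma(reduced, k)
        for g in range(n_genes):
            a, b = members_of(full, g), members_of(part, g)
            total += 1.0 - len(a & b) / len(a)
    return total / (n_genes * n_times)


data = simulate()
labels = upgma(data, k=2)
score = non_overlap(data, k=2)
print(labels, round(score, 3))
```

A score near 0 means cluster membership barely changes when a time point is dropped; larger values indicate a less stable clustering. The same score could be computed for another algorithm by replacing the calls to `upgma` inside `non_overlap`.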

Main Results:

  • The Diana algorithm demonstrated solid performance across various validation strategies.
  • Hierarchical clustering (UPGMA) and model-based clustering showed performance extremes depending on the validation measure.
  • Diana produced group means closest to a model profile, while UPGMA produced the farthest.

Conclusions:

  • The choice of clustering algorithm impacts gene grouping results in microarray analysis.
  • Diana is a recommended algorithm for gene expression clustering due to its consistent performance.
  • Further research into algorithm selection criteria for gene expression data is warranted.