Model-based clustering and data transformations for gene expression data.

K Y Yeung¹, C Fraley, A Murua

  • ¹Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195, USA. kayee@cs.washington.edu

Bioinformatics (Oxford, England) | October 24, 2001
Summary
This summary is machine-generated.

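Model-based clustering with Gaussian mixture models is an effective way to analyze gene expression data. On synthetic data it outperforms heuristic methods and correctly identifies both the model and the number of clusters; on real data it gives cluster quality comparable to heuristic methods while also suggesting the model and the number of clusters. Suitable data transformations can make the Gaussian mixture assumption a reasonable fit.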

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Statistical Genetics

Background:

  • Clustering is vital for exploratory analysis of gene expression data.
  • Heuristic algorithms are common, but probability model-based clustering offers a principled alternative.
  • Gaussian mixture models are powerful for clustering applications.

Purpose of the Study:

  • To benchmark model-based clustering against heuristic methods for gene expression data.
  • To evaluate the performance of Gaussian mixture models in identifying clusters and their number.
  • To assess the validity of Gaussian mixture assumptions on transformed gene expression data.
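
One simple way to probe whether a transformation makes data look more Gaussian is to compare skewness before and after it. The sketch below is illustrative only: the summary does not say which transformations or diagnostics the study used, so the log transform, the skewness statistic, and the helper `demo_expression_values` (which builds hypothetical, deliberately skewed values) are all assumptions. It also checks a single variable, whereas the study concerns multivariate fits.

```python
import math
from statistics import NormalDist


def skewness(values):
    """Sample skewness (third standardized moment); near 0 for symmetric data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def demo_expression_values(n=100):
    """Hypothetical skewed values: exponentials of evenly spaced normal quantiles."""
    standard_normal = NormalDist()
    return [math.exp(standard_normal.inv_cdf((i + 0.5) / n)) for i in range(n)]


raw = demo_expression_values()
logged = [math.log(v) for v in raw]  # assumed transformation: natural log

print(f"skewness before log transform: {skewness(raw):.3f}")
print(f"skewness after log transform:  {skewness(logged):.3f}")
```

On this constructed example the raw values are strongly right-skewed and the log-transformed values are symmetric, which is the kind of change a Gaussian mixture model benefits from. Real expression data would need a multivariate diagnostic rather than this one-dimensional check.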

Main Methods:

  • Benchmarking model-based clustering using Gaussian mixture models.
  • Utilizing synthetic and real gene expression datasets with external evaluation criteria.

  • Assessing data fit to multivariate Gaussian distributions before and after transformations.
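
An external evaluation criterion compares a clustering against known class labels. The summary does not name the criterion used, so the sketch below assumes the adjusted Rand index, one widely used choice; the function name `adjusted_rand_index` is likewise just an illustrative label.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Contingency counts: items per (true label, predicted label) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of item pairs placed together in both, in truth, and in prediction.
    together_both = sum(comb(c, 2) for c in pair_counts.values())
    together_true = sum(comb(c, 2) for c in true_counts.values())
    together_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = together_true * together_pred / comb(n, 2)  # agreement expected by chance
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (together_both - expected) / (maximum - expected)
```

Two partitions that agree up to a renaming of cluster labels score 1, and agreement at chance level scores about 0, so the index can be compared across clusterings with different numbers of clusters.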

Main Results:

  • Model-based clustering demonstrated superior performance on synthetic data, correctly identifying models and cluster numbers.
  • On real data, it yielded comparable cluster quality to heuristic methods, with added benefits of model and cluster number suggestion.
  • Data transformations can lead to reasonable fits for Gaussian mixture models on gene expression data.

Conclusions:

  • Model-based clustering, particularly with Gaussian mixture models, is a robust and informative approach for gene expression analysis.
  • It provides a principled framework for model selection and determining the optimal number of clusters.
  • The method offers advantages over heuristic approaches, especially when dealing with complex data structures.
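
To make the idea of choosing the number of clusters from the model concrete, here is a minimal sketch that fits one-dimensional Gaussian mixtures by EM and picks the number of components with the Bayesian information criterion (BIC). It is an illustration under stated assumptions, not the study's procedure: the summary does not name the selection criterion, real expression data are multivariate, and the names `fit_gmm_1d`, `select_n_components`, and `two_cluster_demo` are hypothetical.

```python
import math
from statistics import NormalDist


def _log_terms(x, weights, means, variances):
    """Log of (weight * Gaussian density) for each component at point x."""
    return [math.log(w) - 0.5 * math.log(2 * math.pi * v) - 0.5 * (x - m) ** 2 / v
            for w, m, v in zip(weights, means, variances)]


def _log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_likelihood(xs, weights, means, variances):
    return sum(_log_sum_exp(_log_terms(x, weights, means, variances)) for x in xs)


def fit_gmm_1d(xs, k, n_iter=200, tol=1e-8):
    """Fit a k-component 1-D Gaussian mixture (unequal variances) with EM."""
    n = len(xs)
    ordered = sorted(xs)
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    floor = 1e-6 * grand_var + 1e-12  # keeps variances away from zero
    # Deterministic start: means at evenly spaced quantiles, common variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    variances = [max(grand_var, floor)] * k
    weights = [1.0 / k] * k
    previous = -math.inf
    current = log_likelihood(xs, weights, means, variances)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            terms = _log_terms(x, weights, means, variances)
            norm = _log_sum_exp(terms)
            resp.append([math.exp(t - norm) for t in terms])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nk = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nk / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[j] = max(var, floor)
        current = log_likelihood(xs, weights, means, variances)
        if current - previous < tol:
            break
        previous = current
    return current, weights, means, variances


def bic(loglik, k, n):
    """BIC in its lower-is-better form: p * ln(n) - 2 * log-likelihood."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return n_params * math.log(n) - 2.0 * loglik


def select_n_components(xs, max_k=4):
    """Return the k with the lowest BIC, plus the BIC score for every k tried."""
    scores = {k: bic(fit_gmm_1d(xs, k)[0], k, len(xs)) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores


def two_cluster_demo(n_per_cluster=100, separation=8.0):
    """Hypothetical data: two well-separated groups of normal quantiles."""
    grid = [NormalDist().inv_cdf((i + 0.5) / n_per_cluster) for i in range(n_per_cluster)]
    return grid + [g + separation for g in grid]


best_k, scores = select_n_components(two_cluster_demo())
print("number of components chosen by BIC:", best_k)
```

The penalty term grows with the number of free parameters, so extra components are kept only when they improve the likelihood enough to pay for themselves. Heuristic methods usually need the number of clusters supplied in advance, which is the contrast the conclusions draw.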