Model-based clustering and data transformations for gene expression data.

K Y Yeung¹, C Fraley, A Murua

¹Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195, USA. kayee@cs.washington.edu

Bioinformatics (Oxford, England)

October 24, 2001

Summary

This summary is machine-generated.

Model-based clustering with Gaussian mixture models is benchmarked against heuristic methods on gene expression data. On synthetic data it outperforms the heuristic methods and correctly identifies both the model and the number of clusters; on real data it gives comparable cluster quality, and data transformations can yield reasonable Gaussian mixture fits.

Area of Science:

Bioinformatics
Computational Biology
Statistical Genetics

Background:

Clustering is vital for exploratory analysis of gene expression data.
Heuristic algorithms are common, but probability model-based clustering offers a principled alternative.
Gaussian mixture models are powerful for clustering applications.
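
For reference, a Gaussian mixture model in its standard textbook form can be written as below. The symbols (G components, mixing proportions \pi_k, means \mu_k, covariances \Sigma_k) are notation chosen here for illustration and do not come from the summary itself.

```latex
f(x) \;=\; \sum_{k=1}^{G} \pi_k \, \phi\!\left(x \mid \mu_k, \Sigma_k\right),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{G} \pi_k = 1,
```

where \phi(\cdot \mid \mu_k, \Sigma_k) is the multivariate normal density. Each gene (or sample) is then assigned to the component with the highest posterior probability, which is what makes the mixture usable as a clustering method.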

Purpose of the Study:

To benchmark model-based clustering against heuristic methods for gene expression data.
To evaluate the performance of Gaussian mixture models in identifying clusters and their number.
To assess the validity of Gaussian mixture assumptions on transformed gene expression data.

Main Methods:

Benchmarking model-based clustering using Gaussian mixture models.
Utilizing synthetic and real gene expression datasets with external evaluation criteria.

Assessing data fit to multivariate Gaussian distributions before and after transformations.
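
The "external evaluation criteria" mentioned above compare a clustering against known class labels. The summary does not name a specific criterion, so the sketch below uses the adjusted Rand index as an assumed example of one; the function name and the toy label lists are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:  # fewer than two items: partitions trivially agree
        return 1.0

    # Contingency counts: items per (true class, predicted cluster) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    maximum = (sum_true + sum_pred) / 2
    if maximum == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (sum_pairs - expected) / (maximum - expected)


# Toy labels (hypothetical): cluster names differ, but the grouping is the same.
same_grouping = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Every predicted cluster mixes the two true classes.
crossed_grouping = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_grouping, crossed_grouping)
```

The index is invariant to how clusters are named, which matters when comparing a mixture model's output with a heuristic method's output on the same data.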

Main Results:

Model-based clustering demonstrated superior performance on synthetic data, correctly identifying models and cluster numbers.
On real data, it yielded comparable cluster quality to heuristic methods, with added benefits of model and cluster number suggestion.
Data transformations can lead to reasonable fits for Gaussian mixture models on gene expression data.
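
As a rough illustration of why a transformation can improve the Gaussian fit, the sketch below compares sample skewness before and after a log transform. It is a univariate stand-in under stated assumptions, not the multivariate assessment described in the study: the log-normal toy data, the choice of the log transform, and the use of skewness as the diagnostic are all assumptions made here.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Moment-based sample skewness; near 0 for a symmetric sample."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)
# Hypothetical stand-in for raw expression intensities: strictly positive,
# right-skewed values drawn from a log-normal distribution.
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
logged = [math.log(v) for v in raw]

skew_raw = sample_skewness(raw)
skew_log = sample_skewness(logged)
print(f"skewness raw: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

A skewness close to zero after the transform is consistent with, but does not prove, a Gaussian shape; a fuller check would also look at kurtosis and at the joint distribution across experiments.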

Conclusions:

Model-based clustering, particularly with Gaussian mixture models, is a robust and informative approach for gene expression analysis.
It provides a principled framework for model selection and determining the optimal number of clusters.
Its main advantage over heuristic approaches is that it suggests both a model and the number of clusters, while giving comparable or better cluster quality.
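
Choosing the number of clusters within a mixture-model framework can be sketched as fitting several candidate models and comparing a penalized likelihood score. The summary does not state which criterion is used, so the example assumes the Bayesian information criterion (BIC, lower-is-better form), a plain EM fit, and one-dimensional toy data; all function names are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def mixture_density(x, weights, means, variances):
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


def fit_gmm_loglik(data, k, iters=100, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM; return the log-likelihood."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = sum((x - centre) ** 2 for x in data) / n
    variances = [max(spread, var_floor)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)  # floor avoids collapsed components
    return sum(math.log(max(mixture_density(x, weights, means, variances), 1e-300)) for x in data)


def bic(loglik, k, n):
    """BIC in the lower-is-better form: -2 log L + (number of parameters) * ln n."""
    n_params = 3 * k - 1  # k means, k variances, k - 1 free mixing weights
    return -2.0 * loglik + n_params * math.log(n)


rng = random.Random(0)
# Toy data (assumed): two well-separated groups of 150 points each.
data = [rng.gauss(0.0, 1.0) for _ in range(150)] + [rng.gauss(6.0, 1.0) for _ in range(150)]
scores = {k: bic(fit_gmm_loglik(data, k), k, len(data)) for k in (1, 2, 3)}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

The penalty term is what distinguishes this from simply maximizing the likelihood, which would always favor more components. In practice the same comparison is also run across covariance structures, not only across the number of clusters.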