A stability based method for discovering structure in clustered data.

Asa Ben-Hur¹, Andre Elisseeff, Isabelle Guyon

  • ¹BioWulf Technologies LLC, 2030 Addison St., Suite 102, Berkeley, CA 94704, USA.

Pacific Symposium on Biocomputing | April 4, 2002

Summary
This summary is machine-generated.

We developed a method that uses clustering stability to assess structure in data. The approach helps determine the optimal number of clusters and detect when the data contain no meaningful cluster structure.

Area of Science:

  • Data Science
  • Computational Biology
  • Statistics

Background:

  • Clustering algorithms are widely used for pattern discovery in data.
  • Assessing the reliability and significance of clustering results remains a challenge.
  • Determining the optimal number of clusters is often subjective.

Purpose of the Study:

  • To introduce a robust method for visually and quantitatively evaluating structure in clustered data.
  • To provide a principled approach for selecting the optimal number of clusters.
  • To enable the detection of genuine structure versus random patterns.

Main Methods:

  • The method measures the stability of clustering solutions under data perturbations.
  • Stability is assessed by analyzing the distribution of pairwise similarities between clusterings from data subsamples.
  • The approach is algorithm-agnostic and applicable to various clustering techniques.
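
The procedure described in the methods above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`kmeans_labels`, `jaccard_similarity`, `stability_similarities`), the 80% subsampling fraction, the Jaccard coefficient over co-clustered point pairs as the similarity measure, and the tiny k-means stand-in are all choices made here for illustration. Because the approach is algorithm-agnostic, any clustering routine that returns one label per point could replace `kmeans_labels`.

```python
import random
from itertools import combinations


def kmeans_labels(points, k, rng, n_iter=50):
    """Tiny Lloyd's k-means with farthest-first initialisation (illustrative stand-in)."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centers = [rng.choice(points)]
    while len(centers) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))

    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda j: d2(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def jaccard_similarity(labels_a, labels_b):
    """Pairs co-clustered in both labelings divided by pairs co-clustered in either."""
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        both += same_a and same_b
        either += same_a or same_b
    return both / either if either else 1.0


def stability_similarities(points, k, n_pairs=20, fraction=0.8, seed=0):
    """Cluster pairs of random subsamples and compare them on their shared points."""
    rng = random.Random(seed)
    n = len(points)
    m = int(fraction * n)  # subsample size (assumed fraction, not from the source)
    sims = []
    for _ in range(n_pairs):
        idx_a = rng.sample(range(n), m)
        idx_b = rng.sample(range(n), m)
        lab_a = dict(zip(idx_a, kmeans_labels([points[i] for i in idx_a], k, rng)))
        lab_b = dict(zip(idx_b, kmeans_labels([points[i] for i in idx_b], k, rng)))
        shared = sorted(set(idx_a) & set(idx_b))
        sims.append(jaccard_similarity([lab_a[i] for i in shared],
                                       [lab_b[i] for i in shared]))
    return sims
```

Calling `stability_similarities(points, k)` for a range of `k` values yields one list of similarities per `k`; plotting each list as a histogram or cumulative distribution gives the visual assessment the summary refers to. The similarity is computed on co-membership of point pairs, so it does not depend on how the two clusterings happen to number their clusters.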

Main Results:

  • Pairwise similarity distributions concentrated at high values indicate stable, reliable clusterings.
  • The method identified an optimal number of clusters in artificial and real-world datasets.
  • The method also detected the absence of significant structure.

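One way to turn the similarity distributions into a choice of cluster number is sketched below. The decision rule, the function name `choose_k`, and both thresholds (`high`, `min_fraction`) are assumptions made for illustration; the summary does not state the exact criterion used. The sketch walks through the candidate values in increasing order and keeps the last one before the distribution stops being concentrated at high similarity. Returning `None` corresponds to finding no stable clustering at all.

```python
def choose_k(sims_by_k, high=0.9, min_fraction=0.9):
    """Return the last k before the similarity distribution spreads out, or None.

    sims_by_k maps each candidate number of clusters to its list of pairwise
    similarities. A k counts as stable when at least `min_fraction` of its
    similarities are >= `high` (both thresholds are illustrative assumptions).
    """
    best = None
    for k in sorted(sims_by_k):
        sims = sims_by_k[k]
        frac_high = sum(s >= high for s in sims) / len(sims)
        if frac_high < min_fraction:
            break  # transition from stable to unstable clusterings
        best = k
    return best


# Hypothetical similarity values, made up purely to exercise the function.
example = {
    2: [1.0, 0.98, 0.95],
    3: [0.97, 0.93, 0.99],
    4: [0.60, 0.80, 0.95],
}
print(choose_k(example))  # prints 3
```

In practice the thresholds would need tuning per dataset, and inspecting the full distributions by eye remains a useful check on any automatic rule.
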
Conclusions:

  • The proposed stability-based method offers a reliable way to assess clustering results.
  • It provides objective criteria for determining the number of clusters and validating data structure.
  • This approach enhances the interpretability and trustworthiness of clustering analyses in diverse fields.