Model order selection for bio-molecular data clustering.

Alberto Bertoni¹, Giorgio Valentini

  • ¹ DSI, Dipartimento di Scienze dell'Informazione, Università degli Studi di Milano, Milano, Italy. bertoni@dsi.unimi.it

BMC Bioinformatics | May 12, 2007
Summary
This summary is machine-generated.

This study introduces a stability-based method for selecting the number of clusters in bio-molecular data. The method estimates the number of clusters and assesses the statistical significance of the resulting clustering solutions, including in high-dimensional datasets.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Data Mining

Background:

  • Cluster analysis is widely used on bio-molecular data, but determining the optimal number of clusters remains difficult.
  • Existing stability-based methods face challenges in assessing statistical significance and detecting multiple data structures.

Purpose of the Study:

  • To develop a novel stability-based method for automated cluster number selection in bio-molecular data.
  • To enable the assessment of statistical significance and detection of multi-level structures in high-dimensional data.

Main Methods:

  • A stability method utilizing randomized maps and subsets of randomized linear combinations of variables.
  • Stability indices derived from similarity measures of clusterings on projected data.
  • A χ²-based statistical test for significance assessment and multi-level structure detection.
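
The three steps listed above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian random projection, the farthest-first k-means, the Jaccard similarity over co-clustered pairs, the threshold of 0.9, and every function name below are assumptions chosen for illustration. The chi-square part computes only a Pearson statistic for equal proportions of high-similarity trials across the tested numbers of clusters; turning it into a p-value would need a chi-square distribution with len(ks) - 1 degrees of freedom.

```python
import math
import random
from itertools import combinations

def random_projection(data, d_out, rng):
    """Project each row with a Gaussian random matrix scaled by 1/sqrt(d_out)."""
    d_in = len(data[0])
    scale = 1.0 / math.sqrt(d_out)
    R = [[rng.gauss(0.0, 1.0) * scale for _ in range(d_in)] for _ in range(d_out)]
    return [[sum(r * x for r, x in zip(row, point)) for row in R] for point in data]

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, rng, iters=20):
    """Lloyd's k-means with farthest-first initialisation; returns cluster labels."""
    centers = [points[rng.randrange(len(points))]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sqdist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def jaccard(labels_a, labels_b):
    """Jaccard similarity over pairs of points co-clustered in each labelling."""
    both = either = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        both += same_a and same_b
        either += same_a or same_b
    return both / either if either else 1.0

def similarities(data, k, d_out, trials, rng):
    """Similarity between clusterings of two independent random projections, per trial."""
    sims = []
    for _ in range(trials):
        a = kmeans(random_projection(data, d_out, rng), k, rng)
        b = kmeans(random_projection(data, d_out, rng), k, rng)
        sims.append(jaccard(a, b))
    return sims

def chi2_statistic(counts, trials):
    """Pearson chi-square statistic for equal proportions across the tested k."""
    pooled = sum(counts) / (len(counts) * trials)
    if pooled in (0.0, 1.0):
        return 0.0
    return sum((c - trials * pooled) ** 2 for c in counts) / (trials * pooled * (1 - pooled))

# Synthetic data (hypothetical): two well-separated groups, 15 samples x 50 variables each.
rng = random.Random(0)
data = [[rng.gauss(mu, 1.0) for _ in range(50)] for mu in (0.0, 10.0) for _ in range(15)]

ks = (2, 3, 4)
sims = {k: similarities(data, k, d_out=10, trials=10, rng=rng) for k in ks}
stability = {k: sum(s) / len(s) for k, s in sims.items()}  # stability index per k

threshold = 0.9  # assumed cut-off for a "stable" trial
counts = [sum(s > threshold for s in sims[k]) for k in ks]
chi2 = chi2_statistic(counts, trials=10)
best_k = max(stability, key=stability.get)
print(stability, counts, chi2, best_k)
```

In this sketch the number of clusters with the highest mean similarity is taken as the most stable candidate, and a large chi-square statistic suggests that the tested values of k differ in how often they yield stable clusterings.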

Main Results:

  • The proposed method exploits the high dimensionality and low cardinality (many variables, few samples) of bio-molecular data.
  • Demonstrated ability to assess statistical significance of clustering solutions.
  • Successfully detected significant and multi-level structures in synthetic and gene expression data.

Conclusions:

  • The new model order selection methods are competitive with current state-of-the-art algorithms.
  • The approach is capable of identifying multiple hierarchical structures in complex bio-molecular datasets.