Best practices to cluster large molecular libraries.

Kenneth Lopez-Perez1, Ramon Alain Miranda-Quintana1

  • 1Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32603, United States.

bioRxiv: the preprint server for biology | April 17, 2026
Summary
This summary is machine-generated.

This study optimizes BitBIRCH clustering parameters for large molecular libraries. Data-driven strategies based on similarity thresholds and iterative re-clustering make the clustering more robust and reduce the number of singletons.


Area of Science:

  • Computational chemistry
  • Chemoinformatics
  • Bioinformatics

Background:

  • BitBIRCH is a clustering algorithm for large molecular libraries.
  • Its performance can be limited by excessive singletons or large clusters.

Purpose of the Study:

  • To develop a data-driven strategy for optimizing BitBIRCH parameters.
  • To mitigate limitations such as excessive singletons and overly large clusters in molecular library analysis.

Main Methods:

  • Utilized the ChEMBL34 library for case studies.
  • Identified optimal similarity thresholds at 3 to 4 standard deviations above the mean similarity.
  • Employed the iSIM and iSIM-sigma frameworks to approximate these statistics.
  • Investigated the impact of branching factor (up to 1024).
  • Introduced an iterative re-clustering procedure.
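
The threshold rule above (mean similarity plus 3 to 4 standard deviations) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it computes pairwise Tanimoto similarities by brute force on a small sample as a stand-in for the iSIM and iSIM-sigma approximations, and the function names, the `n_sigma` parameter, and the random fingerprints are all assumptions made for the example.

```python
import numpy as np


def tanimoto_matrix(fps):
    """Pairwise Tanimoto similarity for a 2D 0/1 array (rows = molecules)."""
    fps = fps.astype(np.int64)
    inter = fps @ fps.T                      # shared on-bits per pair
    counts = fps.sum(axis=1)                 # on-bits per molecule
    union = counts[:, None] + counts[None, :] - inter
    # Two all-zero fingerprints have an empty union; treat them as identical.
    return np.where(union > 0, inter / np.maximum(union, 1), 1.0)


def similarity_threshold(fps, n_sigma=3.5):
    """Return (threshold, mean, std) with threshold = mean + n_sigma * std.

    Brute-force stand-in (hypothetical helper): the statistics are taken over
    all distinct pairs in the sample, which is only feasible for small samples.
    """
    sims = tanimoto_matrix(fps)
    upper = np.triu_indices(len(fps), k=1)   # each distinct pair once
    pair_sims = sims[upper]
    mean, std = pair_sims.mean(), pair_sims.std()
    # Tanimoto similarity cannot exceed 1, so cap the threshold there.
    return min(mean + n_sigma * std, 1.0), mean, std


# Synthetic sparse fingerprints (assumption: not real ChEMBL34 data).
rng = np.random.default_rng(0)
sample = (rng.random((200, 256)) < 0.1).astype(np.uint8)
thr, mean, std = similarity_threshold(sample, n_sigma=3.5)
print(f"mean={mean:.3f} std={std:.3f} threshold={thr:.3f}")
```

For a real library the quadratic pairwise step would be replaced by the linear-scaling estimates the study relies on; the sketch only shows how the threshold is derived from the two statistics.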

Main Results:

  • Similarity thresholds 3 to 4 standard deviations above the mean balance cluster count and medoid similarity.
  • High branching factors (e.g., 1024) significantly reduce singletons.
  • Iterative re-clustering allows user-defined control over cluster fusion.
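
To compare parameter settings on the two failure modes named above (too many singletons, overly large clusters), a simple size report over the cluster labels can help. The sketch below is a generic diagnostic, not part of BitBIRCH; the function name and the returned keys are hypothetical.

```python
from collections import Counter


def cluster_size_report(labels):
    """Summarize singleton count and largest-cluster share for a labeling.

    `labels` is any sequence with one cluster label per molecule.
    """
    sizes = Counter(labels)                  # cluster label -> member count
    n_clusters = len(sizes)
    n_singletons = sum(1 for size in sizes.values() if size == 1)
    largest = max(sizes.values())
    return {
        "n_clusters": n_clusters,
        "n_singletons": n_singletons,
        "singleton_fraction": n_singletons / n_clusters,
        "largest_cluster": largest,
        "largest_cluster_fraction": largest / len(labels),
    }


# Toy labeling: clusters 1 and 3 are singletons, cluster 0 is the largest.
report = cluster_size_report([0, 0, 0, 1, 2, 2, 3])
print(report)
```

Running the report for several thresholds or branching factors gives a quick view of how each setting shifts the balance between singletons and large clusters.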

Conclusions:

  • The proposed strategy enhances BitBIRCH's robustness for large-scale molecular clustering.
  • Practical guidelines are provided for optimizing BitBIRCH parameters.
  • The strategy also improves the usability of BitBIRCH for analyzing very large molecular datasets.