
Efficient clustering of large molecular libraries.

Kenneth López Pérez¹, Vicky Jung¹, Lexin Chen¹

  • ¹Department of Chemistry & Quantum Theory Project, University of Florida, Gainesville, Florida 32611.

bioRxiv : the preprint server for biology | August 16, 2024
Summary
This summary is machine-generated.

A new algorithm, BitBIRCH, efficiently clusters large molecular libraries using binary fingerprints and a tree structure. This machine learning approach significantly speeds up chemical space analysis without sacrificing cluster quality.

Keywords:
chemical diversity, chemical space, clustering, similarity

Area of Science:

  • Computational chemistry
  • Machine learning applications
  • Cheminformatics

Background:

  • Machine learning (ML) is crucial for analyzing large chemical datasets.
  • Clustering is a key technique for exploring chemical space.
  • Existing clustering methods struggle with the scale of modern molecular libraries (millions to billions of molecules).
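
To make the scale problem concrete, the short sketch below counts the unordered molecule pairs in a library. It assumes a clustering method that evaluates every pairwise similarity, as building a full similarity matrix would require; the function name `pair_count` is illustrative and not taken from the paper.

```python
def pair_count(n):
    """Number of unordered pairs among n molecules: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Library sizes mentioned in the summary: 1.5 million and one billion molecules.
for n in (1_500_000, 1_000_000_000):
    print(f"{n:,} molecules -> {pair_count(n):,} pairwise similarities")
```

The pair count grows quadratically with library size, which is why approaches that avoid the all-pairs comparison matter at this scale.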

Purpose of the Study:

  • Introduce BitBIRCH, a novel, efficient clustering algorithm for massive molecular datasets.
  • Address the time and memory limitations of current clustering approaches.
  • Enable scalable analysis of large chemical libraries.

Main Methods:

  • Developed BitBIRCH, a memory- and time-efficient clustering algorithm.
  • Utilized a tree structure inspired by the BIRCH algorithm for efficient scaling.
  • Employed the instant similarity (iSIM) formalism for processing binary molecular fingerprints.
  • Used Tanimoto similarity to compare fingerprints efficiently.
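
The ideas in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the BitBIRCH implementation: the class `BitFeature`, the function `cluster`, and the threshold rule are hypothetical, the tree is replaced by a flat single pass over clusters, and the iSIM-style formula (an average Tanimoto estimated from per-bit "on" counts) is assumed from the general iSIM idea rather than stated in this summary.

```python
from math import comb


def tanimoto(a, b):
    """Tanimoto similarity between two equal-length binary fingerprints."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    return both / either if either else 1.0  # convention chosen here for empty inputs


class BitFeature:
    """Hypothetical cluster summary: molecule count plus per-bit 'on' counts."""

    def __init__(self, n_bits):
        self.n = 0
        self.column_sums = [0] * n_bits

    def add(self, fp):
        """Absorb one fingerprint without storing it."""
        self.n += 1
        self.column_sums = [k + bit for k, bit in zip(self.column_sums, fp)]

    def isim(self):
        """iSIM-style average Tanimoto computed from column sums only (assumed form)."""
        pairs_on = sum(comb(k, 2) for k in self.column_sums)
        mismatched = sum(k * (self.n - k) for k in self.column_sums)
        total = pairs_on + mismatched
        return pairs_on / total if total else 1.0

    def isim_if_added(self, fp):
        """Average similarity the cluster would have after absorbing fp."""
        trial = BitFeature(len(fp))
        trial.n = self.n
        trial.column_sums = list(self.column_sums)
        trial.add(fp)
        return trial.isim()


def cluster(fingerprints, threshold):
    """Flat stand-in for the tree: join the first cluster that stays above threshold."""
    features, labels = [], []
    for fp in fingerprints:
        for idx, feat in enumerate(features):
            if feat.isim_if_added(fp) >= threshold:
                feat.add(fp)
                labels.append(idx)
                break
        else:  # no existing cluster accepted the fingerprint
            feat = BitFeature(len(fp))
            feat.add(fp)
            features.append(feat)
            labels.append(len(features) - 1)
    return labels


fps = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 0, 1], [0, 0, 1, 1]]
print(cluster(fps, threshold=0.5))
```

The design point this sketch tries to show is that each cluster is summarized by a count and a vector of column sums, so a new molecule can be tested against a cluster without revisiting its members or storing a pairwise similarity matrix.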

Main Results:

  • BitBIRCH runs more than 1,000x faster than Taylor-Butina clustering on 1.5 million molecules.
  • The algorithm achieves significant reductions in memory requirements.
  • Cluster quality is maintained despite the efficiency gains.
  • A parallel/iterative BitBIRCH variant clustered one billion molecules in under 5 hours.

Conclusions:

  • BitBIRCH offers a scalable and efficient solution for clustering large molecular libraries.
  • The algorithm overcomes the limitations of traditional methods for big data in cheminformatics.
  • BitBIRCH facilitates advanced chemical space exploration and analysis.