A hierarchical clustering approach for large compound libraries.

Alexander Böcker¹, Swetlana Derksen, Elena Schmidt

  • ¹Johann Wolfgang Goethe-Universität, Institut für Organische Chemie und Chemische Biologie, Marie-Curie-Str. 11, D-60439 Frankfurt, Germany.

Journal of Chemical Information and Modeling | July 28, 2005
Summary
This summary is machine-generated.

A modified k-means clustering algorithm that builds hierarchical trees is used to analyze large compound libraries for drug discovery. The method supports the generation of focused compound libraries and the identification of novel scaffolds in ligand-based virtual screening.

Area of Science:

  • Computational Chemistry
  • Cheminformatics
  • Drug Discovery

Background:

  • Analyzing large compound libraries is crucial for identifying novel drug candidates.
  • Existing methods may struggle with the scale and complexity of modern chemical databases.

Purpose of the Study:

  • To develop a modified k-means clustering algorithm for analyzing large compound libraries.
  • To enable efficient ligand-based virtual screening and focused library generation.

Main Methods:

  • A modified k-means clustering algorithm with a distance threshold termination criterion.
  • Construction of hierarchical trees for data distribution overview.
  • Molecular encoding using Molecular Operating Environment (MOE) 2D descriptors and topological pharmacophore atom types.
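
The combination of k-means splitting, a distance threshold, and a hierarchical tree can be illustrated with a small sketch. The summary does not spell out how the threshold is applied or how clusters are split, so the following is only one plausible reading and not the authors' implementation: a cluster is split in two by plain 2-means until every member lies within `threshold` of its cluster centroid. All names (`build_tree`, `two_means`, `leaves`, `threshold`) and the farthest-point seeding are assumptions made for illustration, and the toy 2D vectors stand in for real molecular descriptors.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_means(points, max_iter=50):
    """Split points into two groups with Lloyd's algorithm (k = 2).

    Seeding is deterministic (an assumption of this sketch): the point
    farthest from the centroid, then the point farthest from that one.
    """
    c = centroid(points)
    first = max(points, key=lambda p: dist(p, c))
    second = max(points, key=lambda p: dist(p, first))
    centers = [first, second]
    groups = ([], [])
    for _ in range(max_iter):
        groups = ([], [])
        for p in points:
            nearer = 0 if dist(p, centers[0]) <= dist(p, centers[1]) else 1
            groups[nearer].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split; the caller keeps the node as a leaf
        new_centers = [centroid(g) for g in groups]
        if new_centers == centers:
            break  # assignments are stable
        centers = new_centers
    return groups


def build_tree(points, threshold):
    """Recursively split until every member is within threshold of its centroid."""
    c = centroid(points)
    radius = max(dist(p, c) for p in points)
    node = {"centroid": c, "size": len(points), "radius": radius, "children": []}
    if radius <= threshold or len(points) < 2:
        return node  # termination criterion met
    left, right = two_means(points)
    if left and right:
        node["children"] = [build_tree(left, threshold), build_tree(right, threshold)]
    return node


def leaves(node):
    """Collect the terminal clusters of the tree."""
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# Toy data: two well-separated groups of hypothetical 2D descriptor vectors.
data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
tree = build_tree(data, threshold=2.0)
print([leaf["size"] for leaf in leaves(tree)])
```

Inner nodes of the returned tree give a coarse overview of how the data are distributed, while the leaves are the final clusters; lowering `threshold` produces a deeper tree with tighter clusters.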

Main Results:

  • The algorithm successfully analyzed large compound libraries, including the MDL Drug Data Report (MDDR) and Collection of Bioactive Reference Analogues (COBRA) databases.
  • Hierarchical trees revealed inherent cluster structures and data distribution.
  • Retrospective analysis showed significant enrichment of active compounds within specific clusters for caspase 1 (ICE) inhibitors and glucocorticoid receptor ligands.
  • Novel scaffolds for caspase 1 (ICE) inhibitors were identified by clustering combined databases.
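
Enrichment of actives within a cluster can be quantified in several ways, and the summary does not say which measure the authors used. As a hedged illustration, the sketch below computes a commonly used enrichment factor: the fraction of actives in a cluster divided by the fraction of actives in the whole data set. The function name `enrichment_factors` and the toy labels are assumptions for this example only.

```python
from collections import Counter


def enrichment_factors(cluster_ids, is_active):
    """Return {cluster: enrichment factor} for parallel lists of labels and flags.

    EF = (actives in cluster / cluster size) / (total actives / total compounds).
    A value above 1 means the cluster holds more actives than expected by chance.
    """
    if len(cluster_ids) != len(is_active):
        raise ValueError("cluster_ids and is_active must have the same length")
    total_actives = sum(1 for flag in is_active if flag)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")
    base_rate = total_actives / len(cluster_ids)
    sizes = Counter(cluster_ids)
    actives = Counter(c for c, flag in zip(cluster_ids, is_active) if flag)
    return {c: (actives[c] / sizes[c]) / base_rate for c in sizes}


# Toy example: 10 compounds in two clusters, 3 of them active (all in cluster "A").
clusters = ["A"] * 4 + ["B"] * 6
flags = [True, True, True, False] + [False] * 6
for cluster, ef in sorted(enrichment_factors(clusters, flags).items()):
    print(cluster, round(ef, 2))
```

In this toy case cluster "A" holds 75% actives against a 30% background, so its factor is 2.5, while cluster "B" contains no actives and scores 0.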

Conclusions:

  • The modified k-means clustering algorithm is effective for large-scale compound library analysis and virtual screening.
  • The method facilitates the generation of focused compound libraries and the discovery of new chemical scaffolds.
  • A Java implementation is publicly available for broader application in drug discovery research.
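
How a clustering result might be turned into a focused library is not described in the summary. One simple possibility, shown here purely as an assumption, is to collect every compound that shares a cluster with at least one known active. The names `focused_library`, `cluster_of`, and `known_actives`, and the compound identifiers, are hypothetical.

```python
def focused_library(cluster_of, known_actives):
    """Select compounds that co-cluster with known actives.

    cluster_of: dict mapping compound ID -> cluster label.
    known_actives: iterable of compound IDs with confirmed activity.
    Returns a sorted list of candidate IDs, excluding the known actives.
    """
    seeds = set(known_actives)
    # Clusters that contain at least one known active.
    hit_clusters = {cluster_of[c] for c in seeds if c in cluster_of}
    return sorted(
        compound
        for compound, cluster in cluster_of.items()
        if cluster in hit_clusters and compound not in seeds
    )


# Toy assignment of five hypothetical compounds to three clusters.
assignment = {"cpd1": 0, "cpd2": 0, "cpd3": 1, "cpd4": 1, "cpd5": 2}
print(focused_library(assignment, known_actives={"cpd1"}))
```

Compounds selected this way are structurally close to the seeds in descriptor space, so whether they contain new scaffolds depends on how the descriptors and the cluster granularity were chosen.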