Similarity by compression.

James L Melville¹, Jenna F Riley, Jonathan D Hirst

  • ¹ School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK.

Journal of Chemical Information and Modeling | January 24, 2007
Summary
This summary is machine-generated.

We developed a straightforward method for molecular similarity searching using data compression. This approach, based on normalized compression distance, outperforms traditional methods in virtual high-throughput screening.

Area of Science:

  • Computational chemistry
  • Bioinformatics
  • Cheminformatics

Background:

  • Virtual high-throughput screening (vHTS) is crucial for drug discovery.
  • Current similarity searching methods often require complex data structures and algorithms.
  • There is a need for simpler, more efficient similarity searching techniques.

Purpose of the Study:

  • To introduce a novel, simple, and effective method for similarity searching in vHTS.
  • To demonstrate the utility of normalized compression distance (NCD) for molecular similarity.
  • To compare the performance of NCD-based searching against established methods.

Main Methods:

  • Utilizing string-based molecular representations (e.g., SMILES).
  • Employing standard data compression software.
  • Calculating normalized compression distance (NCD) as a measure of similarity; NCD is a computable approximation of the normalized information distance, which is defined in terms of Kolmogorov complexity.
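
The methods listed above can be sketched in a few lines. NCD is usually defined as NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(s) is the compressed size of string s and xy is the concatenation of x and y. The sketch below is illustrative only: the choice of Python's `zlib` as the compressor, plain concatenation of the two strings, the helper names `compressed_size`, `ncd`, and `rank_by_ncd`, and the example SMILES strings are all assumptions, not details taken from the paper.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y))
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)  # plain concatenation, an assumption
    return (cxy - min(cx, cy)) / max(cx, cy)


def rank_by_ncd(query: str, library: list[str]) -> list[tuple[float, str]]:
    """Return (distance, string) pairs sorted from most to least similar."""
    return sorted((ncd(query, s), s) for s in library)


# Illustrative SMILES strings (hypothetical screening library, not from the paper)
query = "CC(=O)Oc1ccccc1C(=O)O"  # aspirin
library = [
    "OC(=O)c1ccccc1O",               # salicylic acid
    "CCO",                           # ethanol
    "CN1C=NC2=C1C(=O)N(C(=O)N2C)C",  # caffeine
]

for distance, smiles in rank_by_ncd(query, library):
    print(f"{distance:.3f}  {smiles}")
```

Smaller values indicate more similar strings. On strings this short, the fixed overhead of a general-purpose compressor is a large fraction of each compressed size, so the absolute values should be read with caution.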

Main Results:

  • Compression-based similarity searching outperformed standard protocols.
  • It also outperformed the Tanimoto coefficient used with binary fingerprints and data fusion.
  • The method is computationally inexpensive and requires only basic software.
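
For context, the baseline named in the results can be sketched as follows. This is a minimal illustration, assuming binary fingerprints are represented as sets of on-bit indices and that data fusion means taking the maximum score over several reference structures (one common fusion rule; the summary does not say which rule the authors used). The fingerprints and the helper names `tanimoto` and `max_fusion` are hypothetical.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto coefficient of two binary fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints (an assumption)
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def max_fusion(references: list[set[int]], candidate: set[int]) -> float:
    """Fuse scores from several reference fingerprints by taking the maximum (assumed rule)."""
    return max(tanimoto(ref, candidate) for ref in references)


# Hypothetical fingerprints, for illustration only
fp_query = {1, 4, 7, 9}
fp_other_query = {2, 8, 12}
fp_candidate = {1, 4, 8, 9, 12}

print(tanimoto(fp_query, fp_candidate))                      # 3 shared bits / 6 distinct bits = 0.5
print(max_fusion([fp_query, fp_other_query], fp_candidate))  # best score over both references
```

Unlike the compression-based approach, this baseline depends on a separate fingerprint-generation step, which is not shown here.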

Conclusions:

  • Normalized compression distance offers a powerful and accessible approach for molecular similarity searching.
  • This method provides a viable and potentially superior alternative for virtual high-throughput screening.
  • The simplicity and effectiveness of this technique facilitate broader application in drug discovery.