

Coverage bias in small molecule machine learning.

Fleming Kretschmer¹, Jan Seipp², Marcus Ludwig¹,³

¹Chair for Bioinformatics, Institute for Computer Science, Friedrich Schiller University Jena, Jena, Germany.

Nature Communications

January 9, 2025

Summary

This summary is machine-generated.

The datasets used to train machine learning models for small molecules often cover the space of biomolecular structures unevenly. This study introduces a new method to assess dataset coverage, which can guide future data creation and thereby improve model performance.


Area of Science:

Computational chemistry
Cheminformatics
Machine learning

Background:

Small molecule machine learning predicts properties from structures for applications like toxicity and drug discovery.
End-to-end models are increasingly popular, but their use often overlooks the domain of applicability and coverage bias in the training data.

Purpose of the Study:

To investigate the coverage of biomolecular structure space in large-scale datasets used for machine learning.
To develop methods for assessing dataset representativeness and guiding future dataset creation.

Main Methods:

Proposed a novel distance measure based on the Maximum Common Edge Subgraph (MCES) problem to quantify chemical similarity.
Developed an efficient computational approach combining Integer Linear Programming and heuristic bounds to solve the MCES problem.
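
The idea behind an MCES-based distance can be sketched on toy molecular graphs. The Python example below is a minimal illustration under stated assumptions, not the study's implementation: it takes the distance to be the number of bonds in either molecule that fall outside a maximum common edge subgraph (|E1| + |E2| - 2·|MCES|), ignores bond orders, does not require the common subgraph to be connected, and replaces the Integer Linear Programming solver with brute-force enumeration that is only feasible for a handful of atoms. The function names (mces_size, mces_distance, edge_count_lower_bound, distance_if_within) are hypothetical. The lower bound shown is a simple one that follows from |MCES| ≤ min(|E1|, |E2|); the heuristic bounds used in the study are not reproduced here.

```python
from itertools import permutations

# A molecule is a pair (atoms, bonds): atoms is a list of element symbols,
# bonds is a list of (i, j) index pairs without duplicates (heavy atoms only).

def mces_size(mol_a, mol_b):
    """Brute-force size of a maximum common edge subgraph (tiny graphs only)."""
    # Map the molecule with fewer atoms into the larger one.
    if len(mol_a[0]) > len(mol_b[0]):
        mol_a, mol_b = mol_b, mol_a
    atoms_a, bonds_a = mol_a
    atoms_b, bonds_b = mol_b
    edges_b = {frozenset(e) for e in bonds_b}
    best = 0
    # Try every injective assignment of atoms of A to atoms of B.
    for image in permutations(range(len(atoms_b)), len(atoms_a)):
        kept = 0
        for u, v in bonds_a:
            fu, fv = image[u], image[v]
            # A bond is shared if both element labels match and B has the bond.
            if (atoms_a[u] == atoms_b[fu] and atoms_a[v] == atoms_b[fv]
                    and frozenset((fu, fv)) in edges_b):
                kept += 1
        best = max(best, kept)
    return best

def mces_distance(mol_a, mol_b):
    """Assumed distance: bonds of either molecule not in the common subgraph."""
    return len(mol_a[1]) + len(mol_b[1]) - 2 * mces_size(mol_a, mol_b)

def edge_count_lower_bound(mol_a, mol_b):
    """Cheap bound: the distance is at least the difference in bond counts."""
    return abs(len(mol_a[1]) - len(mol_b[1]))

def distance_if_within(mol_a, mol_b, threshold):
    """Exact distance if it is <= threshold, else None.

    The exact search is skipped when the cheap bound already exceeds
    the threshold.
    """
    if edge_count_lower_bound(mol_a, mol_b) > threshold:
        return None
    d = mces_distance(mol_a, mol_b)
    return d if d <= threshold else None

# Toy heavy-atom graphs (hydrogens omitted).
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propanol = (["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
dimethyl_ether = (["C", "O", "C"], [(0, 1), (1, 2)])
hexane = (["C"] * 6, [(i, i + 1) for i in range(5)])

print(mces_distance(ethanol, propanol))        # one extra C-C bond
print(mces_distance(ethanol, dimethyl_ether))  # only one C-O bond is shared
print(distance_if_within(ethanol, hexane, 2))  # pruned by the cheap bound
```

The thresholded variant reflects the general pattern of combining bounds with an exact solver: when only distances below some cutoff matter, an inexpensive bound can rule out many pairs before the costly exact computation is attempted.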

Main Results:

Found that many widely used datasets exhibit non-uniform coverage of biomolecular structures.
This lack of uniform coverage limits the predictive power of machine learning models trained on these datasets.

Conclusions:

Dataset coverage is a critical, often overlooked, factor in small molecule machine learning.
The proposed MCES-based distance and divergence assessment methods can guide the creation of more representative datasets, enhancing model performance.