



Coverage bias in small molecule machine learning.

Fleming Kretschmer¹, Jan Seipp², Marcus Ludwig¹,³

  • ¹ Chair for Bioinformatics, Institute for Computer Science, Friedrich Schiller University Jena, Jena, Germany.

Nature Communications | January 9, 2025
Summary
This summary is machine-generated.

The datasets used to train machine learning models for small molecules often cover the space of biomolecular structures unevenly. This study introduces a method to assess dataset coverage, which can guide future data creation and thereby improve model performance.


Area of Science:

  • Computational chemistry
  • Cheminformatics
  • Machine learning

Background:

  • Small molecule machine learning predicts molecular properties from structures, with applications such as toxicity prediction and drug discovery.
  • End-to-end models are increasingly popular, but their use often overlooks the domain of applicability and coverage bias in the training data.

Purpose of the Study:

  • To investigate the coverage of biomolecular structure space in large-scale datasets used for machine learning.
  • To develop methods for assessing dataset representativeness and guiding future dataset creation.

Main Methods:

  • Proposed a novel distance measure based on the Maximum Common Edge Subgraph (MCES) problem to quantify chemical similarity.
  • Developed an efficient computational approach combining Integer Linear Programming and heuristic bounds to solve the MCES problem.
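The idea of pairing a cheap bound with an exact MCES computation can be illustrated on toy graphs. The sketch below is not the authors' implementation: it replaces the Integer Linear Programming step with an exhaustive search that is only feasible for very small graphs, it uses a single trivial lower bound (the difference in edge counts), and it assumes the distance is the number of edges in both graphs that are not part of the common subgraph. The function names, the `threshold` parameter, and the graph encoding (a list of atom labels plus a list of bonds as index pairs) are all hypothetical.

```python
from itertools import permutations


def mces_size(labels_a, edges_a, labels_b, edges_b):
    """Exact MCES size by exhaustive search (tiny graphs only)."""
    # Map the smaller graph into the larger one so an injective mapping exists.
    if len(labels_a) > len(labels_b):
        labels_a, edges_a, labels_b, edges_b = labels_b, edges_b, labels_a, edges_a
    edge_set_b = {frozenset(e) for e in edges_b}
    best = 0
    # Try every injective assignment of nodes of A to nodes of B.
    for image in permutations(range(len(labels_b)), len(labels_a)):
        # An edge of A is kept if both endpoint labels match and the
        # mapped endpoints are also joined by an edge in B.
        kept = sum(
            1
            for u, v in edges_a
            if labels_a[u] == labels_b[image[u]]
            and labels_a[v] == labels_b[image[v]]
            and frozenset((image[u], image[v])) in edge_set_b
        )
        best = max(best, kept)
    return best


def edge_count_lower_bound(edges_a, edges_b):
    """The common subgraph cannot have more edges than the smaller graph."""
    return abs(len(edges_a) - len(edges_b))


def mces_distance(labels_a, edges_a, labels_b, edges_b, threshold=None):
    """Return (distance, is_exact).

    Assumed definition: edges of both graphs outside the common subgraph.
    If the cheap bound already exceeds `threshold`, skip the exact search
    and return the bound, flagged as inexact.
    """
    bound = edge_count_lower_bound(edges_a, edges_b)
    if threshold is not None and bound > threshold:
        return bound, False
    common = mces_size(labels_a, edges_a, labels_b, edges_b)
    return len(edges_a) + len(edges_b) - 2 * common, True


# Heavy-atom skeletons only; bond orders and hydrogens are ignored here.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
propane = (["C", "C", "C"], [(0, 1), (1, 2)])
print(mces_distance(*ethanol, *propane))  # (2, True)
```

In this toy pair only the C-C bond is shared, so one bond of each molecule falls outside the common subgraph. The early exit shows why bounds help: pairs that are clearly far apart never reach the expensive exact step.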

Main Results:

  • Found that many widely used datasets exhibit non-uniform coverage of biomolecular structures.
  • This lack of uniform coverage limits the predictive power of machine learning models trained on these datasets.
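One simple way to make "coverage" concrete is to ask what fraction of a reference set of structures has at least one dataset member nearby. The sketch below is an illustrative assumption, not the assessment used in the study: `coverage_fraction`, the `radius` cutoff, and the toy distance are hypothetical, and plain numbers stand in for molecules so the example stays self-contained. In practice `distance` would be a structural distance between molecules.

```python
def coverage_fraction(reference, dataset, distance, radius):
    """Fraction of reference items within `radius` of some dataset item."""
    if not reference:
        raise ValueError("reference set must not be empty")
    covered = sum(
        1
        for ref in reference
        if any(distance(ref, item) <= radius for item in dataset)
    )
    return covered / len(reference)


def toy_distance(a, b):
    """Stand-in for a structural distance; numbers play the role of molecules."""
    return abs(a - b)


# The dataset clusters around one region of the reference space,
# so the second cluster of reference items is left uncovered.
reference = [0, 1, 2, 3, 10, 11, 12]
dataset = [0, 1, 2]
print(coverage_fraction(reference, dataset, toy_distance, radius=1))
```

Here four of the seven reference items lie within the cutoff of a dataset item. A model trained on such a dataset would have seen nothing resembling the remaining three, which is the kind of gap that uneven coverage produces.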

Conclusions:

  • Dataset coverage is a critical, often overlooked, factor in small molecule machine learning.
  • The proposed MCES-based distance and divergence assessment methods can guide the creation of more representative datasets, enhancing model performance.