Evaluating parameters for ligand-based modeling with random forest on sparse data sets.

Alexander Kensert¹, Jonathan Alvarsson², Ulf Norinder³,⁴

¹Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden. alexander.kensert@gmail.com.

Journal of Cheminformatics

October 12, 2018

Summary

This summary is machine-generated.

Unhashed molecular fingerprints give more accurate ligand-based predictive models than hashed fingerprints. The FEST algorithm processes large, sparse data sets quickly and with low memory usage, which makes it well suited to drug discovery.

Keywords:

Fingerprint; Machine learning; Random forest; Sparse representation; Support vector machines

Area of Science:

Computational chemistry
Cheminformatics
Machine learning in drug discovery

Background:

Ligand-based predictive modeling is crucial for drug discovery decision-making.
Increasing dataset sizes necessitate efficient data analysis for rapid and robust modeling.

Purpose of the Study:

To evaluate the efficiency of machine learning methods on sparse data structures.
To compare the impact of Morgan fingerprints (radii, hash sizes) and molecular signatures on modeling time, predictive performance, and memory usage.
To assess Scikit-learn and FEST implementations of random forest, alongside a support vector machine.

Main Methods:

Analysis of four datasets using Morgan fingerprints (varying radii and hash sizes) and molecular signatures.
Comparison of modeling time, predictive performance, and memory requirements.
Modeling with Scikit-learn and FEST implementations of random forest, and with a support vector machine.
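
The difference between hashed and unhashed fingerprints can be illustrated with a toy sketch in plain Python. It is not the authors' code: the integer identifiers are hypothetical stand-ins for Morgan substructure identifiers, the function names are invented for illustration, and folding by modulo is an assumption about how identifiers are typically mapped to a fixed hash size.

```python
from typing import Dict, Iterable, List


def unhashed_counts(feature_ids: Iterable[int]) -> Dict[int, int]:
    """Unhashed, sparse representation: one entry per distinct identifier."""
    counts: Dict[int, int] = {}
    for fid in feature_ids:
        counts[fid] = counts.get(fid, 0) + 1
    return counts


def folded_bits(feature_ids: Iterable[int], n_bits: int) -> List[int]:
    """Hashed representation: fold identifiers into a fixed-length bit vector.

    Folding by modulo is an assumption made for this sketch.
    """
    bits = [0] * n_bits
    for fid in feature_ids:
        bits[fid % n_bits] = 1
    return bits


def count_collisions(feature_ids: Iterable[int], n_bits: int) -> int:
    """Distinct identifiers minus occupied bits, i.e. features lost to folding."""
    distinct = set(feature_ids)
    occupied = {fid % n_bits for fid in distinct}
    return len(distinct) - len(occupied)


# Hypothetical identifiers for one molecule (illustrative values only).
molecule = [98513984, 2246728737, 864942730, 2246728737, 3217380708]

sparse_fp = unhashed_counts(molecule)          # keeps every distinct feature
hashed_fp = folded_bits(molecule, n_bits=1024)  # may merge distinct features
```

With a small hash size, several distinct identifiers can land on the same bit, so the model can no longer tell them apart. The unhashed dictionary avoids that loss at the cost of a much wider, sparser feature space.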

Main Results:

Unhashed fingerprints significantly outperformed hashed fingerprints in accuracy, with comparable modeling time and memory usage.
The FEST algorithm demonstrated fast execution and low memory usage, suitable for large, high-dimensional sparse data.
Support vector machines and random forests performed comparably, with support vector machines better utilizing information from larger Morgan fingerprint radii.
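
To see why sparse storage matters for high-dimensional fingerprint data, the following minimal sketch packs per-molecule feature counts into compressed sparse row (CSR) arrays using only the Python standard library. The helper names `to_csr` and `density` and the example values are hypothetical, and this is not a description of how FEST or Scikit-learn store data internally.

```python
from typing import Dict, List, Tuple


def to_csr(
    rows: List[Dict[int, int]],
) -> Tuple[List[int], List[int], List[int], Dict[int, int]]:
    """Pack per-molecule {feature_id: count} dicts into CSR arrays.

    Returns (data, indices, indptr, column_of). column_of maps each feature
    identifier to a column index assigned in order of first appearance, so
    column indices within a row are not guaranteed to be sorted.
    """
    column_of: Dict[int, int] = {}
    data: List[int] = []
    indices: List[int] = []
    indptr: List[int] = [0]
    for row in rows:
        for fid, count in sorted(row.items()):
            col = column_of.setdefault(fid, len(column_of))
            indices.append(col)
            data.append(count)
        indptr.append(len(data))  # marks where this row's entries end
    return data, indices, indptr, column_of


def density(data: List[int], indptr: List[int], n_cols: int) -> float:
    """Fraction of matrix cells that are actually stored (non-zero)."""
    n_rows = len(indptr) - 1
    if n_rows == 0 or n_cols == 0:
        return 0.0
    return len(data) / (n_rows * n_cols)


# Three hypothetical molecules with unhashed feature counts.
molecules = [{10: 1, 20: 2}, {20: 1}, {30: 3}]
data, indices, indptr, column_of = to_csr(molecules)
fill = density(data, indptr, n_cols=len(column_of))
```

Only the non-zero entries are stored, so memory grows with the number of features actually present in each molecule rather than with the total number of columns.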

Conclusions:

Unhashed Morgan fingerprints are recommended for improved accuracy in ligand-based predictive modeling.
The FEST algorithm is a viable and efficient option for handling large, sparse chemical datasets.
Both random forest and support vector machines are effective, with nuances in descriptor utilization.