A confidence predictor for logD using conformal regression and a support-vector machine.

Maris Lapins¹, Staffan Arvidsson¹, Samuel Lampa¹

¹Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden.

Journal of Cheminformatics

April 5, 2018

Summary

This summary is machine-generated.

We developed predictive models for lipophilicity (logD) to aid drug discovery. The models, built on data for 1.6 million compounds, report a prediction interval with each estimate and are accessible online.

Keywords:

Conformal prediction; LogD; Machine learning; QSAR; RDF; Support-vector machine

Area of Science:

Computational chemistry
Medicinal chemistry
Drug discovery

Background:

Lipophilicity, quantified by the water-octanol distribution coefficient (logD), is crucial for predicting Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) properties.
Accurate prediction of logD is essential for efficient drug candidate selection and optimization in pharmaceutical research.

Purpose of the Study:

To develop and validate large-scale predictive models for logD.
To provide logD predictions with associated prediction intervals at a chosen confidence level for drug discovery projects.

Main Methods:

Utilized ACD/logD data for 1.6 million compounds from the ChEMBL database.
Employed a support-vector machine with a linear kernel and conformal prediction methodology.
Evaluated model performance using metrics such as predictive ability and median prediction interval width.
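
The conformal step described above can be sketched in a few lines. This is a minimal illustration of inductive (split) conformal regression, not the authors' implementation: the ordinary least-squares line stands in for the linear-kernel support-vector machine, the data are synthetic, the nonconformity score is the plain absolute residual (the study may use a different or normalized measure), and the names fit_line, predict, and conformal_half_width are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line; an assumed stand-in for the linear SVM point predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def conformal_half_width(scores, confidence):
    """Half-width of the split-conformal interval at the given confidence level."""
    n = len(scores)
    # Use the ceil((n + 1) * confidence)-th smallest calibration score.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        return math.inf  # too few calibration points for this confidence level
    return sorted(scores)[k - 1]


# Synthetic data (an assumption for illustration only).
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(600)]
ys = [0.5 * x - 1.0 + random.gauss(0, 0.4) for x in xs]

# Split into proper training, calibration, and test sets.
train, calib, test = slice(0, 300), slice(300, 500), slice(500, 600)
slope, intercept = fit_line(xs[train], ys[train])


def predict(x):
    return slope * x + intercept


# Nonconformity scores: absolute residuals on the calibration set.
scores = [abs(y - predict(x)) for x, y in zip(xs[calib], ys[calib])]
half_80 = conformal_half_width(scores, 0.80)
half_90 = conformal_half_width(scores, 0.90)

# Fraction of test points whose true value falls inside the 90% interval.
coverage_90 = sum(
    abs(y - predict(x)) <= half_90 for x, y in zip(xs[test], ys[test])
) / len(xs[test])
print(f"80%: +/-{half_80:.2f}  90%: +/-{half_90:.2f}  coverage at 90%: {coverage_90:.2f}")
```

The interval for a new compound is then predict(x) plus or minus the half-width. Raising the confidence level can only widen the interval, which is why the 90% interval is never narrower than the 80% one.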

Main Results:

Achieved a predictive ability of [Formula: see text].
The best performing model provided median prediction intervals of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence.
Developed an online service with an OpenAPI interface and a web page for predictions.
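
The median prediction interval reported above is an efficiency measure, and it is usually read alongside empirical coverage (the share of true values that fall inside their intervals). The sketch below shows how both could be computed from a list of intervals. The function name interval_summary and the toy numbers are hypothetical, and the width is taken as the full upper-minus-lower span, which is an assumption about convention rather than something the summary states.

```python
from statistics import median


def interval_summary(intervals, truths):
    """Return (median interval width, fraction of true values covered)."""
    widths = [upper - lower for lower, upper in intervals]
    covered = sum(
        lower <= y <= upper for (lower, upper), y in zip(intervals, truths)
    )
    return median(widths), covered / len(truths)


# Toy intervals and true values in log units (made up for illustration).
intervals = [(1.0, 2.0), (0.5, 1.1), (2.0, 3.4), (-0.2, 0.4)]
truths = [1.5, 1.3, 2.1, 0.0]

med_width, coverage = interval_summary(intervals, truths)
print(f"median width: {med_width:.2f}, coverage: {coverage:.2f}")
```

For a valid conformal predictor the coverage should sit close to the requested confidence level, so models are compared mainly on how narrow the median interval is at that level.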

Conclusions:

The developed logD prediction models are accurate and report a prediction interval with each estimate.
The models and associated data are made available through various online services and downloadable formats to support drug discovery efforts.
The large-scale prediction of logD for millions of compounds facilitates broader application in cheminformatics and drug design.