



A confidence predictor for logD using conformal regression and a support-vector machine.

Maris Lapins¹, Staffan Arvidsson¹, Samuel Lampa¹

  • ¹ Department of Pharmaceutical Biosciences, Uppsala University, Box 591, 751 24, Uppsala, Sweden.

Journal of Cheminformatics | April 5, 2018
Summary
This summary is machine-generated.

We developed predictive models for lipophilicity (logD) to aid drug discovery. Built on data for 1.6 million compounds, the models return each prediction with an interval at a stated confidence level and are accessible online.

Keywords: Conformal prediction, LogD, Machine learning, QSAR, RDF, Support-vector machine


Area of Science:

  • Computational chemistry
  • Medicinal chemistry
  • Drug discovery

Background:

  • Lipophilicity, quantified by the water-octanol distribution coefficient (logD), is crucial for predicting Absorption, Distribution, Metabolism, Excretion, and Toxicity (ADMET) properties.
  • Accurate prediction of logD is essential for efficient drug candidate selection and optimization in pharmaceutical research.

Purpose of the Study:

  • To develop and validate large-scale predictive models for logD.
  • To provide logD predictions with associated prediction intervals at stated confidence levels for drug discovery projects.

Main Methods:

  • Utilized ACD/logD data for 1.6 million compounds from the ChEMBL database.
  • Employed a support-vector machine with a linear kernel and conformal prediction methodology.
  • Evaluated model performance using metrics such as predictive ability and median prediction interval width.
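
How conformal regression turns a point prediction into an interval can be illustrated with a minimal split (inductive) conformal sketch. It is not the study's implementation, and it rests on several assumptions: a one-feature least-squares line stands in for the linear-kernel support-vector machine, the data are synthetic, the nonconformity score is the plain absolute residual (which gives one constant interval width and may differ from the measure used in the study), and every function name is hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares line; a stand-in for the linear-kernel SVM (assumption)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def conformal_half_width(residuals, confidence):
    """Calibrated quantile of the absolute residuals on the calibration set."""
    n = len(residuals)
    rank = math.ceil((n + 1) * confidence)  # finite-sample corrected rank
    if rank > n:
        return math.inf  # too few calibration points for this confidence
    return sorted(residuals)[rank - 1]


def predict_interval(model, half_width, x):
    """Point prediction widened by the calibrated half-width."""
    slope, intercept = model
    centre = slope * x + intercept
    return centre - half_width, centre + half_width


# Synthetic data (illustrative only): y depends linearly on x plus noise.
rng = random.Random(0)
xs = [rng.uniform(-3.0, 3.0) for _ in range(600)]
ys = [1.5 * x + 0.2 + rng.gauss(0.0, 0.5) for x in xs]

# Proper training set fits the model; a disjoint calibration set sizes intervals.
model = fit_line(xs[:300], ys[:300])
slope, intercept = model
residuals = [abs(y - (slope * x + intercept))
             for x, y in zip(xs[300:500], ys[300:500])]

hw80 = conformal_half_width(residuals, 0.80)
hw90 = conformal_half_width(residuals, 0.90)
print("half-width at 80% confidence:", round(hw80, 3))
print("half-width at 90% confidence:", round(hw90, 3))
print("90% interval for x = 1.0:", predict_interval(model, hw90, 1.0))
```

Raising the confidence level selects a higher-ranked calibration residual, so the interval can only widen. This is the trade-off behind reporting interval widths at two confidence levels.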

Main Results:

  • Achieved a predictive ability of [Formula: see text].
  • The best performing model provided median prediction intervals of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence.
  • Developed an online service with an OpenAPI interface and a web page for predictions.
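
The interval-based figures above can be read as summaries over a held-out test set. The sketch below shows one way to compute a median prediction interval width and the empirical coverage (the fraction of observed values that fall inside their intervals). The four intervals and observed values are made up for illustration, the function names are hypothetical, and width is taken as upper limit minus lower limit, which may not match the convention used in the paper.

```python
from statistics import median


def median_interval_width(intervals):
    """Median of (upper - lower) across all prediction intervals."""
    return median(hi - lo for lo, hi in intervals)


def empirical_coverage(intervals, observed):
    """Fraction of observed values lying inside their prediction interval."""
    hits = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, observed))
    return hits / len(observed)


# Made-up intervals and observed values, in log units (illustrative only).
intervals = [(1.0, 2.0), (0.5, 2.5), (3.0, 3.4), (-1.0, 0.0)]
observed = [1.5, 2.6, 3.1, -0.5]

print("median width:", median_interval_width(intervals))
print("coverage:", empirical_coverage(intervals, observed))
```

For a well-calibrated conformal predictor, the empirical coverage on a large test set should be close to the chosen confidence level. The median width then indicates how informative the intervals are at that level.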

Conclusions:

  • The developed logD prediction models are accurate and provide prediction intervals at stated confidence levels.
  • The models and associated data are made available through various online services and downloadable formats to support drug discovery efforts.
  • The large-scale prediction of logD for millions of compounds facilitates broader application in cheminformatics and drug design.