Comparing massively-multitask regression algorithms for drug discovery.

Eric J Martin (1), Xiang-Wei Zhu (2), Patrick Riley (3)

  • (1) Novartis Biomedical Research, Emeryville, CA, 94608, USA. eric.martin@novartis.com.

Journal of Computer-Aided Molecular Design | February 4, 2026

Summary
This summary is machine-generated.

Massively-multitask regression models (MMRMs) predict compound activity in drug discovery substantially better than single-task models. However, large test sets, such as a 75/25 training/test split, lead to underestimated accuracy, so realistic splits with small test sets are needed for reliable evaluation.

Keywords:
Algorithm comparison, Drug discovery, Imputation, Multitask regression, QSAR, Virtual screening

Area of Science:

  • Computational Chemistry
  • Cheminformatics
  • Machine Learning in Drug Discovery

Background:

  • Massively-multitask regression models (MMRMs) have emerged as powerful tools for predicting compound bioactivity in drug discovery.
  • These models, trained on extensive datasets, offer accuracy comparable to experimental measurements.

Purpose of the Study:

  • To compare the performance of six leading MMRMs (pQSAR, Alchemite, MT-DNN, MetaNN, Macau, IMC) for bioactivity profile imputation.
  • To evaluate the impact of different training/test set splits on MMRM performance and accuracy estimation.

Main Methods:

  • Six MMRMs were trained by experts on identical datasets comprising 159 kinase and 4276 ChEMBL assays.
  • Models were evaluated using both 75/25 and 99+/<1% training/test set splits to assess performance under different levels of data availability.
  • The comparison included both qualitative assessment and statistically rigorous analysis, with single-task random forest regression (ST-RFR) as the benchmark.
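
The two split ratios described above can be illustrated with a minimal sketch. It assumes a sparse compound-by-assay activity table stored as a dictionary and a purely random hold-out within each assay; the summary does not say how compounds were actually assigned to the test sets, so the function name `split_per_assay`, the random assignment, and the toy data are all hypothetical.

```python
import random
from collections import defaultdict


def split_per_assay(measurements, test_fraction, seed=0):
    """Randomly hold out `test_fraction` of each assay's measurements.

    `measurements` maps (compound_id, assay_id) -> activity value.
    Returns (train, test) dicts with the same key/value layout.
    Assumption: a random per-assay split, not the authors' procedure.
    """
    rng = random.Random(seed)
    by_assay = defaultdict(list)
    for compound, assay in measurements:
        by_assay[assay].append(compound)

    train, test = {}, {}
    for assay, compounds in by_assay.items():
        compounds = sorted(compounds)  # fixed order so the seed is reproducible
        rng.shuffle(compounds)
        # Keep at least one test measurement per assay.
        n_test = max(1, round(test_fraction * len(compounds)))
        for i, compound in enumerate(compounds):
            target = test if i < n_test else train
            target[(compound, assay)] = measurements[(compound, assay)]
    return train, test


# Toy data (made up): 3 assays x 200 compounds, every cell measured.
toy = {
    (f"cpd{c}", f"assay{a}"): 5.0 + 0.01 * c
    for a in range(3)
    for c in range(200)
}

train_75, test_25 = split_per_assay(toy, test_fraction=0.25)
train_99, test_01 = split_per_assay(toy, test_fraction=0.01)
print(len(train_75), len(test_25))  # 450 150
print(len(train_99), len(test_01))  # 594 6
```

With the 99/1 split, each assay keeps nearly all of its measurements for training, which is closer to how a model would be used once deployed; the 75/25 split withholds a quarter of the data from training.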

Main Results:

  • MMRMs significantly outperformed the ST-RFR model in bioactivity profile imputation.
  • Performance varied considerably based on the training/test split; 75/25 splits led to a substantial underestimation of model accuracy compared to 99+/<1% splits.
  • While MMRMs excel at imputing profiles within the training data distribution, their advantage diminishes for compounds dissimilar to the training set.

Conclusions:

  • MMRMs are highly effective for tasks like hit-finding, off-target prediction, and drug repurposing within the chemical space of the training data.
  • The choice of data splitting strategy critically impacts the perceived accuracy of MMRMs, necessitating the use of realistic, smaller test sets for reliable evaluation.
  • MMRMs' utility is greatest for exploring known chemical spaces, while their performance on novel chemical entities requires further investigation.
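
One way to make "dissimilar to the training set" concrete is to score each query compound by its similarity to the nearest training compound. The sketch below uses Tanimoto similarity on fingerprint bit sets, a common cheminformatics choice; the summary does not state which similarity measure the authors used, so the measure, the function names, and the toy fingerprints are assumptions for illustration only.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def nearest_training_similarity(query, training_fps):
    """Highest Tanimoto similarity between `query` and any training fingerprint."""
    return max(tanimoto(query, fp) for fp in training_fps)


# Toy fingerprints (made up): each set lists the indices of the bits that are on.
training = [{1, 2, 3, 4}, {2, 3, 5, 8}, {10, 11, 12}]
close = {1, 2, 3, 4, 6}    # shares 4 of 5 bits with the first training compound
novel = {1, 20, 21, 22}    # shares only 1 bit with any training compound

print(nearest_training_similarity(close, training))  # 0.8
print(round(nearest_training_similarity(novel, training), 3))  # 0.143
```

A compound like `novel`, with low similarity to everything in the training set, falls in the regime where the conclusions above say the MMRM advantage diminishes.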