Effect of missing data on multitask prediction methods.

Antonio de la Vega de León¹, Beining Chen², Valerie J Gillet³

¹Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK. a.vega@sheffield.ac.uk.

Journal of Cheminformatics

May 24, 2018

Summary

This summary is machine-generated.

Multitask prediction in chemoinformatics must cope with sparse data. As data is removed, performance decreases slowly at first and then rapidly; this pattern holds for both deep neural networks and the Bayesian method Macau.

Keywords:

Deep neural networks; Macau; Missing data; Multitask prediction; Sparse data sets

Area of Science:

Chemoinformatics
Computational Chemistry
Machine Learning

Background:

Multitask prediction is crucial in chemoinformatics for profiling compound activities.
Deep neural networks (DNNs) are increasingly used for multitask prediction.
Multitarget datasets are often sparse, posing challenges for model training.

Purpose of the Study:

To investigate the impact of data sparsity on multitask prediction performance.
To compare the effects of missing data on DNNs and Bayesian probabilistic matrix factorization (Macau).

Main Methods:

Simulated data sparsity by removing data from complete datasets.
Trained DNNs and Macau models on these sparse datasets.
Compared model performance across varying levels of data removal.
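
The data-removal step can be illustrated with a minimal sketch. It is not the authors' code: the function name `remove_entries`, the synthetic matrix, the removal levels, and the choice to remove entries uniformly at random are all assumptions made for illustration. The summary does not specify the removal scheme.

```python
import numpy as np


def remove_entries(activity, fraction, rng):
    """Return a copy of a complete compound-by-task matrix in which
    `fraction` of the entries are replaced by NaN (treated as missing).

    Hypothetical helper: entries are removed uniformly at random, which
    is an assumption, not a scheme stated in the summary.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    sparse = np.array(activity, dtype=float)  # independent float copy
    n_remove = int(round(fraction * sparse.size))
    # Pick distinct flat positions to blank out.
    flat_idx = rng.choice(sparse.size, size=n_remove, replace=False)
    mask = np.zeros(sparse.size, dtype=bool)
    mask[flat_idx] = True
    sparse[mask.reshape(sparse.shape)] = np.nan
    return sparse


rng = np.random.default_rng(0)
# Synthetic stand-in for a complete data set: 200 compounds x 10 tasks.
complete = rng.normal(size=(200, 10))

# Illustrative removal levels (assumed, not taken from the study).
levels = [0.0, 0.2, 0.5, 0.8, 0.9]
sparse_sets = {f: remove_entries(complete, f, rng) for f in levels}

# Fraction of entries still observed at each level.
observed_fraction = {
    f: float(np.mean(~np.isnan(m))) for f, m in sparse_sets.items()
}
```

Each matrix in `sparse_sets` could then be passed to whichever multitask model is being compared, with NaN entries excluded from training, so that performance can be tracked as a function of the removed fraction.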

Main Results:

Both DNNs and Macau showed similar performance degradation with increasing data sparsity.
Performance decrease was gradual initially, then accelerated significantly after substantial data removal.
Identified a critical threshold for data requirements in multitask prediction.

Conclusions:

Data sparsity significantly affects multitask prediction model performance.
Understanding data requirements is essential for reliable biological activity profiling.
This study offers insights into data sufficiency for effective chemoinformatics models.