Effect of missing data on multitask prediction methods.

Antonio de la Vega de León1, Beining Chen2, Valerie J Gillet3

  • 1Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK. a.vega@sheffield.ac.uk.

Journal of Cheminformatics | May 24, 2018
Summary
This summary is machine-generated.

Multitask prediction in chemoinformatics must cope with sparse data. As data are removed, performance decreases slowly at first and then rapidly, for both deep neural networks and Bayesian probabilistic matrix factorization (Macau).

Keywords:
Deep neural networks, Macau, Missing data, Multitask prediction, Sparse data sets

Area of Science:

  • Chemoinformatics
  • Computational Chemistry
  • Machine Learning

Background:

  • Multitask prediction is crucial in chemoinformatics for profiling compound activities.
  • Deep neural networks (DNNs) are increasingly used for multitask prediction.
  • Multitarget datasets are often sparse, posing challenges for model training.

Purpose of the Study:

  • To investigate the impact of data sparsity on multitask prediction performance.
  • To compare the effects of missing data on DNNs and Bayesian probabilistic matrix factorization (Macau).

Main Methods:

  • Simulated data sparsity by removing data from complete datasets.
  • Trained DNNs and Macau models on these sparse datasets.
  • Compared model performance across varying levels of data removal.
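The data-removal step can be illustrated with a minimal sketch. It assumes the activities are held in a compounds-by-targets NumPy matrix and that entries are removed uniformly at random; the summary does not say how entries were chosen, so both are assumptions. The names `remove_entries` and `sparsity` and the random example matrix are hypothetical.

```python
import numpy as np


def remove_entries(activities, fraction, seed=0):
    """Return a copy of `activities` with `fraction` of its observed entries set to NaN.

    Assumption: entries are removed uniformly at random across the whole matrix.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = np.random.default_rng(seed)
    # Copy as float so the complete matrix is left untouched and can hold NaN.
    sparse = np.array(activities, dtype=float)
    # Flat indices of the entries that currently hold a value.
    observed = np.flatnonzero(~np.isnan(sparse))
    n_remove = int(round(fraction * observed.size))
    removed = rng.choice(observed, size=n_remove, replace=False)
    sparse.flat[removed] = np.nan
    return sparse


def sparsity(matrix):
    """Fraction of entries in `matrix` that are missing (NaN)."""
    return float(np.isnan(matrix).mean())


if __name__ == "__main__":
    # Hypothetical complete data set: 100 compounds x 5 targets of random values.
    complete = np.random.default_rng(1).normal(size=(100, 5))
    for fraction in (0.0, 0.5, 0.9):
        sparse = remove_entries(complete, fraction)
        print(f"removed {fraction:.0%}: sparsity = {sparsity(sparse):.2f}")
```

Each sparse matrix produced this way could then be used to train a model, with the complete matrix kept aside for comparison across removal levels.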

Main Results:

  • Both DNNs and Macau showed similar performance degradation with increasing data sparsity.
  • Performance decrease was gradual initially, then accelerated significantly after substantial data removal.
  • Identified a critical threshold for data requirements in multitask prediction.

Conclusions:

  • Data sparsity significantly affects multitask prediction model performance.
  • Understanding data requirements is essential for reliable biological activity profiling.
  • This study offers insights into data sufficiency for effective chemoinformatics models.