Benchmarking Missing Data Imputation Methods for Time Series Using Real-World Test Cases.

Adedolapo Aishat Toye¹, Asuman Celik¹, Samantha Kleinberg¹

  • ¹Department of Computer Science, Stevens Institute of Technology, USA.

Proceedings of Machine Learning Research | September 2, 2025
Summary
This summary is machine-generated.

Imputation methods for healthcare time series perform best when data are missing completely at random, not under more realistic missingness patterns. Linear interpolation showed the lowest error across all missing data mechanisms, highlighting the need for better evaluation practices and for imputation techniques that handle complex missingness.

Area of Science:

  • Healthcare data science
  • Biostatistics
  • Machine learning in medicine

Background:

  • Missing data is a significant challenge in healthcare analytics.
  • Current imputation methods are often evaluated on unrealistic missing data patterns.
  • Real-world missingness mechanisms (missing completely at random, MCAR; missing at random, MAR; not missing at random, NMAR) require robust imputation strategies.
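
The three mechanisms named above are usually distinguished by what the probability of missingness is allowed to depend on. The block below gives the standard textbook formulation; the notation is an assumption introduced here and does not come from the summary: R is the missingness indicator, Y_obs the observed values, and Y_mis the values that are missing.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}}) \\
\text{NMAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) \text{ depends on } Y_{\text{mis}}
\end{align*}
```

Under MCAR the gaps are unrelated to the data, under MAR they can be explained by values that were observed, and under NMAR they depend on the very values that are missing.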

Purpose of the Study:

  • To assess the real-world accuracy of 12 imputation methods across three missing data mechanisms (MCAR, MAR, NMAR).
  • To compare imputation performance on continuous glucose monitoring and heart rate time series data.
  • To evaluate the impact of missingness percentages (5-30%) on imputation accuracy.

Main Methods:

  • Simulated missingness in Loop (CGM) and All of Us (heart rate) datasets according to MCAR, MAR, and NMAR mechanisms.
  • Tested 12 state-of-the-art and commonly used imputation methods.
  • Evaluated accuracy using root mean square error (RMSE) and bias metrics across demographic groups.
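
The general idea behind the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names, the rank-based weighting, and the use of a single fully observed covariate for MAR are all assumptions made for the example. Each function returns a boolean mask marking which points to hide, and `rmse` and `bias` score an imputation only on the hidden points.

```python
import numpy as np

def _sample_mask(weights, frac, rng):
    """Hide round(frac * n) points, chosen with probability proportional to weights."""
    n = weights.shape[0]
    k = int(round(frac * n))
    idx = rng.choice(n, size=k, replace=False, p=weights / weights.sum())
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask

def _ranks(x):
    """Ranks 1..n of the entries of x (smallest value gets rank 1)."""
    return np.argsort(np.argsort(x)) + 1.0

def mcar_mask(values, frac, rng):
    # MCAR: every point is equally likely to be hidden.
    return _sample_mask(np.ones(values.shape[0]), frac, rng)

def mar_mask(covariate, frac, rng):
    # MAR (illustrative): missingness depends only on a fully observed covariate.
    return _sample_mask(_ranks(covariate), frac, rng)

def nmar_mask(values, frac, rng):
    # NMAR (illustrative): larger values are more likely to be hidden.
    return _sample_mask(_ranks(values), frac, rng)

def rmse(truth, imputed, mask):
    """Root mean square error over the hidden points only."""
    err = imputed[mask] - truth[mask]
    return float(np.sqrt(np.mean(err ** 2)))

def bias(truth, imputed, mask):
    """Mean signed error over the hidden points only."""
    return float(np.mean(imputed[mask] - truth[mask]))
```

Because the ground truth is known before the mask is applied, any imputation method can be scored by hiding points, imputing them, and comparing against the original values at different missingness fractions.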

Main Results:

  • Imputation accuracy was significantly higher for data missing completely at random (MCAR) than for data missing at random (MAR) or not missing at random (NMAR).
  • Linear interpolation demonstrated the lowest RMSE and minimal bias across all tested mechanisms and demographic groups.
  • Existing evaluation practices may overestimate imputation method performance in real-world scenarios.
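
For readers unfamiliar with the baseline mentioned above, linear interpolation fills each gap with a straight line between the nearest observed neighbours. The sketch below uses NumPy's `np.interp`; the function name `linear_interpolate` and its arguments are assumptions for illustration, and the summary does not say which implementation the study used.

```python
import numpy as np

def linear_interpolate(times, values, missing):
    """Fill masked points by linear interpolation between observed neighbours.

    times   : increasing 1-D array of timestamps
    values  : 1-D array of measurements (entries under the mask are ignored)
    missing : boolean mask, True where a value must be imputed
    """
    times = np.asarray(times, dtype=float)
    filled = np.asarray(values, dtype=float).copy()
    observed = ~missing
    # np.interp holds the first/last observed value constant for gaps
    # that fall before the first or after the last observed point.
    filled[missing] = np.interp(times[missing], times[observed], filled[observed])
    return filled
```

On a series that is locally close to linear, the filled values land near the true ones, which is one plausible reason a method this simple can be a strong baseline for densely sampled signals.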

Conclusions:

  • Current imputation method evaluations do not reflect real-world performance with realistic missing data patterns.
  • Linear interpolation offers a reliable baseline for imputation, even with complex missingness.
  • Further research should focus on developing improved evaluation methodologies and imputation techniques tailored to real-world missing data mechanisms.