Missing data imputation using utility-based regression and sampling approaches.

Halimu N Haliduola (1), Frank Bretz (2), Ulrich Mansmann (1)

  • (1) Institute for Medical Information Processing, Biometry and Epidemiology - IBE, LMU Munich, Munich, Germany.

Computer Methods and Programs in Biomedicine | October 19, 2022
Summary
This summary is machine-generated.

This study introduces a novel hybrid approach combining utility-based regression and oversampling to address missing data in clinical trials. The method effectively reduces bias and improves predictions in missing not at random scenarios.

Keywords:
Machine learning, Missing data, SMOTER, Utility-based regression

Area of Science:

  • Statistics
  • Machine Learning
  • Clinical Trials

Background:

  • Missing data, particularly missing not at random (MNAR), poses a significant challenge in scientific experiments and clinical trials.
  • Standard regression error measures are inadequate for imbalanced learning problems common with MNAR data, especially in clinical settings with extreme values.
  • Existing methods such as random forests and multiple imputation can introduce systematic bias and underestimate key statistical measures when data are MNAR.
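
To make the MNAR problem concrete, here is a minimal simulation sketch. It is not taken from the study: the normal outcome distribution, the cutoff of 1.0, and the missingness probabilities are all assumptions chosen for illustration. It shows how a summary computed only from observed values can be pulled downward when large values are more likely to be missing.

```python
import random
import statistics


def simulate_mnar(n=10_000, seed=0):
    """Draw outcomes, then hide large values more often than small ones."""
    rng = random.Random(seed)
    full = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = []
    for y in full:
        # Hypothetical MNAR rule: the chance of being missing depends on
        # the value that would have been recorded.
        p_missing = 0.8 if y > 1.0 else 0.1
        if rng.random() >= p_missing:
            observed.append(y)
    return full, observed


full, observed = simulate_mnar()
full_mean = statistics.mean(full)
observed_mean = statistics.mean(observed)

print(f"mean of complete data: {full_mean:.3f}")
print(f"mean of observed data: {observed_mean:.3f}")
```

Because the extreme values are the ones most often lost, any imputation model trained only on the observed cases sees too few of them, which is the imbalance the study sets out to address.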

Purpose of the Study:

  • To develop and evaluate a hybrid imbalanced learning approach for handling MNAR data in cross-sectional clinical trial settings.
  • To address the limitations of standard predictive error measures in regression for imbalanced datasets.
  • To mitigate the systematic bias observed in conventional methods when dealing with MNAR data.
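
The limitation of standard error measures can be illustrated with a small sketch. The data below are hypothetical (95 ordinary values and 5 extreme ones), and the constant predictor is only a stand-in for a model that ignores rare cases. The overall mean squared error looks moderate while the error on the rare cases is far larger.

```python
import statistics

# Hypothetical imbalanced outcome: many ordinary values, few extreme ones.
y_true = [1.0] * 95 + [10.0] * 5

# A constant predictor that always returns the overall mean.
prediction = statistics.mean(y_true)


def mse(values, pred):
    """Mean squared error of a constant prediction over the given values."""
    return sum((v - pred) ** 2 for v in values) / len(values)


overall_mse = mse(y_true, prediction)
rare_mse = mse([v for v in y_true if v >= 10.0], prediction)

print(f"overall MSE:    {overall_mse:.2f}")
print(f"rare-case MSE:  {rare_mse:.2f}")
```

An error measure averaged over all cases is dominated by the ordinary values, so it gives little incentive to predict the extremes well.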

Main Methods:

  • Investigated hybrid imbalanced learning combining utility-based regression (UBR) with synthetic minority oversampling technique for regression (SMOTER).
  • UBR was employed to optimize the product of conditional probability density (estimated via quantile regression forests) and a utility function.
  • SMOTER was utilized to oversample relevant rare cases, enhancing the model's ability to handle imbalanced data.
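
The two ingredients above can be sketched in simplified form. This is not the authors' implementation. The function names, the threshold-based definition of "rare", the utility functions, and the use of a plain list of samples in place of a conditional density from quantile regression forests are all assumptions made for illustration. The first function generates synthetic rare cases by interpolating between a rare case and one of its nearest rare neighbours, in the spirit of SMOTER. The second picks the prediction with the highest average utility over samples from the conditional distribution.

```python
import math
import random


def smoter_oversample(X, y, threshold, n_new, k=3, seed=0):
    """Create n_new synthetic rare cases (y >= threshold) by interpolation."""
    rng = random.Random(seed)
    rare = [(x, t) for x, t in zip(X, y) if t >= threshold]
    if len(rare) < 2:
        raise ValueError("need at least two rare cases to interpolate")
    new_X, new_y = [], []
    for _ in range(n_new):
        i = rng.randrange(len(rare))
        x_a, y_a = rare[i]
        # Pick one of the k nearest rare neighbours of the seed case.
        others = [rare[j] for j in range(len(rare)) if j != i]
        others.sort(key=lambda case: math.dist(case[0], x_a))
        x_b, y_b = rng.choice(others[:k])
        frac = rng.random()
        x_new = [a + frac * (b - a) for a, b in zip(x_a, x_b)]
        # Target: average of both targets, weighted by closeness to each seed.
        d_a, d_b = math.dist(x_new, x_a), math.dist(x_new, x_b)
        total = d_a + d_b
        y_new = y_a if total == 0 else (d_b * y_a + d_a * y_b) / total
        new_X.append(x_new)
        new_y.append(y_new)
    return new_X, new_y


def utility_based_prediction(conditional_samples, candidates, utility):
    """Return the candidate with the highest average utility over the samples."""
    def expected_utility(pred):
        total = sum(utility(true, pred) for true in conditional_samples)
        return total / len(conditional_samples)
    return max(candidates, key=expected_utility)


def plain_utility(true, pred):
    """Negative squared error: every case counts equally."""
    return -(true - pred) ** 2


def weighted_utility(true, pred, threshold=5.0, rare_weight=10.0):
    """Hypothetical utility: errors on extreme true values cost more."""
    weight = rare_weight if true >= threshold else 1.0
    return -weight * (true - pred) ** 2


# Toy data: four ordinary cases and three rare, extreme ones.
X = [[0.0], [0.1], [0.2], [0.3], [5.0], [5.5], [6.0]]
y = [1.0, 1.1, 0.9, 1.2, 9.0, 9.5, 10.0]
syn_X, syn_y = smoter_oversample(X, y, threshold=5.0, n_new=4)

# Samples standing in for an estimated conditional distribution of y given x.
samples = [1.0, 1.2, 0.9, 1.1, 6.0]
candidates = [i * 0.5 for i in range(15)]  # grid 0.0, 0.5, ..., 7.0
plain_pred = utility_based_prediction(samples, candidates, plain_utility)
weighted_pred = utility_based_prediction(samples, candidates, weighted_utility)
print(plain_pred, weighted_pred)
```

With equal weights the chosen prediction stays near the bulk of the samples, while the utility that favours extreme values moves it toward the rare case. This is the intuition behind combining oversampling with a utility-aware prediction rule.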

Main Results:

  • Simulations demonstrated that the proposed hybrid method yields plausible predictions and significantly reduces bias in realistic MNAR data scenarios.
  • Compared to standard approaches (random forests, multiple imputation), the proposed method showed superior performance in mitigating systematic bias.
  • Application to an antidepressant clinical trial dataset confirmed the systematic bias in conventional methods and the effectiveness of the proposed approach.

Conclusions:

  • The proposed hybrid imbalanced learning strategy effectively handles missing not at random data in clinical trials.
  • Utility-based learning offers a promising avenue for improving the analysis of clinical trial data with missing values.
  • Integration of utility-based learning strategies is encouraged for more accurate and less biased analyses in clinical research.