Comparative methods for handling missing data in large databases.

Antonia J Henry¹, Nathanael D Hevelone, Stuart Lipsitz

¹Division of Vascular & Endovascular Surgery, Brigham & Women's Hospital, Harvard Medical School, Boston, Mass; Center for Surgery and Public Health, Brigham & Women's Hospital, Harvard Medical School, Boston, Mass.

Journal of Vascular Surgery

July 9, 2013

Summary

This summary is machine-generated.

How missing race data are handled can bias health services research. In models predicting major amputation in patients with critical limb ischemia, reweighted estimating equations produced the least bias, while the missing indicator variable method produced the most.

Area of Science:

Health Services Research
Biostatistics
Epidemiology

Background:

Missing data in complex survey databases present significant challenges for health services researchers.
Categorical variables, such as race, are particularly susceptible to multifactorial missingness.
Accurate analysis of large datasets requires effective strategies for handling missing data.

Purpose of the Study:

To evaluate the bias introduced by five different methods for handling missing race data.
To compare the performance of these methods in predicting major amputation in patients with critical limb ischemia (CLI).
To provide empirical evidence to guide the selection of appropriate missing data handling techniques.

Main Methods:

Analysis of a complex survey database (Nationwide Inpatient Sample, 2003-2007) with simulated missing race data (5%, 15%, 30%).
Comparison of five methods: complete case analysis, replacement with observed frequencies, missing indicator variable, multiple imputation, and reweighted estimating equations.
Bias estimation by comparing regression coefficients from simulated data sets to those from fully observed data.
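
The simulation design above can be sketched in a few lines of Python. This is only an illustration of the mechanics, not the study's code: the toy cohort, the function names, and the choice to hide values completely at random are all assumptions, and the event rate in one category stands in for the regression coefficients that the study actually compared. "Replacement with observed frequencies" is read here as drawing each missing value from the observed distribution, which is one possible interpretation. Multiple imputation and reweighted estimating equations are omitted for brevity.

```python
import random
from collections import Counter

def mask_at_random(values, fraction, rng):
    """Set `fraction` of the entries to None, chosen completely at random."""
    n_missing = round(len(values) * fraction)
    hidden = set(rng.sample(range(len(values)), n_missing))
    return [None if i in hidden else v for i, v in enumerate(values)]

def complete_case(records):
    """Drop every record whose race is missing."""
    return [(race, event) for race, event in records if race is not None]

def fill_from_observed(records, rng):
    """Draw each missing race from the observed category frequencies (assumed reading)."""
    counts = Counter(race for race, _ in records if race is not None)
    categories = list(counts)
    weights = [counts[c] for c in categories]
    return [
        (race if race is not None else rng.choices(categories, weights=weights)[0], event)
        for race, event in records
    ]

def missing_indicator(records):
    """Keep every record and treat 'missing' as its own category."""
    return [(race if race is not None else "missing", event) for race, event in records]

def event_rate(records, category):
    """Stand-in estimate: share of records in `category` that had the event."""
    events = [event for race, event in records if race == category]
    return sum(events) / len(events)

rng = random.Random(0)
# Hypothetical toy cohort: (race category, major amputation yes/no).
full = [(rng.choice(["A", "B", "C"]), rng.random() < 0.2) for _ in range(1000)]
reference = event_rate(full, "A")  # estimate from the fully observed data

for fraction in (0.05, 0.15, 0.30):
    races = mask_at_random([race for race, _ in full], fraction, rng)
    observed = [(race, event) for race, (_, event) in zip(races, full)]
    for name, data in [
        ("complete case", complete_case(observed)),
        ("observed frequencies", fill_from_observed(observed, rng)),
        ("missing indicator", missing_indicator(observed)),
    ]:
        bias = event_rate(data, "A") - reference
        print(f"{fraction:.0%} missing, {name}: bias = {bias:+.4f}")
```

Because the toy values are hidden completely at random and the estimate is a simple rate, the printed biases are small and do not reproduce the study's ranking of methods; the sketch only shows how each approach transforms the data before estimation.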

Main Results:

Reweighted estimating equations demonstrated the least bias in coefficient estimates.
The missing indicator variable method produced the largest bias.
Complete case analysis, replacement with observed frequencies, and multiple imputation showed moderate levels of bias.
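
The general idea behind reweighting can be illustrated with inverse probability weighting: each complete case is weighted by the inverse of its estimated probability of being observed, so that under-observed groups count more. The summary does not describe the weighting model the authors used, so the sketch below is an assumption-laden toy. The `stratum` variable (a hypothetical fully observed covariate that drives missingness), the data, and the function names are invented, and the target is a simple proportion rather than a regression coefficient.

```python
from collections import defaultdict

# Hypothetical full data: (fully observed stratum, race category).
full = [("X", "A")] * 4 + [("X", "B")] * 4 + [("Y", "A")] * 2 + [("Y", "B")] * 6

# Hide race for half of stratum X only, so missingness depends on the stratum.
hidden = {2, 3, 6, 7}
observed = [(s, None if i in hidden else r) for i, (s, r) in enumerate(full)]

def observation_probabilities(records):
    """Estimate P(race observed | stratum) as the observed share in each stratum."""
    seen = defaultdict(int)
    total = defaultdict(int)
    for stratum, race in records:
        total[stratum] += 1
        seen[stratum] += race is not None
    return {s: seen[s] / total[s] for s in total}

def complete_case_share(records, category):
    """Share of `category` among records with race observed, unweighted."""
    complete = [race for _, race in records if race is not None]
    return complete.count(category) / len(complete)

def weighted_share(records, category):
    """Share of `category` among complete cases, weighted by 1 / P(observed)."""
    p_obs = observation_probabilities(records)
    numerator = denominator = 0.0
    for stratum, race in records:
        if race is None:
            continue  # incomplete cases enter only through the estimated weights
        weight = 1.0 / p_obs[stratum]
        numerator += weight * (race == category)
        denominator += weight
    return numerator / denominator

true_share = sum(race == "A" for _, race in full) / len(full)
print("full data:     ", true_share)
print("complete case: ", complete_case_share(observed, "A"))
print("reweighted:    ", weighted_share(observed, "A"))
```

In this toy the complete-case share drifts away from the full-data value because stratum X is under-represented among complete cases, while the weighted share recovers it. That exact recovery is a property of the constructed data, not a general guarantee.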

Conclusions:

Missing data handling is a critical consideration in analyzing large health databases.
The missing indicator variable method should be used cautiously because it introduced substantial bias.
Method selection for missing data should be informed by the quantity and nature of the missing data.