HANDLING MISSING DATA BY DELETING COMPLETELY OBSERVED RECORDS.

Myunghee Cho Paik¹, Cuiling Wang

¹Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168 Street, New York City, N.Y. 10032, U.S.A.

Journal of Statistical Planning and Inference

February 18, 2010

Summary

This summary is machine-generated.

Analyzing only the completely observed records in an incomplete data set can introduce bias and inefficiency. This study proposes estimators for regression with missing data that improve efficiency and reduce bias relative to existing methods, with the largest gains when the proportion of missing data is large.

Area of Science:

Statistics
Biostatistics
Epidemiology

Background:

Missing data in statistical analysis can introduce bias and inefficiency.
Current methods like likelihood, imputation, and inverse probability weighting have limitations.
Analyzing only the completely observed records (complete-case analysis) can be biased or inefficient.
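
To make the background concrete, here is a minimal Python sketch contrasting a complete-case mean with an inverse probability weighted (IPW) mean. It is a toy illustration, not the paper's regression setting: the data-generating model, the observation probabilities (0.9 and 0.3), and the function names `complete_case_mean` and `ipw_mean` are all assumptions made for this example, and the observation probabilities are treated as known.

```python
import numpy as np

# Toy setup (every number here is an assumption for illustration):
# X is always observed; Y is observed with a probability that depends on X.
rng = np.random.default_rng(0)
n = 20_000
x = rng.binomial(1, 0.5, size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)   # true E[Y] = 1 + 2 * 0.5 = 2.0
p_obs = np.where(x == 1, 0.9, 0.3)       # P(Y observed | X), assumed known
r = rng.binomial(1, p_obs)               # r = 1 means Y is observed


def complete_case_mean(y, r):
    """Average Y over the records where Y is observed."""
    return y[r == 1].mean()


def ipw_mean(y, r, p_obs):
    """Normalized IPW mean: observed records weighted by 1 / P(observed)."""
    w = r / p_obs                        # zero weight for unobserved records
    return np.sum(w * y) / np.sum(w)


cc_estimate = complete_case_mean(y, r)
ipw_estimate = ipw_mean(y, r, p_obs)
print(f"complete-case: {cc_estimate:.3f}, IPW: {ipw_estimate:.3f}, truth: 2.000")
```

In this setup records with X = 1 are over-represented among the complete cases, so the complete-case mean drifts above the true value, while the weighted mean corrects for the unequal observation probabilities.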

Purpose of the Study:

To propose novel estimators for handling missing data in regression settings.
To develop methods that are more efficient and stable than existing inverse probability weighting.
To provide estimators with smaller asymptotic variances than estimators that use only complete cases.

Main Methods:

Generating artificial observation indicators independent of the outcome.
Developing a stable weighting method based on artificial indicators.
Enhancing weighting estimator efficiency by projecting onto the nuisance tangent space.
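
The first method above can be sketched as follows. This is one plausible reading of "artificial observation indicators", not a construction confirmed by this summary: complete records are randomly deleted so that the chance of remaining in the analysis no longer depends on the outcome. The observation model `obs_prob`, the binary outcome and covariate, and the helper names are hypothetical and chosen only to keep the example small.

```python
import numpy as np


def obs_prob(y, x):
    """Hypothetical P(record complete | y, x); it depends on the outcome y."""
    return 0.4 + 0.4 * y + 0.1 * x


def thin_complete_records(y, x, r, rng):
    """Artificial indicator r_star: keep a complete record with probability
    c(x) / pi(y, x), where c(x) = min over y of pi(y, x).
    Then P(r_star = 1 | y, x) = c(x), which is free of y."""
    pi = obs_prob(y, x)
    c = np.minimum(obs_prob(0, x), obs_prob(1, x))
    keep = rng.random(y.shape[0]) < c / pi
    return r * keep                      # only complete records can be kept


def prop_y1_given_x0(y, x, selected):
    """Estimate P(Y = 1 | X = 0) from the selected records."""
    mask = (selected == 1) & (x == 0)
    return y[mask].mean()


rng = np.random.default_rng(1)
n = 40_000
x = rng.binomial(1, 0.5, size=n)
y = rng.binomial(1, 0.3 + 0.4 * x)       # true P(Y = 1 | X = 0) = 0.3
r = rng.binomial(1, obs_prob(y, x))      # r = 1 means the record is complete
# In real data x would be unavailable when r = 0; the thinning step only
# ever uses x on complete records, so this is not a limitation of the sketch.
r_star = thin_complete_records(y, x, r, rng)

cc_estimate = prop_y1_given_x0(y, x, r)
thinned_estimate = prop_y1_given_x0(y, x, r_star)
print(f"complete-case: {cc_estimate:.3f}, after thinning: {thinned_estimate:.3f}")
```

Deleting records trades sample size for validity: fewer records remain, but the ones that remain are selected in a way that does not depend on the outcome. The sketch assumes the observation probabilities are known; in practice they would have to be modeled.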

Main Results:

The proposed estimators have asymptotic variances less than or equal to that of the complete-case estimator.
The novel weighting method offers more stable weights than traditional inverse probability weighting.
Simulation studies confirm superior efficiency of proposed estimators, particularly with high missingness.
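
To illustrate what "stable weights" means in the results above, the sketch below computes Kish's effective sample size, a common diagnostic for weighted estimators. It is not taken from the paper, and the two sets of observation probabilities are made-up values: a single very small probability produces one dominant inverse probability weight and a sharp drop in effective sample size.

```python
import numpy as np


def effective_sample_size(weights):
    """Kish's effective sample size: (sum of w)^2 / (sum of w^2).
    Equals the number of records when all weights are equal."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)


# Hypothetical observation probabilities for five complete records.
p_moderate = np.array([0.5, 0.6, 0.7, 0.8, 0.9])
p_extreme = np.array([0.02, 0.6, 0.7, 0.8, 0.9])   # one near-zero probability

ess_equal = effective_sample_size(np.ones(5))
ess_moderate = effective_sample_size(1.0 / p_moderate)
ess_extreme = effective_sample_size(1.0 / p_extreme)
print(ess_equal, ess_moderate, ess_extreme)
```

A weighting scheme whose weights stay bounded avoids this collapse, which is one way to interpret the stability claim.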

Conclusions:

The proposed estimators effectively address missing data challenges in regression.
These methods offer improved statistical efficiency and reduced bias.
The novel approaches are particularly beneficial when dealing with substantial amounts of missing data.