Identification and inference with nonignorable missing covariate data.

Wang Miao¹, Eric Tchetgen Tchetgen¹

¹Peking University and Harvard University.

Statistica Sinica | December 21, 2020

Summary

This summary is machine-generated.

Identifying models with missing covariate data is challenging when data are missing not at random. A shadow variable improves identification for parametric and semiparametric models, enabling robust estimation.

Keywords:

Identification; Missing covariate data; Missing not at random; Shadow variable

Area of Science:

Statistics
Econometrics
Biostatistics

Background:

Missing covariate data poses significant challenges in statistical modeling.
Identification of parametric and semiparametric models is often compromised when data are missing not at random.
Existing methods may fail without strong parametric assumptions or auxiliary information.

Purpose of the Study:

To develop a general approach for identifying parametric and semiparametric models with covariates missing not at random.
To investigate the role of a 'shadow variable' in facilitating model identification.
To extend identification results to generalized linear models with unrestricted missingness processes.
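
The summary does not spell out what makes a variable a "shadow variable". A common formalization in the shadow-variable literature is sketched below; the notation (X for the covariate subject to missingness, Y for the outcome, R for the indicator that X is observed, Z for the shadow variable, and h for an arbitrary user-chosen function) is an assumption introduced here, not taken from the summary, and the paper's exact conditions may differ.

```latex
% Missing not at random: the chance of observing X depends on X itself.
\pi(X, Y) = \Pr(R = 1 \mid X, Y)

% Assumed shadow-variable conditions:
% (i)  Z is associated with the covariate among complete cases;
% (ii) Z is unrelated to missingness once X and Y are accounted for.
Z \not\perp\!\!\!\perp X \mid (Y, R = 1),
\qquad
Z \perp\!\!\!\perp R \mid (X, Y)

% Consequence of (ii): for any function h(Z, Y),
\mathbb{E}\!\left[ \left\{ \frac{R}{\pi(X, Y)} - 1 \right\} h(Z, Y) \right]
  = \mathbb{E}\!\left[ \left\{ \frac{\mathbb{E}[R \mid X, Y, Z]}{\pi(X, Y)} - 1 \right\} h(Z, Y) \right]
  = 0
```

The left-hand side only requires X when R = 1, so the identity involves observed data alone. That is what allows a fully observed Z to carry information about the missingness mechanism.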

Main Methods:

Illustrating identification challenges with examples for missing not at random covariates.
Proposing a general framework for model identification using a shadow variable.
Developing an inverse probability weighted (IPW) estimator incorporating the shadow variable.
Analyzing identification in generalized linear models under various missingness scenarios.
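
The weighting idea behind an inverse probability weighted estimator can be illustrated with a small simulation. This is a minimal sketch, not the paper's estimator: it assumes the missingness probability is known (oracle weights), whereas the paper estimates it with the help of the shadow variable. The data-generating model, the coefficients, and the helper names `expit` and `weighted_ols` are all hypothetical choices made for illustration.

```python
import numpy as np


def expit(t):
    """Logistic function."""
    return 1.0 / (1.0 + np.exp(-t))


def weighted_ols(x, y, w):
    """Solve the weighted least-squares normal equations for (intercept, slope)."""
    design = np.column_stack([np.ones_like(x), x])
    weighted_design = design * w[:, None]
    return np.linalg.solve(design.T @ weighted_design, weighted_design.T @ y)


rng = np.random.default_rng(0)
n = 200_000
true_beta = np.array([1.0, 2.0])  # hypothetical intercept and slope

# Full data: outcome y depends linearly on covariate x.
x = rng.normal(size=n)
y = true_beta[0] + true_beta[1] * x + rng.normal(size=n)

# Assumed missingness mechanism: the chance of observing x depends on
# x itself and on y, so x is missing not at random.
pi = expit(1.0 + 0.3 * x - 0.5 * y)
observed = rng.random(n) < pi

# Complete-case analysis: unweighted fit on rows where x is observed.
beta_cc = weighted_ols(x[observed], y[observed], np.ones(observed.sum()))

# IPW analysis: the same rows, weighted by 1 / pi (oracle weights here).
beta_ipw = weighted_ols(x[observed], y[observed], 1.0 / pi[observed])

print("true         :", true_beta)
print("complete case:", beta_cc)
print("oracle IPW   :", beta_ipw)
```

Because selection depends on the outcome, the unweighted complete-case fit is biased, while weighting each complete case by the inverse of its observation probability restores the full-data estimating equation. In practice the probabilities are unknown and must be estimated, which is the step where the shadow variable is used.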

Main Results:

Identification is not guaranteed for models with missing not at random covariates without auxiliary information.
A fully observed shadow variable that is correlated with the missing covariate makes identification more broadly achievable, including in semiparametric models.
The outcome model is identified for common generalized linear models when a shadow variable is present and missingness is unrestricted.
Counterexamples demonstrate scenarios where identification fails even with a shadow variable.

Conclusions:

A shadow variable substantially broadens the settings in which models with missing not at random covariates can be identified, although it does not guarantee identification in every case.
The proposed IPW estimator offers a practical approach for estimation in these challenging settings.
The findings are relevant to statistical inference in fields where covariates are frequently missing not at random.