



Multiple augmentation with partial missing regressors.

Department of Biostatistics, University of Washington, Seattle, WA 98115, USA. shuangge@u.washington.edu

Biometrical Journal. Biometrische Zeitschrift

March 21, 2006

Summary

This summary is machine-generated.

This study applies multiple data augmentation to missing covariate data in large cohort studies. The method yields satisfactory estimates at an affordable computational cost for epidemiologic research.


Area of Science:

Epidemiology
Biostatistics
Statistical Modeling

Background:

Missing covariate data is common in large cohort studies.
Missingness can occur by design or chance.
This impacts the reliability of statistical analyses.

Purpose of the Study:

To apply multiple data augmentation techniques to semiparametric models for epidemiologic data with missing regressors.
To address regressors that are missing at random (MAR), where the missingness probabilities may depend on observed regressors, extraneous variables, and the outcome.
To investigate computational algorithms for data augmentation.
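
The MAR assumption described above can be made concrete with a small simulation. The following is a minimal sketch, not taken from the study: the variable names (`z`, `x`, `y`), the linear outcome model, and the logistic missingness coefficients are all illustrative assumptions. The point it shows is that the probability of `x` being missing is computed only from quantities that are always observed (`z` and the outcome `y`).

```python
import math
import random


def expit(t):
    """Logistic function mapping a real number to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def simulate_mar(n, seed=0, a=-1.0, b_z=0.5, b_y=0.8):
    """Simulate a cohort in which covariate x is missing at random (MAR).

    z is a fully observed covariate, y is the outcome, and x is the
    covariate that may be missing. The missingness probability is a
    function of z and y only, so it never uses the value of x directly.
    All coefficients here are hypothetical illustration values.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x = rng.gauss(0.5 * z, 1.0)
        y = 1.0 + 2.0 * x + 0.5 * z + rng.gauss(0.0, 1.0)
        p_missing = expit(a + b_z * z + b_y * y)
        # None marks a missing regressor value.
        x_obs = None if rng.random() < p_missing else x
        rows.append({"z": z, "y": y, "x": x_obs, "p_missing": p_missing})
    return rows


rows = simulate_mar(2000, seed=1)
frac_missing = sum(r["x"] is None for r in rows) / len(rows)
print(f"fraction of x missing: {frac_missing:.2f}")
```

Because missingness here depends on the outcome, an analysis restricted to complete cases would generally be biased, which is the situation that motivates augmentation-based methods.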

Main Methods:

Utilized multiple data augmentation techniques.
Focused on semiparametric models for epidemiologic data.
Assumed data are missing at random (MAR).
Investigated Poor Man's and Asymptotic Normal data augmentations.
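
As a rough illustration of the Poor Man's idea, the sketch below imputes the missing regressor several times from its conditional distribution at a fixed preliminary estimate, fits the complete-data model to each filled-in data set, and averages the fits. It is a toy parametric example under stated assumptions (binary regressor, normal outcome with known standard deviation, flat prior, so the complete-data posterior mean equals least squares); it is not the semiparametric algorithm of the study. The function names and `theta_hat` are hypothetical, and `theta_hat` is set to the simulation's true values purely for demonstration, where in practice a preliminary estimate would be required.

```python
import math
import random


def normal_pdf(v, mean, sd):
    """Density of N(mean, sd^2) evaluated at v."""
    return math.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def prob_x1_given_y(y, theta):
    """P(x = 1 | y) when y | x ~ N(b0 + b1*x, sd^2) and P(x = 1) = p."""
    b0, b1, sd, p = theta
    num = p * normal_pdf(y, b0 + b1, sd)
    den = num + (1.0 - p) * normal_pdf(y, b0, sd)
    return num / den


def complete_data_fit(xs, ys):
    """Least-squares fit of y on a binary x, returned as (intercept, slope)."""
    y0 = [y for x, y in zip(xs, ys) if x == 0]
    y1 = [y for x, y in zip(xs, ys) if x == 1]
    m0 = sum(y0) / len(y0)
    m1 = sum(y1) / len(y1)
    return m0, m1 - m0


def pmda1_mean(xs, ys, theta_hat, m=20, seed=0):
    """Average complete-data fits over m imputations drawn at fixed theta_hat."""
    rng = random.Random(seed)
    fits = []
    for _ in range(m):
        # Fill each missing x (None) with a draw from P(x | y, theta_hat).
        filled = [
            x if x is not None else int(rng.random() < prob_x1_given_y(y, theta_hat))
            for x, y in zip(xs, ys)
        ]
        fits.append(complete_data_fit(filled, ys))
    return sum(f[0] for f in fits) / m, sum(f[1] for f in fits) / m


# Toy data: true intercept 1, slope 2; x goes missing with a probability
# that depends on the outcome y only (an MAR mechanism).
rng = random.Random(42)
xs, ys = [], []
for _ in range(4000):
    x = int(rng.random() < 0.4)
    y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
    p_missing = 1.0 / (1.0 + math.exp(-(-1.0 + 0.8 * y)))
    xs.append(None if rng.random() < p_missing else x)
    ys.append(y)

theta_hat = (1.0, 2.0, 1.0, 0.4)  # assumed preliminary estimate (true values here)
b0_hat, b1_hat = pmda1_mean(xs, ys, theta_hat, m=20, seed=7)
print(f"intercept ~ {b0_hat:.2f}, slope ~ {b1_hat:.2f}")
```

The imputations are drawn once at a fixed estimate rather than inside an iterative sampler, which is what keeps this kind of scheme computationally cheap.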

Main Results:

Data augmentation approach yielded satisfactory estimates.
The method is computationally affordable.
Achieved asymptotic efficiency comparable to maximum likelihood under certain scenarios.
Applied to the Multi-Ethnic Study of Atherosclerosis (MESA) and South Wales Nickel Worker Study data.

Conclusions:

Multiple data augmentation is a viable and efficient method for handling missing covariate data in large cohort studies.
The approach is computationally feasible and provides reliable estimates.
Demonstrated practical application in real-world epidemiological datasets.