Model validation and influence diagnostics for regression models with missing covariates.

Paul W Bernhardt

  • Department of Mathematics and Statistics, Villanova University, Villanova, PA 19085, USA

Statistics in Medicine | January 11, 2018
Summary
This summary is machine-generated.

This study introduces a multiple imputation strategy for regression models with missing covariate data. This approach enables standard residual analyses and influence diagnostics, improving model validation for various response types.

Keywords: goodness-of-fit test, influence diagnostics, missing covariates, model validation, multiple imputation, residual analysis

Area of Science:

  • Statistics
  • Biostatistics
  • Data Science

Background:

  • Missing covariate data is common in regression analyses.
  • Existing methods for handling missing data primarily focus on parameter estimation, neglecting model validation and diagnostics.
  • Existing validation approaches estimate residuals using expected values, after which specialized techniques are often required for valid inference.

Purpose of the Study:

  • To propose a multiple imputation strategy for regression models with missing covariate data.
  • To facilitate standard residual analyses and influence diagnostics on imputed datasets.
  • To enhance the validation of response models in the presence of missing covariates.

Main Methods:

  • A multiple imputation strategy is proposed.
  • Standard residual analyses can be applied to imputed datasets or a stacked dataset.
  • The method is demonstrated using linear and logistic regression models.
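As a rough illustration of the two options named above (residuals from each imputed dataset versus residuals from one stacked dataset), here is a minimal Python sketch for a linear model with a single partially missing covariate. It is not the paper's procedure: the imputation model (a normal linear regression of the covariate on the response, fit to complete cases and ignoring parameter uncertainty), the simulated data, and all function names are assumptions made for this example.

```python
import numpy as np

def impute_covariate(x, y, rng):
    """Fill missing x by drawing from a normal linear model of x on y.

    Illustrative assumption only: the model is fit to complete cases and
    its parameter uncertainty is ignored.
    """
    obs = ~np.isnan(x)
    A = np.column_stack([np.ones(obs.sum()), y[obs]])
    coef, *_ = np.linalg.lstsq(A, x[obs], rcond=None)
    sigma = (x[obs] - A @ coef).std(ddof=2)
    x_imp = x.copy()
    miss = ~obs
    x_imp[miss] = coef[0] + coef[1] * y[miss] + rng.normal(0.0, sigma, miss.sum())
    return x_imp

def ols_residuals(x, y):
    """Ordinary residuals from a least-squares fit of y on an intercept and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def residuals_per_imputation(x, y, m=5, seed=0):
    """One residual vector per imputed dataset, returned as an (m, n) array."""
    rng = np.random.default_rng(seed)
    return np.array([ols_residuals(impute_covariate(x, y, rng), y) for _ in range(m)])

def residuals_stacked(x, y, m=5, seed=0):
    """Residuals from a single fit to the m imputed datasets stacked row-wise."""
    rng = np.random.default_rng(seed)
    xs = np.concatenate([impute_covariate(x, y, rng) for _ in range(m)])
    return ols_residuals(xs, np.tile(y, m))

# Simulated data (hypothetical): about 25% of x missing completely at random.
rng = np.random.default_rng(42)
n = 200
x_full = rng.normal(size=n)
y = 1.0 + 2.0 * x_full + rng.normal(scale=0.5, size=n)
x = x_full.copy()
x[rng.random(n) < 0.25] = np.nan

per_imp = residuals_per_imputation(x, y, m=5)  # shape (5, 200)
stacked = residuals_stacked(x, y, m=5)         # shape (1000,)
print(per_imp.shape, stacked.shape)
```

Either array can then be passed to the usual residual plots or tests. Note that the stacked fit repeats every observation m times, so any standard errors or p-values computed from it directly would need adjustment.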

Main Results:

  • The suggested multiple imputation strategy allows for the application of standard residual analysis techniques.
  • Influence diagnostics can be effectively performed using this imputation method.
  • The approach is validated on real-world datasets (Sleep in Mammals, New York Social Indicators Status).
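To make the influence-diagnostics idea concrete, the sketch below computes Cook's distance on each imputed dataset and averages it per observation. Averaging across imputations is just one simple summary chosen for this example and may differ from what the paper does; the imputation model, simulated data, planted outlier, and function names are all hypothetical.

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each row of an OLS fit of y on design matrix X."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)  # leverages (hat-matrix diagonal)
    e = y - X @ (XtX_inv @ X.T @ y)              # ordinary residuals
    s2 = e @ e / (n - p)                         # residual variance estimate
    return e**2 * h / (p * s2 * (1.0 - h) ** 2)

def impute_once(x, y, rng):
    """Hypothetical imputation model: normal linear regression of x on y."""
    obs = ~np.isnan(x)
    A = np.column_stack([np.ones(obs.sum()), y[obs]])
    coef, *_ = np.linalg.lstsq(A, x[obs], rcond=None)
    sigma = (x[obs] - A @ coef).std(ddof=2)
    x_imp = x.copy()
    x_imp[~obs] = coef[0] + coef[1] * y[~obs] + rng.normal(0.0, sigma, (~obs).sum())
    return x_imp

def mean_cooks_distance(x, y, m=10, seed=0):
    """Average Cook's distance per observation over m imputed datasets."""
    rng = np.random.default_rng(seed)
    D = []
    for _ in range(m):
        X = np.column_stack([np.ones(len(y)), impute_once(x, y, rng)])
        D.append(cooks_distance(X, y))
    return np.mean(D, axis=0)

# Simulated data (hypothetical) with one planted influential point at index 0.
rng = np.random.default_rng(1)
n = 100
x_full = rng.normal(size=n)
y = 1.0 + 2.0 * x_full + rng.normal(scale=0.5, size=n)
x_full[0], y[0] = 3.0, 17.0       # high leverage and far from the true line
x = x_full.copy()
missing = rng.random(n) < 0.2     # about 20% missing completely at random
missing[0] = False                # keep the planted point fully observed
x[missing] = np.nan

D_bar = mean_cooks_distance(x, y)
print("most influential observation:", int(np.argmax(D_bar)))
```

Because every imputed dataset is complete, the standard complete-data formula for Cook's distance applies unchanged; only the way the per-imputation values are combined is a design choice.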

Conclusions:

  • Multiple imputation offers a viable strategy for handling missing covariate data in regression.
  • This method simplifies model validation and influence diagnostics.
  • The approach is applicable to both linear and logistic regression models.