Propensity score-based diagnostics for categorical response regression models.

Philip S Boonstra¹, Irina Bondarenko¹, Sung Kyun Park²

¹Department of Biostatistics, University of Michigan, Ann Arbor, MI, 48109, USA.

Statistics in Medicine

August 13, 2013

Summary

This summary is machine-generated.

This study introduces a new method for assessing statistical model fit using balancing scores, inspired by causal inference techniques. This approach helps identify potential model misspecification in logistic and proportional odds models.

Keywords:

balancing score; multinomial logistic; proportional odds; residual diagnostic; score test

Area of Science:

Statistics
Biostatistics
Epidemiology

Background:

Goodness-of-fit statistics for categorical response models typically partition subjects based on predicted probabilities.
Existing methods rely on predicted response probabilities (propensity scores) for model assessment.
A need exists for robust diagnostics applicable to various sampling designs and capable of detecting general misspecification.
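
For context, the best-known statistic that partitions subjects by predicted probability is the Hosmer-Lemeshow statistic. The summary does not name it, so the following is only a minimal sketch of that conventional approach for a binary response, assuming the fitted probabilities are already available. The function name and arguments are hypothetical.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow-type chi-square statistic for a binary response.

    y: list of 0/1 outcomes.
    p: fitted probabilities P(y = 1) from any model (assumed given).
    Subjects are sorted by p and split into `groups` near-equal bins.
    """
    order = sorted(range(len(y)), key=lambda i: p[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue  # more groups than subjects
        observed = sum(y[i] for i in idx)   # observed events in the bin
        expected = sum(p[i] for i in idx)   # events expected under the model
        mean_p = expected / len(idx)
        denom = len(idx) * mean_p * (1 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


# Tiny illustration with made-up numbers (not data from the paper).
print(hosmer_lemeshow([0, 0, 1, 1], [0.25, 0.25, 0.75, 0.75], groups=2))
```

The statistic is usually compared with a chi-square distribution on `groups - 2` degrees of freedom. Large values suggest that the predicted and observed event counts disagree.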

Purpose of the Study:

To introduce a novel retrospective approach for assessing goodness-of-fit in statistical models.
To adapt causal inference balancing scores for model adequacy diagnostics.
To develop and generalize model diagnostics for binary logistic and proportional odds models.

Main Methods:

Utilized a retrospective approach by borrowing the concept of balancing scores from causal inference.
Inspected the conditional distribution of predictors given propensity scores within each response category.
Developed graphical and numerical summaries for binary logistic models and generalized them for proportional odds models.
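
The idea behind these steps can be illustrated with a rough sketch. If a binary model is correctly specified, its fitted probability acts as a balancing score, so among subjects with similar fitted probabilities the predictors should be distributed alike in the two response categories. The code below checks this crudely by comparing predictor means within strata of the fitted probability. It is an illustrative stand-in, not the authors' graphical or numerical summaries, which this summary does not specify. The function name, arguments, and the choice of quantile strata are all assumptions.

```python
from statistics import mean


def stratum_mean_differences(x, y, p, strata=5):
    """Compare a predictor between response categories within score strata.

    x: values of one predictor.
    y: list of 0/1 responses.
    p: fitted probabilities P(y = 1) from the model under scrutiny.
    Returns, per stratum of p, mean(x | y = 1) - mean(x | y = 0),
    or None when a stratum lacks one of the two response categories.
    """
    order = sorted(range(len(p)), key=lambda i: p[i])
    n = len(order)
    diffs = []
    for s in range(strata):
        idx = order[s * n // strata:(s + 1) * n // strata]
        x1 = [x[i] for i in idx if y[i] == 1]
        x0 = [x[i] for i in idx if y[i] == 0]
        diffs.append(mean(x1) - mean(x0) if x1 and x0 else None)
    return diffs


# Made-up numbers for illustration only.
print(stratum_mean_differences([1, 2, 3, 4], [0, 1, 0, 1],
                               [0.1, 0.2, 0.8, 0.9], strata=2))
```

Differences that stay far from zero across strata hint at imbalance and therefore at possible misspecification. Because stratification only coarsens the score, some residual difference is expected even under a correct model.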

Main Results:

The proposed balancing score diagnostics can be applied to both prospective and retrospective sampling designs.
The methods were able to detect general forms of model misspecification.
Simulations and real-world data examples (Parkinson's disease, diabetes biomarkers) illustrated the utility of the proposed diagnostics.

Conclusions:

The balancing score approach offers a valuable new tool for assessing the adequacy of statistical models, particularly logistic and proportional odds models.
This method provides a flexible and powerful way to detect model misspecification across different study designs.
The diagnostics are illustrated with practical applications, showing their relevance in epidemiological research.