Corrected ROC analysis for misclassified binary outcomes.

Matthew Zawistowski^1,2, Jeremy B Sussman^1,3, Timothy P Hofer^1,3

¹Veterans Affairs Center for Clinical Management Research, Ann Arbor, MI 48105, U.S.A.

Statistics in Medicine

March 1, 2017

Summary

This summary is machine-generated.

Electronic health record (EHR) data often contain misclassified outcomes, which bias risk prediction models and their accuracy metrics. This study introduces a misclassification-adjusted ROC procedure that corrects for this bias, improving accuracy assessment for precision medicine.

Keywords:

ROC analysis; electronic health records; misclassification; precision medicine; risk prediction modeling

Area of Science:

Biostatistics
Health Informatics
Machine Learning in Healthcare

Background:

Electronic Health Records (EHRs) are vital for precision medicine but contain imperfect data.
Outcome misclassification in EHR data can significantly bias risk prediction models and accuracy metrics.
Standard Receiver Operating Characteristic (ROC) analysis, particularly the Area Under the Curve (AUC), is susceptible to misclassification bias.

Purpose of the Study:

To investigate the impact of outcome misclassification on the accuracy assessment of risk prediction models using EHR data.
To develop and introduce a novel misclassification-adjusted ROC procedure for bias-corrected AUC estimation.
To demonstrate the effectiveness of the proposed method on a large-scale EHR dataset.

Main Methods:

Studied how outcome misclassification biases AUC estimates for regression-based prediction models.
Introduced an intuitive misclassification-adjusted ROC procedure to account for outcome uncertainty.
Applied the correction method to a hospitalization prediction model using EHR data from over 1 million patients.
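
To make the idea of a bias-corrected AUC concrete, the sketch below inverts a simple closed-form relationship. It is not the authors' procedure. It assumes non-differential misclassification (label errors do not depend on the risk score once the true outcome is known), in which case the observed AUC equals 0.5 + (PPV + NPV - 1) × (true AUC - 0.5), where PPV and NPV are the predictive values of the observed label for the true outcome. The function names and the parameters `fp_rate`, `fn_rate`, and `prevalence` are hypothetical, and the misclassification rates and true prevalence are assumed to be known.

```python
def predictive_values(fp_rate, fn_rate, prevalence):
    """Return (PPV, NPV) of the observed label with respect to the true outcome.

    fp_rate:    P(observed case    | true control)  -- assumed known
    fn_rate:    P(observed control | true case)     -- assumed known
    prevalence: P(true case)                        -- assumed known
    """
    p = prevalence
    obs_pos = p * (1 - fn_rate) + (1 - p) * fp_rate  # P(observed case)
    obs_neg = 1 - obs_pos                            # P(observed control)
    ppv = p * (1 - fn_rate) / obs_pos
    npv = (1 - p) * (1 - fp_rate) / obs_neg
    return ppv, npv


def corrected_auc(observed_auc, fp_rate, fn_rate, prevalence):
    """Undo the attenuation of the AUC toward 0.5 caused by label errors.

    Illustrative only: valid under non-differential misclassification,
    and not the procedure described in the paper.
    """
    ppv, npv = predictive_values(fp_rate, fn_rate, prevalence)
    shrink = ppv + npv - 1  # attenuation factor; equals 1 with perfect labels
    if shrink <= 0:
        raise ValueError("labels carry no usable information about the outcome")
    return 0.5 + (observed_auc - 0.5) / shrink


if __name__ == "__main__":
    # Hypothetical inputs: observed AUC 0.70, 5% false positives,
    # 10% false negatives, 20% true prevalence.
    print(round(corrected_auc(0.70, 0.05, 0.10, 0.20), 3))
```

Because the observed AUC is itself an estimate, the corrected value can exceed 1 in small samples. A practical version would also propagate uncertainty in the assumed rates.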

Main Results:

Misclassification of outcomes in EHR data leads to biased AUC estimates in standard ROC analysis.
Bias in AUC is influenced by false positive/negative rates and disease prevalence.
The proposed misclassification-adjusted ROC procedure produces bias-corrected AUC estimates and outperforms correcting for misclassification only during model building.
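
The dependence of AUC bias on error rates and prevalence can be illustrated with a small simulation. The sketch below is not from the paper. It assumes normally distributed risk scores with cases shifted by one standard deviation, and label flips that are independent of the score given the true outcome. The function names, parameter values, and sample size are all hypothetical.

```python
import numpy as np


def auc(scores, labels):
    """Rank-based (Mann-Whitney) AUC; assumes continuous scores with no ties."""
    labels = np.asarray(labels, dtype=bool)
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = int(labels.sum())
    n_neg = len(labels) - n_pos
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def simulate(prevalence, fp_rate, fn_rate, n=200_000, seed=0):
    """Return (AUC against true labels, AUC against misclassified labels)."""
    rng = np.random.default_rng(seed)
    truth = rng.random(n) < prevalence
    # Assumed score model: cases are shifted upward by one standard deviation.
    scores = rng.normal(loc=truth * 1.0, scale=1.0)
    # Flip true cases with probability fn_rate, true controls with fp_rate.
    flip = np.where(truth, rng.random(n) < fn_rate, rng.random(n) < fp_rate)
    observed = truth ^ flip
    return auc(scores, truth), auc(scores, observed)


if __name__ == "__main__":
    for prevalence in (0.05, 0.20, 0.50):
        true_auc, observed_auc = simulate(prevalence, fp_rate=0.05, fn_rate=0.10)
        print(f"prevalence={prevalence:.2f}  true AUC={true_auc:.3f}  "
              f"observed AUC={observed_auc:.3f}")
```

With the error rates held fixed, the gap between the two AUC values changes with prevalence. At low prevalence, even a small false positive rate means a large share of the observed cases are true controls.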

Conclusions:

Accurate risk prediction from EHRs requires addressing outcome misclassification.
The novel misclassification-adjusted ROC procedure offers a computationally simple and effective way to correct AUC bias.
This method is crucial for reliable model comparison and development in precision medicine using large EHR datasets.