Mind the Performance Gap: Examining Dataset Shift During Prospective Validation.

Erkin Ötleş (1,2), Jeeheh Oh (3), Benjamin Li (2)

  • (1) Department of Industrial & Operations Engineering, University of Michigan.

Proceedings of Machine Learning Research | May 29, 2026
Summary
This summary is machine-generated.

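Patient risk stratification models often perform worse in real-world use than in retrospective validation. This study found that the performance gap in a healthcare-associated infection (HAI) prediction model was driven primarily by infrastructure shift (changes in data access, extraction, and transformation) rather than by temporal shift in patient populations or care processes. Addressing differences in data infrastructure is therefore key to reliable prospective performance.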

Area of Science:

  • Clinical Informatics
  • Health Services Research
  • Predictive Analytics

Background:

  • Patient risk stratification models are crucial for clinical care but often show decreased performance post-implementation compared to initial retrospective validation.
  • Performance degradation is attributed to temporal shifts (care processes, patient populations) and infrastructure shifts (data access, extraction, transformation).
  • Prospective validation is infrequently reported, hindering understanding of real-world model performance and the factors contributing to performance gaps.

Purpose of the Study:

  • To compare the prospective performance of a patient risk stratification model with its prior retrospective validation performance.
  • To quantify the performance gap between retrospective and prospective validation of a healthcare-associated infection (HAI) prediction model.
  • To differentiate the contributions of temporal shift and infrastructure shift to the observed performance gap.
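
One possible way to separate the two contributions is sketched below. This summary does not state how the study performed the attribution, so the approach is an assumption: it supposes the retrospective data pipeline can be re-run on the prospective time period, giving an intermediate score that differs from the baseline only by time and from the live score only by infrastructure. The function name `decompose_gap` and its parameter names are hypothetical.

```python
def decompose_gap(retro_baseline: float,
                  retro_pipeline_on_prospective_period: float,
                  prospective: float) -> dict:
    """Split a performance gap into temporal and infrastructure parts.

    All three inputs are values of the same higher-is-better metric
    (for example AUROC):
      retro_baseline: retrospective pipeline, retrospective period
      retro_pipeline_on_prospective_period: retrospective pipeline,
          prospective period (assumed to be obtainable)
      prospective: live pipeline, prospective period
    """
    # Same pipeline, different time period: attributed to temporal shift.
    temporal = retro_baseline - retro_pipeline_on_prospective_period
    # Same time period, different pipeline: attributed to infrastructure shift.
    infrastructure = retro_pipeline_on_prospective_period - prospective
    return {
        "temporal": temporal,
        "infrastructure": infrastructure,
        "total": retro_baseline - prospective,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only; they are not study results.
    parts = decompose_gap(0.80, 0.79, 0.75)
    print(parts)
```

By construction the two parts sum to the total gap, which makes it easy to see which shift dominates. For a lower-is-better metric such as the Brier score, the signs would need to be flipped.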

Main Methods:

  • A patient risk stratification model for predicting HAIs was applied prospectively to 26,864 hospital encounters from July 2020 to June 2021.
  • The prospective performance was compared to the model's retrospective validation performance using data from July 2019 to June 2020.
  • Key performance metrics included Area Under the Receiver Operating Characteristic Curve (AUROC) and Brier score. Temporal and infrastructure shifts were analyzed as contributors to the performance gap.
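
For readers unfamiliar with the two metrics, the following is a minimal sketch of how AUROC and the Brier score can be computed from binary outcomes and predicted risks. It is not the study's code; the function names and the pairwise AUROC formulation are illustrative choices, and a library implementation would normally be used on data of realistic size.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome.
    Lower is better; 0 is a perfect score."""
    if len(y_true) == 0:
        raise ValueError("Brier score needs at least one observation")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the study.
    outcomes = [0, 0, 1, 1]
    risks = [0.1, 0.4, 0.35, 0.8]
    print(auroc(outcomes, risks), brier_score(outcomes, risks))
```

AUROC measures discrimination (how well cases are ranked), while the Brier score is also sensitive to calibration, so the two can move differently between validation settings.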

Main Results:

  • The model achieved a prospective AUROC of 0.767 (95% CI: 0.737, 0.801) and a Brier score of 0.189 (95% CI: 0.186, 0.191).
  • Retrospective validation showed an AUROC of 0.778 (95% CI: 0.744, 0.815) and a Brier score of 0.163 (95% CI: 0.161, 0.165).
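  • Prospective performance was lower on both metrics: AUROC was 0.011 lower and the Brier score was 0.026 higher (a higher Brier score is worse). This gap was driven primarily by infrastructure shift rather than temporal shift.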
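
The size of the gap follows directly from the point estimates reported above. The short sketch below only restates that arithmetic; the dictionary layout and the function name `performance_gap` are illustrative.

```python
# Point estimates as reported in the results above.
retrospective = {"auroc": 0.778, "brier": 0.163}
prospective = {"auroc": 0.767, "brier": 0.189}


def performance_gap(retro: dict, prosp: dict) -> dict:
    """Return the gap so that a positive value means worse prospective performance."""
    return {
        # AUROC: higher is better, so the gap is the drop.
        "auroc_drop": retro["auroc"] - prosp["auroc"],
        # Brier score: lower is better, so the gap is the increase.
        "brier_increase": prosp["brier"] - retro["brier"],
    }


gap = performance_gap(retrospective, prospective)
print(f"AUROC drop: {gap['auroc_drop']:.3f}")
print(f"Brier score increase: {gap['brier_increase']:.3f}")
```

Note that the two AUROC confidence intervals overlap, whereas the Brier score intervals do not, so the gap shows up more clearly in the Brier score than in AUROC.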

Conclusions:

  • Prospective performance of risk stratification models can differ from retrospective validation, with infrastructure shifts being a significant factor.
  • The study highlights the critical impact of data access, extraction, and transformation processes on model performance in real-world clinical settings.
  • Future model development and validation should account for and mitigate the effects of data infrastructure differences to ensure reliable prospective performance.