Correcting for optimistic prediction in small data sets.

Gordon C S Smith, Shaun R Seaman, Angela M Wood

American Journal of Epidemiology

June 27, 2014

Summary

This summary is machine-generated.

C statistic estimates for screening tests are often optimistic when models are fitted to small data sets. Cross-validation with replication, bootstrapping, and leave-pair-out cross-validation gave unbiased adjustments, whereas sample splitting, leave-1-out cross-validation, and cross-validation without replication gave biased estimates.

Keywords:

logistic models; models, statistical; multivariate analysis; receiver operating characteristic curve


Area of Science:

Biostatistics
Medical Screening
Statistical Modeling

Background:

The C statistic is a key metric for evaluating screening test accuracy.
Overfitting in small datasets often leads to overestimation (optimism) of the C statistic.
Existing methods to correct for optimism are diverse and some introduce bias.
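
For orientation, the C statistic can be read as the proportion of (case, non-case) pairs in which the case receives the higher predicted score, with ties counted as half. The following is a minimal sketch of that definition only; the function name and the toy scores are illustrative assumptions, not taken from the study.

```python
def c_statistic(scores, labels):
    """Proportion of (case, non-case) pairs where the case scores higher.

    Ties count as half a concordant pair. Labels are 1 (case) or 0 (non-case).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Toy data (made up): two cases and two non-cases.
scores = [0.9, 0.4, 0.6, 0.2]
labels = [1, 1, 0, 0]
print(c_statistic(scores, labels))  # 3 of 4 pairs are concordant -> 0.75
```

Optimism arises when the scores come from a model fitted to the same observations on which this proportion is then computed.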

Purpose of the Study:

To evaluate and compare different methods for adjusting the optimism of the C statistic.
To identify reliable methods for obtaining unbiased C statistic estimates in clinical screening.

Main Methods:

Analysis of UK Down syndrome and Scottish national pregnancy discharge clinical datasets.
Comparison of sample splitting, cross-validation (leave-1-out, without replication, and with replication), and bootstrapping.
Evaluation of a novel method: leave-pair-out cross-validation.
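
As a rough illustration of leave-pair-out cross-validation, the sketch below holds out each (case, non-case) pair in turn, refits the model on the remaining observations, and records whether the held-out case is scored above the held-out non-case. The one-predictor "sign" model, the function names, and the simulated noise data are all hypothetical stand-ins chosen to keep the example short; the models fitted in the study are not reproduced here.

```python
import random


def fit_sign_model(xs, ys):
    """Hypothetical toy model (not the study's): score is +x when cases have
    the higher training mean, otherwise -x."""
    cases = [x for x, y in zip(xs, ys) if y == 1]
    controls = [x for x, y in zip(xs, ys) if y == 0]
    higher = sum(cases) / len(cases) >= sum(controls) / len(controls)
    direction = 1.0 if higher else -1.0
    return lambda x: direction * x


def concordance(case_score, control_score):
    """1 if the case is ranked above the non-case, 0.5 on a tie, else 0."""
    if case_score > control_score:
        return 1.0
    return 0.5 if case_score == control_score else 0.0


def apparent_c(xs, ys, fit):
    """C statistic of a model evaluated on the data it was fitted to."""
    score = fit(xs, ys)
    pairs = [(score(a), score(b))
             for a, ya in zip(xs, ys) if ya == 1
             for b, yb in zip(xs, ys) if yb == 0]
    return sum(concordance(a, b) for a, b in pairs) / len(pairs)


def leave_pair_out_c(xs, ys, fit):
    """Mean concordance over all pairs, each scored by a model fitted
    with that pair left out."""
    case_idx = [i for i, y in enumerate(ys) if y == 1]
    control_idx = [i for i, y in enumerate(ys) if y == 0]
    if len(case_idx) < 2 or len(control_idx) < 2:
        raise ValueError("need at least two cases and two non-cases")
    total = 0.0
    for i in case_idx:
        for j in control_idx:
            train_x = [x for k, x in enumerate(xs) if k not in (i, j)]
            train_y = [y for k, y in enumerate(ys) if k not in (i, j)]
            score = fit(train_x, train_y)
            total += concordance(score(xs[i]), score(xs[j]))
    return total / (len(case_idx) * len(control_idx))


# Simulated pure-noise predictor: the true C statistic is 0.5.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(20)]
ys = [1, 0] * 10
print("apparent:", apparent_c(xs, ys, fit_sign_model))
print("leave-pair-out:", leave_pair_out_c(xs, ys, fit_sign_model))
```

The model is refitted once per pair, so the cost grows with the number of cases multiplied by the number of non-cases.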

Main Results:

Sample splitting, leave-1-out, and cross-validation without replication yielded biased C statistic estimates with higher errors.
Cross-validation with replication, bootstrapping, and leave-pair-out cross-validation produced unbiased estimates with comparable errors.
In simulations, these three methods performed similarly, though bootstrapping showed lower errors with limited data.
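
One common form of bootstrap optimism correction refits the model on each resample, measures how much better it does on the resample than on the original data, and subtracts the average of that difference from the apparent C statistic. The sketch below follows that general recipe and may differ in detail from the procedure used in the study; the toy model, function names, default of 200 resamples, and simulated data are assumptions for illustration.

```python
import random


def c_statistic(scores, labels):
    """Proportion of (case, non-case) pairs where the case scores higher."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    total = sum(1.0 if c > n else 0.5 if c == n else 0.0
                for c in cases for n in controls)
    return total / (len(cases) * len(controls))


def fit_sign_model(xs, ys):
    """Hypothetical toy model (not the study's): score is +x when cases have
    the higher training mean, otherwise -x."""
    cases = [x for x, y in zip(xs, ys) if y == 1]
    controls = [x for x, y in zip(xs, ys) if y == 0]
    higher = sum(cases) / len(cases) >= sum(controls) / len(controls)
    direction = 1.0 if higher else -1.0
    return lambda x: direction * x


def bootstrap_corrected_c(xs, ys, fit, n_boot=200, seed=0):
    """Apparent C minus the mean bootstrap estimate of optimism."""
    rng = random.Random(seed)
    n = len(xs)
    model = fit(xs, ys)
    apparent = c_statistic([model(x) for x in xs], ys)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if sum(by) in (0, n):
            continue  # resample lacks cases or non-cases; draw again
        boot_model = fit(bx, by)
        c_boot = c_statistic([boot_model(x) for x in bx], by)
        c_orig = c_statistic([boot_model(x) for x in xs], ys)
        optimism.append(c_boot - c_orig)
    return apparent - sum(optimism) / n_boot


# Simulated pure-noise predictor: the true C statistic is 0.5.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(20)]
ys = [1, 0] * 10
print("bootstrap-corrected:", bootstrap_corrected_c(xs, ys, fit_sign_model))
```

Fixing the seed makes the estimate reproducible; in practice the number of resamples would be chosen large enough that the estimate is stable.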

Conclusions:

Cross-validation with replication, bootstrapping, and leave-pair-out cross-validation are recommended for unbiased C statistic adjustment.
These methods offer reliable performance across different dataset sizes and C statistic values.
Careful selection of optimism adjustment methods is crucial for accurate screening test evaluation.
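
Cross-validation with replication can be sketched as k-fold cross-validation repeated over several random partitions, with the results averaged. The version below stratifies folds so that each contains cases and non-cases, and averages the C statistic computed within each held-out fold; pooling predictions across folds is an alternative, and this summary does not say which the study used. The toy model, function names, and defaults (k=5, 20 replications) are assumptions.

```python
import random


def c_statistic(scores, labels):
    """Proportion of (case, non-case) pairs where the case scores higher."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    total = sum(1.0 if c > n else 0.5 if c == n else 0.0
                for c in cases for n in controls)
    return total / (len(cases) * len(controls))


def fit_sign_model(xs, ys):
    """Hypothetical toy model (not the study's): score is +x when cases have
    the higher training mean, otherwise -x."""
    cases = [x for x, y in zip(xs, ys) if y == 1]
    controls = [x for x, y in zip(xs, ys) if y == 0]
    higher = sum(cases) / len(cases) >= sum(controls) / len(controls)
    direction = 1.0 if higher else -1.0
    return lambda x: direction * x


def repeated_cv_c(xs, ys, fit, k=5, n_rep=20, seed=0):
    """Mean held-out C statistic over n_rep random stratified k-fold splits."""
    rng = random.Random(seed)
    case_idx = [i for i, y in enumerate(ys) if y == 1]
    control_idx = [i for i, y in enumerate(ys) if y == 0]
    if k < 2 or k > min(len(case_idx), len(control_idx)):
        raise ValueError("k must be between 2 and the smaller class size")
    fold_cs = []
    for _ in range(n_rep):
        rng.shuffle(case_idx)
        rng.shuffle(control_idx)
        # Deal shuffled cases and non-cases into k folds in turn.
        folds = [case_idx[f::k] + control_idx[f::k] for f in range(k)]
        for test in folds:
            held_out = set(test)
            train = [i for i in range(len(xs)) if i not in held_out]
            score = fit([xs[i] for i in train], [ys[i] for i in train])
            fold_cs.append(c_statistic([score(xs[i]) for i in test],
                                       [ys[i] for i in test]))
    return sum(fold_cs) / len(fold_cs)


# Simulated pure-noise predictor: the true C statistic is 0.5.
rng = random.Random(1)
xs = [rng.gauss(0.0, 1.0) for _ in range(20)]
ys = [1, 0] * 10
print("repeated 5-fold CV:", repeated_cv_c(xs, ys, fit_sign_model))
```

Setting n_rep=1 gives cross-validation without replication, which depends on a single random partition.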