



Confidence intervals for performance estimates in brain MRI segmentation.

Rosana El Jurdi1, Gaël Varoquaux2, Olivier Colliot1

  • 1Sorbonne Université, Institut du Cerveau - Paris Brain Institute - ICM, CNRS, Inria, Inserm, AP-HP, Hôpital de la Pitié-Salpêtrière, F-75013, Paris, France.

Medical Image Analysis | May 14, 2025
Summary
This summary is machine-generated.

Reliable evaluation of medical image segmentation models requires reporting confidence intervals on performance estimates. This study shows that reaching a given precision often takes fewer test samples for segmentation than for classification.

Keywords:
Confidence interval, Performance measure, Segmentation, Standard error, Statistical analysis, Validation


Area of Science:

  • Medical image analysis
  • Machine learning in healthcare
  • Radiology and neuroimaging

Background:

  • Empirical evaluation of medical segmentation models is inherently noisy because test sets contain a limited number of images.
  • Reporting confidence intervals is crucial for reliable evaluation but is often omitted in medical image segmentation research.
  • The required test set size for accurate confidence intervals depends on performance metric spread, which differs between classification and segmentation tasks.
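The link between test set size and metric spread can be made explicit with the standard normal-approximation interval for a mean. This is only a sketch: it assumes independent per-image metric values, and the symbols n (test set size), σ (standard deviation of the metric across images), w (full interval width), and z (normal quantile, about 1.96 for a 95% level) are introduced here for illustration and do not come from the summary.

```latex
\[
\bar{x} \pm z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}},
\qquad
w = 2\,z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\quad\Longrightarrow\quad
n = \left(\frac{2\,z_{1-\alpha/2}\,\sigma}{w}\right)^{2}.
\]
```

Under these assumptions the required n grows with the square of the spread σ, so a metric whose spread is halved needs roughly a quarter of the test samples for the same interval width.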

Purpose of the Study:

  • To investigate confidence interval estimation for 3D brain MRI segmentation.
  • To determine the necessary test set sizes for achieving desired precision in segmentation performance metrics.
  • To compare the sample size requirements for segmentation versus classification tasks.

Main Methods:

  • Experiments were conducted using the nnU-Net framework on two Medical Segmentation Decathlon brain MRI datasets (hippocampus and brain tumor segmentation).
  • The Dice Similarity Coefficient and Hausdorff distance were used as performance measures.
  • Parametric confidence intervals were compared against bootstrap estimates across varying test set sizes and performance metric spreads.
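The comparison between parametric and bootstrap intervals can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a normal-approximation interval (mean ± 1.96 standard errors) as the parametric estimate and a percentile bootstrap as the resampling estimate, and it runs on synthetic Dice scores. The function names `parametric_ci` and `bootstrap_ci`, the 95% level, and the simulated data are all assumptions made for the example.

```python
import numpy as np


def parametric_ci(scores, z=1.96):
    """Normal-approximation CI for the mean: mean +/- z * standard error."""
    scores = np.asarray(scores, dtype=float)
    sem = scores.std(ddof=1) / np.sqrt(scores.size)
    mean = scores.mean()
    return float(mean - z * sem), float(mean + z * sem)


def bootstrap_ci(scores, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean (resampling images with replacement)."""
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row holds the indices of one bootstrap resample of the test set.
    idx = rng.integers(0, scores.size, size=(n_resamples, scores.size))
    means = scores[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)


# Synthetic per-image Dice scores (hypothetical data, NOT from the study):
# 150 test images, mean about 0.89, spread about 0.03.
rng = np.random.default_rng(42)
dice = np.clip(rng.normal(loc=0.89, scale=0.03, size=150), 0.0, 1.0)

p_lo, p_hi = parametric_ci(dice)
b_lo, b_hi = bootstrap_ci(dice)
print(f"parametric: [{p_lo:.4f}, {p_hi:.4f}]  width={p_hi - p_lo:.4f}")
print(f"bootstrap:  [{b_lo:.4f}, {b_hi:.4f}]  width={b_hi - b_lo:.4f}")
```

On well-behaved data like this the two intervals should nearly coincide; larger differences would be expected for small test sets or heavily skewed metrics, where the normal approximation is weaker.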

Main Results:

  • Parametric confidence intervals provide reasonable approximations to bootstrap estimates for segmentation tasks.
  • The test set size required for precise segmentation evaluation is often significantly smaller than for classification tasks.
  • Achieving a confidence interval 1% wide typically requires 100-200 test samples for low-spread metrics (standard deviation of about 3%), while more difficult tasks with higher spread may need over 1000 samples.
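The orders of magnitude above can be checked with a back-of-the-envelope calculation. The sketch below assumes a normal-approximation interval at the 95% level (z = 1.96), which the summary does not state explicitly. The helper `required_test_size` is hypothetical, and the 10% spread and the 90% classification accuracy are illustrative values rather than figures from the study.

```python
import math


def required_test_size(std, width, z=1.96):
    """Smallest n such that the full CI width 2 * z * std / sqrt(n) <= width."""
    if std <= 0 or width <= 0:
        raise ValueError("std and width must be positive")
    return math.ceil((2.0 * z * std / width) ** 2)


# Low-spread segmentation metric: std of 3%, target CI width of 1%.
n_low_spread = required_test_size(std=0.03, width=0.01)

# Harder segmentation task with an assumed (illustrative) std of 10%.
n_high_spread = required_test_size(std=0.10, width=0.01)

# Binary classification accuracy of 90% (illustrative): the per-sample
# outcome is Bernoulli, so its std is sqrt(p * (1 - p)) = 0.3.
n_classification = required_test_size(std=math.sqrt(0.9 * 0.1), width=0.01)

print(n_low_spread)      # → 139
print(n_high_spread)     # → 1537
print(n_classification)  # → 13830
```

Under these assumptions a 3% spread lands inside the 100-200 range quoted above, a 10% spread exceeds 1000 samples, and a classification accuracy near 90% needs roughly a hundred times more samples than the low-spread segmentation case for the same interval width.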

Conclusions:

  • Confidence intervals are essential for robust evaluation of medical image segmentation models.
  • The sample size needed for reliable segmentation evaluation is generally lower than previously assumed, especially compared to classification.
  • This research provides practical insights into sample size determination for validating 3D brain MRI segmentation models.