Task-Based Sampling of Patient Data for Rigorous Machine Learning/AI Performance Assessment.

Natalie Baughan (1, 2), Heather M Whitney (3), Karen Drukker (3)

  • (1) Department of Radiation Oncology, Henry Ford Health, Detroit, MI, 48202, USA. nbaugha1@hfhs.org

Journal of Imaging Informatics in Medicine | March 10, 2026
Summary

This summary is machine-generated.

A new task-based sampling algorithm helps create representative datasets for AI training and evaluation. It reduces sampling bias by matching the sampled data to the intended patient population, which supports more reliable AI performance assessment.

Area of Science:

  • Medical informatics
  • Artificial intelligence in healthcare
  • Data science

Background:

  • AI algorithm performance assessment requires independent datasets representative of the intended clinical population.
  • Using all available data can be impractical and may introduce sampling bias.
  • Representative data is crucial for reliable AI model training and validation.

Purpose of the Study:

  • To develop and demonstrate a computational method for task-based data sampling from large repositories.
  • To generate datasets matched to specific demographic and clinical profiles for AI performance assessment.
  • To mitigate sampling bias in AI algorithm development and evaluation.

Main Methods:

  • A task-based sampling algorithm was developed, requiring users to define an initial cohort, target distribution, and allowable deviation.
  • The algorithm was applied to the Medical Imaging and Data Resource Center (MIDRC) data commons.
  • Demographic characteristics and disease states were used as clinical attributes for matching to an intended population profile (e.g., CDC demographics).

Main Results:

  • The algorithm successfully sampled cohorts (542 and 870 patients) from an initial >4000 patient cohort.
  • Sampled cohorts closely matched the target demographic distribution with low average clinical attribute differences (1.0% and 2.1%).
  • The method demonstrated effectiveness in generating matched samples for AI performance assessment.

Conclusions:

  • The developed task-based sampling algorithm effectively generates matched samples from large datasets.
  • This approach reduces sampling bias, enhancing the reliability of AI algorithm training and performance assessment.
  • The method provides a valuable tool for creating representative datasets in medical AI research.

Keywords: Algorithm performance, Bias mitigation, Data sampling, Image database, Machine learning
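
The three user inputs named under Main Methods (an initial cohort, a target distribution, and an allowable deviation) can be illustrated with a small sketch. This is not the authors' algorithm, which the summary does not describe in detail. It is a hypothetical greedy procedure that repeatedly drops one patient from the most over-represented category until every category share lies within the tolerance. The function names, the dictionary-based patient records, and the `min_size` and `seed` parameters are all assumptions made for illustration.

```python
import random
from collections import Counter


def deviations(sample, target):
    """Signed difference (observed share - target share) per (attribute, category)."""
    n = len(sample)
    diffs = {}
    for attr, dist in target.items():
        counts = Counter(patient[attr] for patient in sample)
        # Categories absent from the target are treated as having a target share of 0.
        for cat in set(dist) | set(counts):
            diffs[(attr, cat)] = counts[cat] / n - dist.get(cat, 0.0)
    return diffs


def task_based_sample(cohort, target, tolerance, min_size=1, seed=0):
    """Hypothetical greedy matcher (an illustration, not the published method).

    cohort    : list of dicts, one per patient, mapping attribute -> category
    target    : dict attribute -> {category: desired share}; shares are assumed
                to sum to 1 for each attribute
    tolerance : allowable absolute deviation of any category share
    """
    rng = random.Random(seed)
    sample = list(cohort)
    rng.shuffle(sample)  # so that the patient dropped from a category is random
    while len(sample) >= min_size:
        diffs = deviations(sample, target)
        if max(abs(d) for d in diffs.values()) <= tolerance:
            return sample
        # Drop one patient from the most over-represented category.
        (attr, cat), _ = max(diffs.items(), key=lambda item: item[1])
        index = next(i for i, p in enumerate(sample) if p[attr] == cat)
        sample.pop(index)
    raise ValueError("no sample within the allowable deviation was found")


if __name__ == "__main__":
    # Toy cohort: 60 female and 40 male patients, target 50/50 within 1 point.
    cohort = [{"sex": "F"}] * 60 + [{"sex": "M"}] * 40
    matched = task_based_sample(cohort, {"sex": {"F": 0.5, "M": 0.5}}, tolerance=0.01)
    print(len(matched), Counter(p["sex"] for p in matched))
```

A greedy scheme like this only ever shrinks the cohort, so a tighter tolerance generally yields a smaller sample. With several attributes it is not guaranteed to find the largest possible matched sample, and a real implementation would likely need a more careful search.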