Task-Based Sampling of Patient Data for Rigorous Machine Learning/AI Performance Assessment.

Natalie Baughan¹,², Heather M Whitney³, Karen Drukker³

¹Department of Radiation Oncology, Henry Ford Health, Detroit, MI, 48202, USA. nbaugha1@hfhs.org.

Journal of Imaging Informatics in Medicine

March 10, 2026

Summary

This summary is machine-generated.

A new task-based sampling algorithm draws representative datasets from large repositories for AI training and performance assessment. Matching the sampled data to the intended patient population reduces sampling bias.

Keywords:

Algorithm performance; Bias mitigation; Data sampling; Image database; Machine learning

Area of Science:

Medical informatics
Artificial intelligence in healthcare
Data science

Background:

AI algorithm performance assessment requires independent datasets representative of the intended clinical population.
Using all available data can be impractical and may introduce sampling bias.
Representative data is crucial for reliable AI model training and validation.

Purpose of the Study:

To develop and demonstrate a computational method for task-based data sampling from large repositories.
To generate datasets matched to specific demographic and clinical profiles for AI performance assessment.
To mitigate sampling bias in AI algorithm development and evaluation.

Main Methods:

A task-based sampling algorithm was developed, requiring users to define an initial cohort, target distribution, and allowable deviation.
The algorithm was applied to the Medical Imaging and Data Resource Center (MIDRC) data commons.
Demographic characteristics and disease states were used as clinical attributes for matching to an intended population profile (e.g., CDC demographics).
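
The three user inputs named above (initial cohort, target distribution, allowable deviation) can be illustrated with a minimal sketch. This is not the published algorithm, whose internals are not described here. It assumes a single categorical attribute and simple quota sampling, and the function name `sample_to_target`, the record layout, and the toy cohort are all hypothetical.

```python
import random
from collections import Counter


def sample_to_target(cohort, attribute, target, tolerance, seed=0):
    """Return the largest random subsample whose proportions for `attribute`
    stay within `tolerance` of `target` (a dict mapping category -> share).

    Illustrative quota sampling only; not the method used in the study.
    """
    rng = random.Random(seed)
    # Group patients by category; patients outside the target categories are dropped.
    pools = {c: [p for p in cohort if p[attribute] == c] for c in target}
    n = sum(len(pool) for pool in pools.values())
    while n > 0:
        # Number of patients each category should contribute at sample size n.
        quotas = {c: round(n * share) for c, share in target.items()}
        total = sum(quotas.values())
        enough = all(quotas[c] <= len(pools[c]) for c in target)
        if total > 0 and enough and all(
            abs(quotas[c] / total - target[c]) <= tolerance for c in target
        ):
            sample = []
            for c, k in quotas.items():
                sample.extend(rng.sample(pools[c], k))
            return sample
        n -= 1  # shrink the requested size until every quota can be filled
    return []


# Toy data (made up for illustration): 60 patients in group "A", 40 in group "B".
cohort = [{"id": i, "group": "A" if i < 60 else "B"} for i in range(100)]
sample = sample_to_target(cohort, "group", {"A": 0.5, "B": 0.5}, tolerance=0.01)
counts = Counter(p["group"] for p in sample)
print(len(sample), dict(counts))
```

The sketch shrinks the requested sample size until the scarcest category can fill its quota, which is why a matched sample is typically much smaller than the initial cohort. Matching several attributes jointly, as the study does with demographics and disease state, would need a more elaborate procedure.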

Main Results:

The algorithm sampled two cohorts (542 and 870 patients) from an initial cohort of more than 4000 patients.
The sampled cohorts closely matched the target demographic distribution, with low average differences across clinical attributes (1.0% and 2.1%).
The method demonstrated effectiveness in generating matched samples for AI performance assessment.
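
One plausible way to compute an average attribute difference like the figures reported above is sketched here. The exact definition used in the study is not given in this summary, so the sketch assumes the mean absolute difference, in percentage points, between sampled and target proportions across categories. The function name `mean_abs_difference` and the example values are hypothetical.

```python
from collections import Counter


def mean_abs_difference(sample_values, target):
    """Mean absolute difference, in percentage points, between the observed
    category proportions in `sample_values` and the `target` proportions.

    Assumed definition for illustration; the study may define it differently.
    """
    counts = Counter(sample_values)
    n = len(sample_values)
    diffs = [abs(counts.get(c, 0) / n - share) for c, share in target.items()]
    return 100 * sum(diffs) / len(diffs)


# Made-up example: a sample that is 52% "F" and 48% "M" against a 50/50 target.
values = ["F"] * 52 + ["M"] * 48
difference = mean_abs_difference(values, {"F": 0.5, "M": 0.5})
print(round(difference, 2))
```

Averaging over categories gives one number per attribute; a summary across several attributes could average those numbers again, which is a further assumption.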

Conclusions:

The developed task-based sampling algorithm effectively generates matched samples from large datasets.
This approach reduces sampling bias, enhancing the reliability of AI algorithm training and performance assessment.
The method provides a valuable tool for creating representative datasets in medical AI research.