Key Measures for Evaluating Diagnostic Accuracy in Multi-Class Classification: An Overview and Simulation-Based

Leeha Ryu¹, Kyunghwa Han²,³, Inkyung Jung⁴

  • ¹ Department of Biostatistics and Computing, Yonsei University Graduate School, Seoul, Republic of Korea.

Korean Journal of Radiology | March 31, 2026
Summary
This summary is machine-generated.

A simulation-based evaluation of multi-class classification metrics shows that most perform well with balanced data, whereas the M-index and the polytomous discrimination index remain more stable with imbalanced datasets, a property that matters for medical predictive modeling.

Keywords:
Accuracy, Index, Measure, Metrics, Multiclass classification, Performance, Polytomous outcome prediction

Area of Science:

  • Artificial Intelligence
  • Medical Informatics
  • Statistical Modeling

Background:

  • Advances in AI are driving predictive modeling in medicine.
  • Increasingly complex prediction systems need robust multi-class classification metrics.
  • Few studies compare multi-class metrics under varied data conditions.

Purpose of the Study:

  • To provide an overview of common multi-class classification accuracy metrics.
  • To systematically evaluate diagnostic accuracy measures via simulation.
  • To offer practical guidance for metric selection in multi-class tasks.

Main Methods:

  • Overview of established multi-class classification metrics.
  • Simulation study across diverse scenarios (3- and 5-class, balanced/imbalanced data, varying predictor distributions).
  • Assessment of bias and 95% confidence interval coverage for each metric.
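
The two assessment criteria named above, bias and 95% confidence interval coverage, can be illustrated with a minimal sketch. It is not the study's simulation: it assumes a single metric (overall accuracy), a Wald interval, and hypothetical settings (`true_acc`, `n`, `n_reps`), chosen only to show how the two quantities are computed over repeated simulated datasets.

```python
import math
import random


def wald_ci(p_hat, n, z=1.96):
    """Approximate 95% Wald interval for a proportion (an assumed CI method)."""
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def simulate_bias_and_coverage(true_acc=0.8, n=200, n_reps=2000, seed=0):
    """Repeatedly simulate a test set and estimate accuracy on it.

    bias     = mean of the estimates minus the true value
    coverage = share of replicates whose interval contains the true value
    """
    rng = random.Random(seed)
    estimates = []
    covered = 0
    for _ in range(n_reps):
        # Each case is classified correctly with probability true_acc.
        correct = sum(rng.random() < true_acc for _ in range(n))
        p_hat = correct / n
        lo, hi = wald_ci(p_hat, n)
        estimates.append(p_hat)
        covered += lo <= true_acc <= hi
    bias = sum(estimates) / n_reps - true_acc
    coverage = covered / n_reps
    return bias, coverage


bias, coverage = simulate_bias_and_coverage()
print(f"bias={bias:+.4f}, coverage={coverage:.3f}")
```

A well-behaved metric has bias near zero and coverage near the nominal 0.95; the same loop could wrap any other metric and interval method.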

Main Results:

  • Most metrics showed stable, unbiased performance under balanced conditions.
  • Bias was greater under imbalanced conditions; the M-index and the polytomous discrimination index remained more stable.
  • The micro-averaged area under the ROC curve consistently showed higher bias under class imbalance.
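
To make two of the metrics named above concrete, here is a minimal sketch under stated assumptions. It takes the M-index to be the Hand and Till average of pairwise AUCs, and the micro-averaged AUC to be the AUC over all pooled one-vs-rest (case, class) pairs. The function names and toy data are hypothetical, and the page does not say how the paper computed either metric.

```python
from itertools import combinations


def auc(pos, neg):
    """Mann-Whitney AUC: P(positive score > negative score), ties count 0.5."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def m_index(y, probs, n_classes):
    """Average of pairwise AUCs over all unordered class pairs."""
    total = 0.0
    for i, j in combinations(range(n_classes), 2):
        # A(i|j): how well the class-i probability separates class i from class j.
        a_ij = auc([p[i] for p, t in zip(probs, y) if t == i],
                   [p[i] for p, t in zip(probs, y) if t == j])
        # A(j|i): the same comparison using the class-j probability.
        a_ji = auc([p[j] for p, t in zip(probs, y) if t == j],
                   [p[j] for p, t in zip(probs, y) if t == i])
        total += (a_ij + a_ji) / 2
    return 2 * total / (n_classes * (n_classes - 1))


def micro_auc(y, probs, n_classes):
    """Pool every (case, class) pair into one binary problem, then take its AUC."""
    pos = [p[k] for p, t in zip(probs, y) for k in range(n_classes) if t == k]
    neg = [p[k] for p, t in zip(probs, y) for k in range(n_classes) if t != k]
    return auc(pos, neg)


# Toy 3-class data: two cases per class, each row is (P(class 0), P(class 1), P(class 2)).
y = [0, 0, 1, 1, 2, 2]
probs = [
    [0.8, 0.1, 0.1], [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1], [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7], [0.1, 0.1, 0.8],
]
print(m_index(y, probs, 3), micro_auc(y, probs, 3))
```

Each pairwise AUC in the M-index uses only cases from the two classes being compared, whereas the micro-average pools all cases, so class prevalence enters the two metrics differently.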

Conclusions:

  • Metric performance varies significantly with data balance.
  • M-index and polytomous discrimination index are recommended for imbalanced multi-class medical data.
  • Systematic evaluation aids informed metric selection in AI-driven medical diagnostics.