PMLB: a large benchmark suite for machine learning evaluation and comparison.

Randal S Olson¹, William La Cava¹, Patryk Orzechowski¹,²

¹Institute for Biomedical Informatics, University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, PA 19104, USA.

December 15, 2017

Summary

This summary is machine-generated.

Selecting machine learning benchmarks is challenging because benchmark datasets are inconsistently organized. This study introduces a curated public benchmark resource and finds that current benchmarks lack the diversity needed to evaluate machine learning algorithms effectively.

Keywords:

Benchmarking Data repository Machine learning Model evaluation

Area of Science:

Computer Science
Data Mining
Machine Learning

Background:

Selecting and developing machine learning (ML) methods is complex and depends on study-specific goals.
Inconsistent organization of benchmark datasets creates a burden for ML practitioners and data scientists.
A need exists for standardized, accessible benchmark resources.

Purpose of the Study:

To introduce a public, curated benchmark resource for evaluating ML methodologies.
To characterize the diversity of available datasets using meta-feature comparison.
To analyze the performance clustering of ML algorithms across benchmark datasets.

Main Methods:

Development of an accessible, curated public benchmark resource.
Meta-feature comparison of datasets within the resource to assess data diversity.
Application of established ML methods to the benchmark suite for performance analysis.
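
As a rough illustration of what a meta-feature comparison involves, the sketch below computes a few simple meta-features for one classification dataset. The function name `meta_features`, the choice of meta-features, and the imbalance formula are assumptions made for illustration; this summary does not state which meta-features the study used, and the toy data in the usage note is invented.

```python
from collections import Counter


def meta_features(rows, labels):
    """Return a few simple meta-features for a classification dataset.

    rows   : list of feature vectors (one list per instance)
    labels : list of class labels, same length as rows
    """
    n_instances = len(rows)
    n_features = len(rows[0]) if rows else 0
    counts = Counter(labels)
    n_classes = len(counts)

    # Illustrative imbalance score (an assumption, not taken from the source):
    # 0.0 when all observed classes are equally frequent, approaching 1.0
    # as a single class dominates. Defined as 0.0 for fewer than 2 classes.
    if n_classes < 2:
        imbalance = 0.0
    else:
        squared_dev = sum(
            (c / n_instances - 1 / n_classes) ** 2 for c in counts.values()
        )
        imbalance = squared_dev * n_classes / (n_classes - 1)

    return {
        "n_instances": n_instances,
        "n_features": n_features,
        "n_classes": n_classes,
        "imbalance": imbalance,
    }
```

Calling `meta_features([[1, 2], [3, 4], [5, 6], [7, 8]], [0, 0, 0, 1])` on this made-up toy dataset describes it as 4 instances, 2 features, and 2 classes with some imbalance. Computing such a dictionary for every dataset in a suite gives one vector per dataset, and comparing those vectors is one way to judge how diverse the suite is.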

Main Results:

The study identified a lack of diversity in existing ML benchmarks.
Analysis revealed gaps in current benchmarking problems.
Performance clustering indicated limitations in the ability of current benchmarks to differentiate ML algorithms effectively.
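
One way to picture performance clustering is to group datasets whose per-algorithm scores look alike. The sketch below is a minimal example under stated assumptions: it uses a simple distance-threshold (single-linkage) grouping, which is not necessarily the clustering procedure used in the study, and the function name `cluster_by_performance`, the threshold, and the scores in the usage note are all hypothetical.

```python
import math


def cluster_by_performance(scores, threshold):
    """Group datasets whose algorithm-score vectors are close together.

    scores    : dict mapping dataset name -> list of scores, one per
                algorithm, in the same algorithm order for every dataset
    threshold : maximum Euclidean distance for two datasets to be linked
    """
    names = sorted(scores)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative,
        # shortening the path along the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Link every pair of datasets whose score vectors are within threshold.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if math.dist(scores[a], scores[b]) <= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())
```

With invented scores such as `{"easy_a": [0.99, 0.98, 0.99], "easy_b": [0.98, 0.99, 0.97], "hard_a": [0.60, 0.85, 0.55]}` and a threshold of `0.1`, the two "easy" datasets end up in one group and `hard_a` in its own. A group where every algorithm scores about the same contributes little to telling algorithms apart, which is the kind of limitation the results describe.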

Conclusions:

The resource helps researchers understand the limitations of current ML methodologies.
This work is a step towards more diverse and efficient future benchmarking standards.
It highlights the need for improved and expanded benchmark suites for robust ML evaluation.