Updated: Sep 11, 2025

Efficient Benchmarking via Bias-Bounded Subset Selection.

Yan Zhuang, Junhao Yu, Qi Liu

IEEE Transactions on Pattern Analysis and Machine Intelligence

August 12, 2025

Summary

This summary is machine-generated.

Evaluating artificial intelligence (AI) systems efficiently depends on selecting informative benchmark subsets. This study introduces a greedy subset selection algorithm with theoretical guarantees on estimation bias. In the reported experiments, it estimates AI model scores accurately from only 30% of the full benchmark, reducing computational cost.

Area of Science:

Artificial Intelligence
Machine Learning Evaluation
Computational Efficiency

Background:

Evaluating large artificial intelligence (AI) models is computationally intensive and costly.
Current methods for AI model evaluation often require extensive benchmarks, leading to high resource expenditure.
There is a need for efficient AI evaluation techniques that minimize computational and human costs.

Purpose of the Study:

To formally define and approximate the subset selection problem for efficient AI model evaluation.
To develop a method for identifying valuable benchmark subsets that ensures theoretical guarantees.
To reduce the cost and improve the efficiency of evaluating large AI models.

Main Methods:

Formal definition and approximation of the subset selection problem in efficient AI evaluation.
Proof that the subset selection problem amounts to optimizing a submodular function.
Application of a simple greedy algorithm for unified subset identification.
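
To make the idea of greedy selection over a submodular objective concrete, the following is a minimal sketch, not the authors' implementation. It assumes a facility-location objective as a stand-in for the paper's (unstated here) objective, and it assumes each benchmark item is described by the pass/fail results of a few hypothetical reference models. The names `similarity`, `coverage`, and `greedy_select` are illustrative only.

```python
from typing import List, Sequence


def similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of reference models on which two items agree (1.0 = identical)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def coverage(selected: List[int], items: List[Sequence[int]]) -> float:
    """Facility-location value: every item is credited with its best match in `selected`.

    This function is monotone and submodular; it is an assumed stand-in objective.
    """
    if not selected:
        return 0.0
    return sum(max(similarity(item, items[j]) for j in selected) for item in items)


def greedy_select(items: List[Sequence[int]], budget: int) -> List[int]:
    """Repeatedly add the item with the largest marginal gain in coverage."""
    selected: List[int] = []
    for _ in range(min(budget, len(items))):
        current = coverage(selected, items)
        best_idx, best_gain = -1, -1.0
        for j in range(len(items)):
            if j in selected:
                continue
            gain = coverage(selected + [j], items) - current
            if gain > best_gain:  # strict comparison: ties keep the earlier index
                best_idx, best_gain = j, gain
        selected.append(best_idx)
    return selected


# Toy data (hypothetical): rows are benchmark items, columns are pass/fail
# results of four reference models.
items = [
    (1, 1, 1, 1),
    (1, 1, 1, 0),
    (0, 0, 0, 0),
    (0, 0, 0, 1),
    (1, 0, 1, 0),
]
subset = greedy_select(items, budget=2)
print("selected item indices:", subset)
print("coverage of selection:", coverage(subset, items))
```

For monotone submodular objectives under a size budget, greedy selection of this kind is known to reach at least a (1 - 1/e) fraction of the optimal value, which is one reason a simple greedy rule can come with guarantees. Whether the paper relies on this particular bound is not stated in this summary.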

Main Results:

The proposed method provides the first theoretical guarantees for bias control and generalizability in AI score estimation.
Experimental results with language models across 11 benchmarks demonstrate superior performance.
Accurate AI model score estimation was achieved using only 30% of the full benchmark.
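
As a rough illustration of what "score estimation from 30% of the benchmark" means in practice, the sketch below compares a model's accuracy on a full benchmark with its accuracy on a 30% subset. Everything here is an assumption for illustration: the pass/fail outcomes are synthetic, and the subset is drawn at random as a placeholder rather than chosen by the paper's greedy method.

```python
import random
from typing import List


def accuracy(results: List[int]) -> float:
    """Mean of 0/1 outcomes, i.e. the fraction of items answered correctly."""
    return sum(results) / len(results)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic pass/fail outcomes for one hypothetical model on 1000 items.
full_results = [1 if rng.random() < 0.7 else 0 for _ in range(1000)]

# Keep 30% of the items. Random sampling is a placeholder for a
# principled selection method such as the one the study proposes.
subset_size = len(full_results) * 30 // 100
subset_idx = rng.sample(range(len(full_results)), subset_size)

full_score = accuracy(full_results)
subset_score = accuracy([full_results[i] for i in subset_idx])
abs_error = abs(full_score - subset_score)

print(f"full-benchmark score: {full_score:.3f}")
print(f"30%-subset estimate:  {subset_score:.3f}")
print(f"absolute error:       {abs_error:.3f}")
```

The absolute error printed at the end is the quantity a bias bound would aim to control; a selection method with guarantees would be expected to keep it small across models, not just on one synthetic run.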

Conclusions:

A greedy algorithm effectively solves the subset selection problem for efficient AI evaluation.
This approach significantly reduces the data required for accurate AI model performance estimation.
The findings facilitate efficient and sparse benchmark design for AI systems.