



Prediction and outlier detection in classification problems.

Leying Guan¹, Robert Tibshirani²

  • ¹ Yale University, New Haven, CT, USA.

Journal of the Royal Statistical Society. Series B, Statistical Methodology | August 1, 2022
Summary
This summary is machine-generated.

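This study introduces Balanced and Conformal Optimized Prediction Sets (BCOPS), a method for multi-class classification when the training and test data follow different distributions. For each observation, BCOPS returns a set of candidate class labels that aims to include the correct class, and it returns an empty set when it infers the observation to be an outlier. The prediction sets carry a finite-sample coverage guarantee that requires no distributional assumptions.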

Keywords:
BCOPS, conformal inference, distributional change, label shift, set-valued prediction


Area of Science:

  • Machine Learning
  • Statistical Learning Theory
  • Data Science

Background:

  • Standard multi-class classification assumes training and test data follow identical distributions, which is often violated in real-world scenarios.
  • Outlier detection and robust prediction set construction are crucial for reliable decision-making when data distributions shift.

Purpose of the Study:

  • To propose a novel method, Balanced and Conformal Optimized Prediction Sets (BCOPS), for multi-class classification under distribution shifts.
  • To optimize prediction sets for out-of-sample performance, balancing correct class inclusion and outlier detection.
  • To provide finite-sample coverage guarantees without requiring distributional assumptions.

Main Methods:

  • BCOPS combines supervised learning algorithms with conformal prediction principles.
  • It constructs a prediction set C(x) for each observation x as a subset of the class labels; C(x) may be empty, which flags x as an outlier.
  • The method minimizes a misclassification loss averaged over the out-of-sample distribution.
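
The idea of set-valued prediction with an empty-set outlier flag can be illustrated with a minimal sketch of generic class-wise split conformal prediction. This is not the BCOPS procedure itself: it omits the step that optimizes the score for out-of-sample performance, and the function names, the "higher score means more typical of the class" convention, and the miscoverage level `alpha` are all assumptions made for illustration.

```python
import numpy as np


def conformal_p_values(cal_scores, test_scores):
    """Conformal p-values for one class (hypothetical helper).

    cal_scores: conformity scores of held-out points known to belong to the class.
    test_scores: scores of the test points evaluated under that same class.
    Higher score is assumed to mean "more typical of the class".
    """
    cal = np.sort(np.asarray(cal_scores))
    # Number of calibration scores less than or equal to each test score.
    counts = np.searchsorted(cal, np.asarray(test_scores), side="right")
    return (counts + 1) / (len(cal) + 1)


def prediction_sets(cal_scores_by_class, test_scores_by_class, alpha=0.1):
    """Build one label set per test point; an empty set flags a possible outlier.

    Both arguments are dicts keyed by class label (an assumed interface).
    """
    n_test = len(next(iter(test_scores_by_class.values())))
    sets = [set() for _ in range(n_test)]
    for label in sorted(cal_scores_by_class):
        p = conformal_p_values(cal_scores_by_class[label], test_scores_by_class[label])
        # Keep the label only if the test point is not in the lowest alpha tail.
        for i in np.flatnonzero(p > alpha):
            sets[i].add(label)
    return sets


# Toy data: 19 calibration scores per class; three test points.
cal = {0: np.arange(1, 20), 1: np.arange(1, 20)}
test = {0: np.array([10, 0, 0]), 1: np.array([0, 10, 0])}
print(prediction_sets(cal, test, alpha=0.1))
```

In this toy run the first test point is typical of class 0 only, the second of class 1 only, and the third scores below every calibration point for both classes, so its set is empty and it would be treated as an outlier.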

Main Results:

  • BCOPS provides a finite-sample coverage guarantee for its prediction sets that holds without distributional assumptions.
  • The method demonstrates the ability to detect outliers by returning an empty prediction set.
  • Asymptotic consistency and optimality of the proposed methods are proven under stated assumptions.
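
A coverage guarantee of this kind is often written class by class. The display below is a sketch of that common form rather than the paper's exact statement; the miscoverage level \alpha, the number of classes K, and the random pair (X, Y) are notation assumed here, since the summary does not spell them out.

```latex
\[
  \mathbb{P}\bigl(k \in C(X) \mid Y = k\bigr) \;\ge\; 1 - \alpha,
  \qquad k = 1, \dots, K,
\]
\[
  \text{and } x \text{ is flagged as an outlier when } C(x) = \varnothing .
\]
```

Read this way, the guarantee bounds how often the true class is left out of the set, while the empty set remains available for observations that resemble none of the training classes.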

Conclusions:

  • BCOPS offers a robust framework for multi-class classification when training and test data distributions differ.
  • The method enhances prediction reliability by incorporating outlier detection and providing coverage guarantees.
  • The study also proposes a method for estimating the outlier detection rate, which helps evaluate the performance of a classification procedure.