



Error rate control for classification rules in multiclass mixture models.

Tristan Mary-Huard (1,2), Vittorio Perduca (3), Marie-Laure Martin-Magniette (1,2)

  • (1) MIA-Paris, INRAE, AgroParisTech, Université Paris-Saclay, Paris, 75005, France.

The International Journal of Biostatistics | November 30, 2021
PubMed
Summary
This summary is machine-generated.

This study introduces a new multiclass False Discovery Rate (FDR)-like rule for finite mixture models, optimizing classification by minimizing Type II errors while controlling Type I errors. The novel rule is less conservative than traditional thresholded Maximum A Posteriori (MAP) rules.

Keywords:
classification rule, error rate control, mixture models


Area of Science:

  • Statistics
  • Machine Learning
  • Data Science

Background:

  • Finite mixture models are widely used for data clustering and classification.
  • Controlling classification error rates is crucial for reliable model interpretation.
  • Existing methods often use thresholded Maximum A Posteriori (MAP) rules, which can be conservative.
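As an illustration of the baseline mentioned above, a thresholded MAP rule can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes a one-dimensional Gaussian mixture with known parameters, and the function names, the mixture parameters, the threshold of 0.9, and the use of -1 for "not classified" are all hypothetical choices made for the example.

```python
import numpy as np

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def posteriors(x, weights, means, sds):
    """Posterior class probabilities of a 1-D Gaussian mixture.

    Returns an array of shape (n_observations, n_classes) whose rows sum to 1.
    """
    x = np.asarray(x, dtype=float)[:, None]
    joint = np.asarray(weights) * gaussian_pdf(x, np.asarray(means), np.asarray(sds))
    return joint / joint.sum(axis=1, keepdims=True)

def thresholded_map(post, threshold):
    """Assign the MAP class when its posterior reaches the threshold.

    Observations whose largest posterior is below the threshold are left
    unclassified and labeled -1 (an illustrative convention).
    """
    labels = post.argmax(axis=1)
    labels[post.max(axis=1) < threshold] = -1
    return labels

# Hypothetical three-class mixture, assumed known for the example.
weights, means, sds = [0.5, 0.3, 0.2], [-3.0, 0.0, 3.0], [1.0, 1.0, 1.0]
x = np.array([-3.0, -1.5, 0.0, 1.5, 3.0])

post = posteriors(x, weights, means, sds)
labels = thresholded_map(post, threshold=0.9)
print(labels)  # [ 0 -1  1 -1  2]
```

Points near a component mean are classified, while points midway between two components stay unclassified. Because the threshold applies to every observation individually, raising it reduces misclassification at the cost of leaving more observations unclassified.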

Purpose of the Study:

  • To develop an optimal classification rule for finite mixture models that minimizes Type II error rates while controlling Type I error rates.
  • To define and evaluate a multiclass False Discovery Rate (FDR)-like rule.
  • To compare the performance of the proposed rule against standard thresholded MAP rules.
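The first aim above can be written schematically as a constrained optimization problem. The notation below is an assumption introduced for illustration and is not taken from the paper: δ denotes a classification rule over K classes that may leave an observation unclassified, R_I and R_II denote the Type I and Type II classification error rates, and α is the nominal level at which the Type I rate is controlled.

```latex
\begin{aligned}
&\delta : \mathcal{X} \to \{0, 1, \dots, K\},
  \qquad \delta(x) = 0 \ \text{meaning ``not classified''},\\[4pt]
&\min_{\delta}\ R_{\mathrm{II}}(\delta)
  \quad \text{subject to} \quad R_{\mathrm{I}}(\delta) \le \alpha .
\end{aligned}
```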

Main Methods:

  • Defining Type I and Type II classification error rates analogous to statistical test theory.
  • Identifying an optimal region in the observation space for applying the MAP rule.
  • Developing a heuristic for computing the optimal classification rule.
  • Implementing and comparing a multiclass FDR-like rule with thresholded MAP rules.
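To make the comparison in the last method concrete, the sketch below contrasts a thresholded MAP rule with one simple FDR-like rule. It is not the heuristic developed in the paper. It uses a common posterior-based construction, assumed here for illustration: observations are ranked by the estimated error probability of their MAP label (one minus the largest posterior), and as many as possible are classified while the average of that quantity over the classified set stays at or below α. The function name, the simulated posteriors, and the -1 label for "not classified" are hypothetical.

```python
import numpy as np

def fdr_like_rule(post, alpha):
    """Classify the largest set of observations whose average estimated
    misclassification probability does not exceed alpha.

    post  : array (n_observations, n_classes) of posterior probabilities
    alpha : target level for the average error among classified observations
    Returns MAP labels, with -1 for observations left unclassified.
    """
    map_labels = post.argmax(axis=1)
    err = 1.0 - post.max(axis=1)             # estimated error of each MAP label
    order = np.argsort(err, kind="stable")   # most confident observations first
    running_mean = np.cumsum(err[order]) / np.arange(1, len(err) + 1)
    ok = np.nonzero(running_mean <= alpha)[0]
    n_keep = int(ok[-1]) + 1 if ok.size else 0   # largest admissible set size
    labels = np.full(len(err), -1)
    labels[order[:n_keep]] = map_labels[order[:n_keep]]
    return labels

# Simulated posteriors standing in for the output of a fitted mixture model.
rng = np.random.default_rng(0)
post = rng.dirichlet([0.5, 0.5, 0.5], size=200)
alpha = 0.05

fdr_labels = fdr_like_rule(post, alpha)
# Thresholded MAP at level 1 - alpha: every classified observation must
# individually have an estimated error probability of at most alpha.
thr_labels = np.where(post.max(axis=1) >= 1 - alpha, post.argmax(axis=1), -1)

print("classified by thresholded MAP:", int((thr_labels != -1).sum()))
print("classified by FDR-like rule:  ", int((fdr_labels != -1).sum()))
```

Under this construction the FDR-like rule never classifies fewer observations than the thresholded MAP rule at threshold 1 − α, because any set in which every estimated error is at most α also has an average error of at most α. This is one way to see why a rule that controls an average rate can be less conservative than a per-observation threshold.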

Main Results:

  • Finding the optimal classification rule amounts to finding an optimal region in the observation space on which the MAP rule is applied.
  • The shape of this optimal region depends on the misclassification rate to be controlled.
  • The proposed multiclass FDR-like optimal rule is less conservative than thresholded MAP rules.
  • Validation on simulated and real datasets confirms the practical advantages of the FDR-like rule.

Conclusions:

  • The optimal classification rule minimizes Type II errors in finite mixture models while keeping the Type I error rate under control.
  • The multiclass FDR-like rule provides a more powerful alternative to conventional thresholded MAP rules.
  • This approach enhances the ability to classify observations accurately while maintaining control over error rates.