Suboptimal Comparison of Partitions.

Jonathon J O'Brien¹, Michael T Lawson², Devin K Schweppe¹

¹Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA.

Journal of Classification

January 9, 2025

Summary

This summary is machine-generated.

Optimal clustering and optimal classification solutions can diverge, even when the data-generating model is known. No standard validation measure guarantees optimal clustering performance, which motivates alternative evaluation approaches suited to post hoc interpretation.

Keywords:

Classification; Clustering; Hierarchical receiver operating characteristic; Sensitivity; Specificity; Triplet index

Area of Science:

Computational statistics
Machine learning
Data mining

Background:

Classification and clustering are distinct data analysis tasks, often differentiated by the availability of prior labels.
Theoretical analysis reveals that optimal solutions for clustering do not always align with optimal solutions for classification, even when the data-generating model is known.

Purpose of the Study:

To explore the divergence between optimal clustering and classification.
To identify limitations of standard internal and external validation measures in guaranteeing optimal clustering performance.
To recommend suboptimal evaluation strategies for clustering that offer valuable post hoc interpretation.

Main Methods:

Theoretical exploration of the relationship between clustering and classification optimality.
Analysis of standard internal and external validation indices.
Development and recommendation of alternative evaluation metrics for clustering performance.

Main Results:

No standard internal or external validation measure can ensure correspondence with optimal clustering.
Pairwise linkage-based indices offer clear probabilistic interpretations for clustering.
Triplet-based indices reveal higher-level data structures, and ROC curves from hierarchical clustering dendrograms provide nuanced insights.
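
As a rough illustration of a pairwise linkage-based index, the sketch below treats every pair of items as "linked" when a partition places both in the same group, then reports sensitivity and specificity of the predicted links against the reference links. The function names and toy labels are hypothetical, and this is one common way to define such an index, not necessarily the exact formulation used in the paper.

```python
from itertools import combinations


def pairwise_confusion(truth, predicted):
    """Count item pairs by whether each partition links them (same group)."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have equal length")
    tp = fp = fn = tn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = predicted[i] == predicted[j]
        if same_truth and same_pred:
            tp += 1          # linked in both partitions
        elif same_pred:
            fp += 1          # linked only by the clustering
        elif same_truth:
            fn += 1          # linked only in the reference
        else:
            tn += 1          # separated in both partitions
    return tp, fp, fn, tn


def pairwise_sensitivity_specificity(truth, predicted):
    """Sensitivity = P(linked by clustering | truly linked);
    specificity = P(separated by clustering | truly separated)."""
    tp, fp, fn, tn = pairwise_confusion(truth, predicted)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


# Toy example (made-up labels): two reference groups, three predicted clusters.
truth = [0, 0, 0, 1, 1, 1]
predicted = [0, 0, 1, 1, 2, 2]
sens, spec = pairwise_sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Read this way, each number has a direct probabilistic meaning: sensitivity is the chance that a truly co-grouped pair is also clustered together, and specificity is the chance that a truly separated pair stays separated.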

Conclusions:

Standard validation metrics are insufficient for guaranteeing optimal clustering outcomes.
Suboptimal evaluation methods, including pairwise and triplet indices, are valuable for post hoc cluster interpretation.
Graphical methods like ROC curves from dendrograms offer richer information than single-number summaries for understanding clustering results.
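
One way to picture an ROC curve derived from a dendrogram is to sweep the cut height and, at each height, score the pairs that the cut links against a reference partition. The sketch below assumes single linkage on one-dimensional points purely to stay dependency-free; the linkage choice, the function names, and the toy data are all assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations


def single_linkage_cophenetic(points):
    """Return {(i, j): height at which i and j first merge} under single linkage."""
    n = len(points)
    edges = sorted((abs(points[i] - points[j]), i, j)
                   for i, j in combinations(range(n), 2))
    members = {i: [i] for i in range(n)}   # cluster id -> item indices
    cluster_of = list(range(n))            # item index -> cluster id
    coph = {}
    for dist, i, j in edges:
        a, b = cluster_of[i], cluster_of[j]
        if a == b:
            continue                       # already merged at a lower height
        for u in members[a]:
            for v in members[b]:
                coph[(min(u, v), max(u, v))] = dist
        for v in members[b]:
            cluster_of[v] = a
        members[a].extend(members.pop(b))
    return coph


def dendrogram_roc(points, truth):
    """ROC points (false positive rate, true positive rate), one per cut height."""
    coph = single_linkage_cophenetic(points)
    positives = sum(truth[i] == truth[j] for i, j in coph)
    negatives = len(coph) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("reference needs both linked and unlinked pairs")
    roc = [(0.0, 0.0)]                     # cut below every merge: nothing linked
    for height in sorted(set(coph.values())):
        linked = [pair for pair, h in coph.items() if h <= height]
        tp = sum(truth[i] == truth[j] for i, j in linked)
        fp = len(linked) - tp
        roc.append((fp / negatives, tp / positives))
    return roc


# Toy data (made up): two well-separated groups on a line.
points = [0.0, 1.0, 2.5, 10.0, 11.0, 12.5]
truth = [0, 0, 0, 1, 1, 1]
roc = dendrogram_roc(points, truth)
for fpr, tpr in roc:
    print(f"FPR={fpr:.3f}  TPR={tpr:.3f}")
```

Plotting these points shows how linkage sensitivity trades off against false links across the whole tree, which a single cut and a single index value would hide.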