



Suboptimal Comparison of Partitions.

Jonathon J O'Brien (1), Michael T Lawson (2), Devin K Schweppe (1)

  • (1) Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA.

Journal of Classification | January 9, 2025
Summary
This summary is machine-generated.

Optimal clustering and classification solutions diverge, even with known data models. Standard validation measures cannot guarantee optimal clustering performance, necessitating alternative evaluation approaches for post hoc interpretation.

Keywords:
Classification, Clustering, Hierarchical receiver operating characteristic, Sensitivity, Specificity, Triplet index


Area of Science:

  • Computational statistics
  • Machine learning
  • Data mining

Background:

  • Classification and clustering are distinct data analysis tasks, often differentiated by the availability of prior labels.
  • Theoretical analysis reveals that the optimal clustering solution does not always coincide with the optimal classification solution, even when the data-generating model is known.

Purpose of the Study:

  • To explore the divergence between optimal clustering and classification.
  • To identify limitations of standard internal and external validation measures in guaranteeing optimal clustering performance.
  • To recommend evaluation strategies that cannot guarantee optimality (hence "suboptimal") but still support useful post hoc interpretation of clustering results.

Main Methods:

  • Theoretical exploration of the relationship between clustering and classification optimality.
  • Analysis of standard internal and external validation indices.
  • Development and recommendation of alternative evaluation metrics for clustering performance.

Main Results:

  • No standard internal or external validation measure can ensure correspondence with optimal clustering.
  • Pairwise linkage-based indices offer clear probabilistic interpretations for clustering.
  • Triplet-based indices reveal higher-level data structures, and ROC curves from hierarchical clustering dendrograms provide nuanced insights.
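
As a rough illustration of why pairwise linkage-based indices read as probabilities, the sketch below treats every unordered pair of items as one binary decision: "linked" if the two items share a cluster, "truly linked" if they share a reference class. Pairwise sensitivity is then the fraction of truly linked pairs that the clustering links, and pairwise specificity is the fraction of truly unlinked pairs that it keeps apart. The function name and the toy labels are hypothetical, and the definitions are the generic pair-counting ones, not necessarily the exact indices used in the paper.

```python
from itertools import combinations


def pairwise_sensitivity_specificity(true_labels, cluster_labels):
    """Return (sensitivity, specificity) computed over all unordered item pairs.

    Assumption: a pair is a "positive" when both items share a reference class,
    and a "predicted positive" when both items share a cluster.
    """
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")

    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_labels[i] == cluster_labels[j]
        if same_class and same_cluster:
            tp += 1  # truly linked pair, linked by the clustering
        elif same_class:
            fn += 1  # truly linked pair, split by the clustering
        elif same_cluster:
            fp += 1  # unlinked pair, wrongly joined by the clustering
        else:
            tn += 1  # unlinked pair, correctly kept apart

    # Guard against label sets with no linked (or no unlinked) pairs.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data (hypothetical): two reference classes, three clusters.
truth = ["a", "a", "a", "b", "b", "b"]
clusters = [0, 0, 1, 1, 2, 2]
sens, spec = pairwise_sensitivity_specificity(truth, clusters)
print(f"pairwise sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Under these assumptions, the sensitivity can be read as the probability that two items drawn from the same class land in the same cluster, and the specificity as the probability that two items from different classes are separated.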

Conclusions:

  • Standard validation metrics are insufficient for guaranteeing optimal clustering outcomes.
  • Suboptimal evaluation methods, including pairwise and triplet indices, are valuable for post hoc cluster interpretation.
  • Graphical methods like ROC curves from dendrograms offer richer information than single-number summaries for understanding clustering results.
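
One way to picture an ROC curve drawn from a dendrogram is to cut the tree at every merge height, score each resulting partition by its pairwise true-positive and false-positive rates, and plot the points. The sketch below assumes the cuts are already available as a list of flat label vectors (how they are produced is outside its scope) and uses generic pair-counting rates; the function names and toy data are hypothetical, and the paper's own hierarchical ROC construction may differ in detail.

```python
from itertools import combinations


def pair_rates(true_labels, cluster_labels):
    """Return (false positive rate, true positive rate) over unordered item pairs."""
    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_labels[i] == cluster_labels[j]
        if same_class:
            if same_cluster:
                tp += 1
            else:
                fn += 1
        else:
            if same_cluster:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one same-class and one different-class pair")
    return fp / (fp + tn), tp / (tp + fn)


def dendrogram_roc(true_labels, cuts):
    """Return one ROC point per dendrogram cut, sorted by false positive rate."""
    return sorted(pair_rates(true_labels, cut) for cut in cuts)


def trapezoid_auc(points):
    """Area under a piecewise-linear curve through (fpr, tpr) points sorted by fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy dendrogram (hypothetical): nested cuts from all singletons to one cluster.
truth = ["a", "a", "b", "b"]
cuts = [
    [0, 1, 2, 3],  # every item alone
    [0, 0, 1, 2],  # first merge joins items 0 and 1
    [0, 0, 1, 1],  # second merge joins items 2 and 3
    [0, 0, 0, 0],  # root: everything in one cluster
]
roc_points = dendrogram_roc(truth, cuts)
auc = trapezoid_auc(roc_points)
print(roc_points, auc)
```

Keeping the full list of points, rather than only the area, is what allows the curve to show how the sensitivity and specificity trade off at each level of the hierarchy.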