Powerful significance testing for unbalanced clusters.

Thomas H Keefe¹, J S Marron¹

  • ¹Department of Statistics & O.R., UNC-Chapel Hill.

Journal of Computational and Graphical Statistics : a Joint Publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America
August 26, 2025
Summary
This summary is machine-generated.

Statistical cluster validation is crucial for identifying true data structures. A new method improves upon SigClust for unbalanced cluster sizes, enhancing disease subtype discovery.

Keywords:
SigClust, class imbalance, cluster validation, clustering, hypothesis testing, k-means, unsupervised learning

Area of Science:

  • Data Science
  • Statistics
  • Bioinformatics

Background:

  • Clustering methods reveal data structure, especially in high dimensions.
  • Statistical cluster validation assesses the reality of discovered clusters.
  • The SigClust method is a benchmark but underperforms with unbalanced cluster sizes.
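
To make the idea of statistical cluster validation concrete, the following is a minimal sketch of a SigClust-style Monte Carlo test. It is not the authors' implementation and not the new method proposed in the study. It assumes the usual framing of this kind of test: the statistic is the 2-means cluster index (within-cluster sum of squares divided by total sum of squares), and the null hypothesis is that the data come from a single Gaussian. The function names (`two_means_cluster_index`, `sigclust_style_pvalue`) are hypothetical, and using the raw sample eigenvalues for the null covariance is a simplification.

```python
import numpy as np


def two_means_cluster_index(X, n_init=5, n_iter=100, rng=None):
    """Best 2-means cluster index over several random restarts.

    Cluster index = within-cluster sum of squares / total sum of squares.
    Smaller values indicate a stronger two-cluster structure.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    total_ss = np.sum((X - X.mean(axis=0)) ** 2)
    best_ci, best_labels = 1.0, np.zeros(n, dtype=int)
    for _ in range(n_init):
        # Start from two distinct data points (plain Lloyd iterations).
        centers = X[rng.choice(n, size=2, replace=False)]
        for _ in range(n_iter):
            dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dist.argmin(axis=1)
            if labels.min() == labels.max():  # one cluster became empty
                break
            new_centers = np.array(
                [X[labels == k].mean(axis=0) for k in (0, 1)]
            )
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        if labels.min() == labels.max():
            continue  # skip degenerate restarts
        within_ss = sum(
            np.sum((X[labels == k] - X[labels == k].mean(axis=0)) ** 2)
            for k in (0, 1)
        )
        ci = within_ss / total_ss
        if ci < best_ci:
            best_ci, best_labels = ci, labels
    return best_ci, best_labels


def sigclust_style_pvalue(X, n_sim=200, seed=0):
    """Monte Carlo p-value for the 2-means cluster index.

    Assumption: the null is a single Gaussian whose covariance eigenvalues
    are the raw sample eigenvalues of X (a simplification for illustration).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    observed_ci, _ = two_means_cluster_index(X, rng=rng)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    null_ci = np.empty(n_sim)
    for b in range(n_sim):
        # The cluster index is rotation invariant, so a diagonal
        # covariance with the same eigenvalues is sufficient.
        Z = rng.standard_normal((n, d)) * np.sqrt(eigvals)
        null_ci[b], _ = two_means_cluster_index(Z, rng=rng)
    p_value = (1 + np.sum(null_ci <= observed_ci)) / (1 + n_sim)
    return observed_ci, p_value


# Synthetic example: two well-separated, equally sized Gaussian clusters.
demo_rng = np.random.default_rng(1)
X = np.vstack([
    demo_rng.standard_normal((60, 5)),
    demo_rng.standard_normal((60, 5)) + np.array([8.0, 0, 0, 0, 0]),
])
ci, p = sigclust_style_pvalue(X, n_sim=100)
print(f"cluster index = {ci:.3f}, Monte Carlo p-value = {p:.4f}")
```

A small p-value indicates that the observed split is tighter than what 2-means typically finds in unclustered Gaussian data. The example uses balanced synthetic clusters only; it makes no attempt to reproduce the unbalanced-cluster behavior analyzed in the study.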

Purpose of the Study:

  • To address the limitations of SigClust in validating clusters with unequal sizes.
  • To propose a novel, powerful cluster validation method for both balanced and unbalanced data.
  • To improve the detection of rare subtypes in high-dimensional datasets.

Main Methods:

  • A novel generalization of k-means clustering was developed.
  • The proposed method enhances statistical cluster validation.
  • The approach was tested on high-dimensional gene expression data.
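
For reference, the standard k-means criterion that such a generalization starts from can be written as below. This is only the textbook baseline, stated under the usual notation (data points $x_i$, clusters $C_1,\dots,C_k$, cluster means $\bar{x}_{C_j}$, overall mean $\bar{x}$); the summary does not describe the form of the proposed generalization, so it is not shown here.

```latex
% Standard k-means objective: minimize the within-cluster sum of squares
\min_{C_1,\dots,C_k} \; \sum_{j=1}^{k} \sum_{i \in C_j}
  \lVert x_i - \bar{x}_{C_j} \rVert^2

% 2-means cluster index (k = 2): within-cluster over total sum of squares
\mathrm{CI} =
  \frac{\sum_{j=1}^{2} \sum_{i \in C_j} \lVert x_i - \bar{x}_{C_j} \rVert^2}
       {\sum_{i=1}^{n} \lVert x_i - \bar{x} \rVert^2}
```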

Main Results:

  • The new method demonstrates superior power in cluster validation, particularly with unbalanced cluster sizes.
  • The SigClust method's underperformance in unbalanced settings was explained.
  • The method proved effective in a real-world application with kidney cancer data.

Conclusions:

  • The developed method offers a powerful and versatile tool for statistical cluster validation.
  • This advancement is particularly valuable for identifying rare subtypes in complex datasets.
  • The study provides a Python implementation for practical application.