



Statistical power for cluster analysis.

Edwin S Dalmaijer1, Camilla L Nord2, Duncan E Astle2

  • 1MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK. edwin.dalmaijer@bristol.ac.uk.

BMC Bioinformatics | June 1, 2022
Summary
This summary is machine-generated.

Adequate statistical power is essential when cluster analysis is used to identify subgroups in biomedical data. In simulations, sufficient power was reached with small samples (N=20 per subgroup) when the effect size was large (cluster separation Δ=4), and fuzzy clustering offered advantages for partially overlapping distributions.

Keywords:
Cluster analysis, Covariance, Dimensionality reduction, Effect size, Latent class analysis, Latent profile analysis, Sample size, Simulation, Statistical power


Area of Science:

  • Biomedical data analysis
  • Statistical modeling
  • Machine learning

Background:

  • Cluster algorithms are increasingly used in biomedical research for subgroup identification.
  • Established methods for computing a priori statistical power for cluster analysis are lacking.
  • This study simulates common analysis pipelines to estimate statistical power and classification accuracy.

Purpose of the Study:

  • To estimate statistical power and classification accuracy for common cluster analysis pipelines via simulation.
  • To systematically investigate the influence of subgroup size, number, separation (effect size), and covariance structure.
  • To compare the statistical power of discrete, fuzzy, and finite mixture modeling approaches.

Main Methods:

  • Generated datasets with varied subgroup characteristics (size, number, separation, covariance).
  • Applied dimensionality reduction techniques (none, multi-dimensional scaling, UMAP).
  • Utilized clustering algorithms (k-means, hierarchical clustering, HDBSCAN) and compared discrete, fuzzy, and mixture models.
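
As a rough illustration of the simulation approach listed above, the sketch below estimates power for one simplified pipeline. It is not the authors' code. It assumes two spherical, unit-variance Gaussian subgroups, takes Δ to be the Euclidean distance between subgroup centroids, uses a hand-written k-means with k=2, and counts a simulation as a "detection" when the mean silhouette coefficient is at least 0.5. The threshold, the defaults, and all function names are assumptions made for this example.

```python
import math
import random


def simulate_two_groups(n_per_group, delta, n_features, rng):
    """Two unit-variance Gaussian subgroups whose centroids are `delta` apart."""
    # Spread the separation evenly over all features (an assumption).
    shift = delta / math.sqrt(n_features)
    group_a = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
               for _ in range(n_per_group)]
    group_b = [[rng.gauss(shift, 1.0) for _ in range(n_features)]
               for _ in range(n_per_group)]
    return group_a + group_b


def kmeans2(points, rng, n_iter=50):
    """Plain Lloyd's k-means with k=2; returns one label (0 or 1) per point."""
    centers = rng.sample(points, 2)
    labels = None
    for _ in range(n_iter):
        new_labels = [
            0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient for a two-cluster labelling."""
    if len(set(labels)) < 2:
        return 0.0
    scores = []
    for i, p in enumerate(points):
        same = [math.dist(p, q) for j, q in enumerate(points)
                if j != i and labels[j] == labels[i]]
        other = [math.dist(p, q) for j, q in enumerate(points)
                 if labels[j] != labels[i]]
        if not same:  # singleton cluster scores 0 by convention
            scores.append(0.0)
            continue
        a = sum(same) / len(same)    # mean distance within own cluster
        b = sum(other) / len(other)  # mean distance to the other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def estimate_power(n_per_group, delta, n_features=2, n_sims=200,
                   threshold=0.5, seed=0):
    """Fraction of simulated datasets whose silhouette reaches `threshold`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        points = simulate_two_groups(n_per_group, delta, n_features, rng)
        labels = kmeans2(points, rng)
        if silhouette(points, labels) >= threshold:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for delta in (1.0, 2.0, 3.0, 4.0):
        power = estimate_power(20, delta, n_sims=50)
        print(f"delta={delta}: estimated power {power:.2f}")
```

Varying `n_per_group` while holding `delta` fixed, and the reverse, gives a quick feel for which of the two has more influence on the estimate in this simplified setting. The numbers it produces should not be read as a reproduction of the study's results.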

Main Results:

  • Clustering outcomes were driven by large effect sizes or by the accumulation of many smaller effects across features, and were largely unaffected by covariance structure.
  • Adequate statistical power was achieved with small samples (N=20 per subgroup), provided cluster separation was large (Δ=4).
  • Fuzzy clustering showed higher power and parsimony for identifying separable multivariate normal distributions, especially at moderate separation (Δ=3).
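
To make the idea of fuzzy clustering concrete, here is a minimal sketch of the standard fuzzy c-means update rules, in which every point receives a graded membership in each cluster instead of a single hard label. It is an illustration only, not the implementation used in the study. The fuzzifier value `m=2.0`, the random initialisation, the stopping tolerance, and the function name `fuzzy_c_means` are assumptions made for this example.

```python
import math
import random


def fuzzy_c_means(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships); each membership row sums to 1."""
    rng = random.Random(seed)
    # Start from randomly chosen data points (an assumption of this sketch).
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    exponent = 2.0 / (m - 1.0)
    n_features = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: closer centers receive higher membership.
        memberships = []
        for p in points:
            # Floor the distance to avoid division by zero at a center.
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            row = [
                1.0 / sum((d[k] / d[j]) ** exponent for j in range(n_clusters))
                for k in range(n_clusters)
            ]
            memberships.append(row)
        # Center update: mean of all points weighted by membership ** m.
        new_centers = []
        for k in range(n_clusters):
            weights = [row[k] ** m for row in memberships]
            total = sum(weights)
            new_centers.append([
                sum(w * p[f] for w, p in zip(weights, points)) / total
                for f in range(n_features)
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # centers have stopped moving
            break
    return centers, memberships


if __name__ == "__main__":
    rng = random.Random(42)
    # Two 2-D subgroups with centroids 3 units apart along the first axis.
    data = ([[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
            + [[rng.gauss(3.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)])
    centers, u = fuzzy_c_means(data)
    ambiguous = sum(1 for row in u if max(row) < 0.7)
    print("centers:", [[round(x, 2) for x in c] for c in centers])
    print("points with no membership above 0.7:", ambiguous)
```

Points that lie between two partially overlapping subgroups end up with intermediate memberships instead of being forced into one cluster, which is the property that makes fuzzy approaches attractive at moderate separation.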

Conclusions:

  • Statistical power in cluster analysis is critically dependent on effect size (subgroup separation), not solely sample size.
  • Power is satisfactory only for large effect sizes with popular dimensionality reduction and clustering algorithms.
  • Recommendations include applying cluster analysis for large expected subgroup separation, aiming for N=20-30 per subgroup, using multi-dimensional scaling, and considering fuzzy clustering or mixture models for partially overlapping distributions.