Statistical power for cluster analysis.

Edwin S Dalmaijer¹, Camilla L Nord², Duncan E Astle²

¹MRC Cognition and Brain Sciences Unit, University of Cambridge, 15 Chaucer Road, Cambridge, CB2 7EF, UK. edwin.dalmaijer@bristol.ac.uk.

BMC Bioinformatics

June 1, 2022

Summary

This summary is machine-generated.

Statistical power matters when cluster analysis is used to identify subgroups in biomedical data. Sufficient power can be reached with relatively small samples (N=20 per subgroup), provided the effect size is large (subgroup separation Δ=4). Fuzzy clustering offers advantages when subgroup distributions partially overlap.

Keywords:

Cluster analysis; Covariance; Dimensionality reduction; Effect size; Latent class analysis; Latent profile analysis; Sample size; Simulation; Statistical power

Area of Science:

Biomedical data analysis
Statistical modeling
Machine learning

Background:

Cluster algorithms are increasingly used in biomedical research for subgroup identification.
Established methods for computing a priori statistical power for cluster analysis are lacking.
This study simulates common analysis pipelines to estimate statistical power and classification accuracy.

Purpose of the Study:

To estimate statistical power and classification accuracy for common cluster analysis pipelines via simulation.
To systematically investigate the influence of subgroup size, number, separation (effect size), and covariance structure.
To compare the statistical power of discrete, fuzzy, and finite mixture modeling approaches.

Main Methods:

Generated datasets with varied subgroup characteristics (size, number, separation, covariance).
Applied dimensionality reduction techniques (none, multi-dimensional scaling, UMAP).
Utilized clustering algorithms (k-means, hierarchical clustering, HDBSCAN) and compared discrete, fuzzy, and mixture models.
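
The simulation approach listed above can be illustrated with a small Monte Carlo sketch. It is not the authors' pipeline: it uses only two subgroups, plain k-means, no dimensionality reduction, and the Python standard library. Two choices are assumptions made for illustration and are not stated in this summary: Δ is taken to be the Euclidean distance between subgroup centroids in units of the within-subgroup standard deviation, and a clustering counts as "detected" when its mean silhouette score reaches 0.5. All function names are hypothetical.

```python
import math
import random


def simulate_two_groups(n_per_group, delta, n_features=2, rng=None):
    """Draw two unit-variance Gaussian subgroups with centroids `delta` apart."""
    rng = rng or random.Random()
    # Spread the shift evenly over features so the centroid distance is delta.
    shift = delta / math.sqrt(n_features)
    return [
        [rng.gauss(group * shift, 1.0) for _ in range(n_features)]
        for group in (0, 1)
        for _ in range(n_per_group)
    ]


def kmeans(points, k, rng, n_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            others = [q for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_dist[c] = (
                sum(math.dist(p, q) for q in others) / len(others) if others else None
            )
        a = mean_dist[labels[i]]
        if a is None:
            continue
        b = min(d for c, d in mean_dist.items() if c != labels[i])
        total += (b - a) / (max(a, b) or 1.0)
    return total / len(points)


def estimate_power(n_per_group, delta, n_sims=200, threshold=0.5, seed=0):
    """Fraction of simulated datasets whose k-means solution passes `threshold`.

    The 0.5 silhouette threshold is an assumption of this sketch.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        pts = simulate_two_groups(n_per_group, delta, rng=rng)
        if silhouette_score(pts, kmeans(pts, 2, rng)) >= threshold:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    for delta in (1.0, 2.0, 3.0, 4.0):
        print(delta, estimate_power(20, delta, n_sims=100))
```

Running the script prints one estimated detection rate per separation value for N=20 per subgroup. Varying `n_per_group` while holding `delta` fixed gives a rough way to explore how much sample size matters relative to effect size under these assumptions.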

Main Results:

Clustering outcomes were driven by large effect sizes or by the accumulation of many smaller effects across features; differences in covariance structure had little impact.
Adequate statistical power was achieved with relatively small samples (N=20 per subgroup), provided cluster separation was large (Δ=4).
Fuzzy clustering offered a more parsimonious and more powerful alternative for identifying separable multivariate normal distributions, particularly at somewhat lower centroid separation (Δ=3).
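
To make the contrast between discrete and fuzzy assignment concrete, here is a minimal sketch of fuzzy c-means, one common fuzzy clustering algorithm. The summary says only "fuzzy clustering", so the choice of c-means, the fuzzifier `m=2`, and the function name are assumptions made for illustration. Instead of a single label, each point receives a membership weight for every cluster, and the weights sum to 1.

```python
import math
import random


def fuzzy_cmeans(points, k, m=2.0, n_iter=100, seed=0):
    """Basic fuzzy c-means; returns (centroids, memberships).

    memberships[i][j] is the degree to which point i belongs to cluster j.
    The fuzzifier m > 1 controls how soft the assignments are.
    """
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    n_features = len(points[0])
    memberships = []
    for _ in range(n_iter):
        # Membership update: weight is proportional to distance^(-2 / (m - 1)).
        memberships = []
        for p in points:
            # Small floor avoids division by zero if a point sits on a centroid.
            dists = [max(math.dist(p, c), 1e-12) for c in centroids]
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            memberships.append([w / total for w in inv])
        # Centroid update: mean of all points weighted by membership^m.
        for j in range(k):
            weights = [u[j] ** m for u in memberships]
            wsum = sum(weights)
            centroids[j] = [
                sum(w * p[f] for w, p in zip(weights, points)) / wsum
                for f in range(n_features)
            ]
    return centroids, memberships


if __name__ == "__main__":
    rng = random.Random(42)
    # Two unit-variance subgroups separated by 3 along the first feature.
    data = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
    data += [[rng.gauss(3.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(20)]
    cents, mem = fuzzy_cmeans(data, 2)
    print("centroids:", cents)
    print("first point memberships:", mem[0])
```

Points lying between the two centroids end up with memberships near 0.5 for each cluster rather than being forced into one group, which is why soft assignment can suit partially overlapping subgroups.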

Conclusions:

Statistical power in cluster analysis is critically dependent on effect size (subgroup separation), not solely sample size.
Power is satisfactory only for large effect sizes with popular dimensionality reduction and clustering algorithms.
Recommendations: apply cluster analysis only when large subgroup separation is expected; aim for N=20-30 per expected subgroup; use multi-dimensional scaling to improve cluster separation; and consider fuzzy clustering or mixture models for partially overlapping distributions.