Interpretation of the Standardized Mean Difference Effect Size When Distributions Are Not Normal or Homoscedastic.

Larry V Hedges¹

¹Northwestern University, Evanston, IL, USA.

Educational and Psychological Measurement

November 18, 2024

Summary

This summary is machine-generated.

The standardized mean difference (Cohen's d) is a common effect size measure. Its interpretation as distribution overlap is reliable only for normally distributed data with equal variances.

Keywords:

Cohen’s d; distribution overlap; effect size

Area of Science:

Statistics
Psychometrics
Data Analysis

Background:

The standardized mean difference (Cohen's d) is a prevalent effect size metric in experimental research.
It quantifies the difference between two group means relative to their variability.
Cohen's d is particularly intuitive for normally distributed data with equal variances.
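
As a minimal sketch of the definition above, the following computes Cohen's d as the difference in group means divided by the pooled standard deviation. The function name `cohens_d` and the two small samples are hypothetical and chosen only for illustration; they do not come from the study.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance returns the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical data: means 4 and 3, both sample variances equal to 4.
treatment = [2.0, 4.0, 6.0]
control = [1.0, 3.0, 5.0]
print(cohens_d(treatment, control))  # 0.5
```

Pooling the two variances presumes they estimate a common value, which is the equal-variance assumption the summary refers to.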

Purpose of the Study:

To examine the reliability of Cohen's d as a measure of distribution overlap.
To investigate the impact of non-normality and unequal variances on Cohen's d interpretation.
To assess the conditions under which Cohen's d interpretations remain valid.

Main Methods:

The study theoretically analyzes the relationship between Cohen's d and distribution overlap.
It considers scenarios with non-normally distributed data.
It evaluates data with substantially unequal standard deviations.

Main Results:

The mathematical relationship between Cohen's d and distribution overlap is straightforward for normal distributions with equal variances.
Deviations from normality or from equality of variances substantially alter the relationship between Cohen's d and distribution overlap.
Standard interpretations of Cohen's d become unreliable under these conditions.
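
The results above can be illustrated with a short sketch using one common overlap index, the overlapping coefficient (the shared area under two density curves). The summary does not state which overlap measure the paper uses, so this choice is an assumption. For two normal distributions with equal variances the coefficient equals 2Φ(−|d|/2), where Φ is the standard normal CDF. The value d = 0.5 and the unequal standard deviations below are hypothetical, picked so that the pooled variance (with equal group sizes) stays at 1 and d is therefore the same in both scenarios.

```python
import math
from statistics import NormalDist

d = 0.5  # hypothetical standardized mean difference

# Equal variances: both groups have SD 1 and the means differ by d.
equal = NormalDist(0.0, 1.0).overlap(NormalDist(d, 1.0))
closed_form = 2 * NormalDist().cdf(-abs(d) / 2)

# Unequal variances (0.2 and 1.8) whose equal-n pooled variance is still 1,
# so Cohen's d is unchanged at 0.5.
narrow = NormalDist(0.0, math.sqrt(0.2))
wide = NormalDist(d, math.sqrt(1.8))
unequal = narrow.overlap(wide)

print(f"equal variances:   overlap = {equal:.3f} (closed form {closed_form:.3f})")
print(f"unequal variances: overlap = {unequal:.3f}")
```

Both scenarios share the same d, yet the overlap differs, which is why reading d as an overlap index depends on the equal-variance normal model. `NormalDist.overlap` covers only normal distributions; a non-normal case would need numerical integration of the pointwise minimum of the two densities.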

Conclusions:

The interpretation of Cohen's d as an index of distribution overlap is contingent upon data meeting specific assumptions of normality and equal variances.
Researchers must exercise caution when interpreting Cohen's d in the presence of non-normal data or unequal variances.
Alternative effect size measures or interpretive frameworks may be necessary when standard assumptions are violated.