Controlling the False Discovery Rate in DIF Detection With e-Values: Evidence From Multidimensional and Testlet

Shan Huang¹, David Goretzko²

¹Yew Chung Yew Wah Education Network, Hong Kong SAR, China.

Educational and Psychological Measurement

April 20, 2026

Summary

This summary is machine-generated.

This study introduces e-value-based false discovery rate (FDR) control for Differential Item Functioning (DIF) detection. The e-BH procedure, the e-value analogue of the Benjamini-Hochberg (BH) procedure, offers more stable error control than traditional p-value methods, especially under model misspecification and with large sample sizes.

Keywords:

differential item functioning; e-values; false discovery rate; model misspecification; multiple testing

Area of Science:

Psychometrics
Statistical modeling
Educational measurement

Background:

Traditional p-value methods for Differential Item Functioning (DIF) detection face limitations when statistical assumptions are violated.
Issues like multidimensionality, local item dependence, and extreme sample sizes can compromise the accuracy of p-value-based approaches.

Purpose of the Study:

To apply e-value-based false discovery rate (FDR) control for DIF detection, offering an alternative to p-value methods.
To evaluate the performance of e-BH procedures against classical methods under various model misspecification scenarios.
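
As an illustration of e-value-based FDR control, the following is a minimal sketch of the textbook e-BH rule: with m e-values sorted from largest to smallest, find the largest k for which k times the k-th largest e-value is at least m / alpha, and reject the k hypotheses with the largest e-values. The function name `e_bh` and the example e-values are hypothetical and are not taken from the study; the study's own implementation may differ in detail.

```python
from typing import List, Sequence


def e_bh(e_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Return the indices of hypotheses rejected by the e-BH rule."""
    m = len(e_values)
    if m == 0:
        return []
    # Order hypotheses from the largest to the smallest e-value.
    order = sorted(range(m), key=lambda i: e_values[i], reverse=True)
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Condition for rank k: k * e_(k) >= m / alpha.
        if k * e_values[idx] >= m / alpha:
            k_star = k
    # Reject the k_star hypotheses with the largest e-values.
    return sorted(order[:k_star])


# Hypothetical e-values for five items (illustrative numbers only).
example = [150.0, 0.8, 60.0, 2.5, 30.0]
print(e_bh(example, alpha=0.05))  # prints [0, 2]
```

In this example m / alpha is 100, so the two largest e-values pass (150 and 2 x 60 = 120) while the third does not (3 x 30 = 90).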

Main Methods:

Conducted two simulation studies evaluating K-fold and Multisplit likelihood-ratio e-values with e-BH procedures.
Simulated conditions included multidimensional contamination and testlet-based local dependence.
Applied e-BH to Progress in International Reading Literacy Study (PIRLS) data for empirical validation.
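
To make the idea of a K-fold likelihood-ratio e-value concrete, here is a toy sketch under strong assumptions: it tests H0: mu = 0 for normally distributed data with known unit variance, not a DIF model. For each fold, the alternative is fitted on the remaining folds and the likelihood ratio is evaluated on the held-out fold; the K ratios are then averaged, which again gives an e-value. The function names and the simulated data are hypothetical, and the study's K-fold and Multisplit constructions for item response models may differ.

```python
import math
import random
from typing import List, Sequence


def normal_loglik(xs: Sequence[float], mu: float) -> float:
    """Log-likelihood of N(mu, 1) for the observations xs."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (x - mu) ** 2 for x in xs)


def kfold_lr_e_value(xs: Sequence[float], k: int = 5, mu0: float = 0.0) -> float:
    """Average of K split likelihood-ratio e-values for H0: mu = mu0."""
    if k < 2 or len(xs) < k:
        raise ValueError("need k >= 2 and at least k observations")
    # Assign observations to folds in a round-robin fashion.
    folds = [list(xs[i::k]) for i in range(k)]
    e_parts: List[float] = []
    for j, held_out in enumerate(folds):
        train = [x for i, fold in enumerate(folds) if i != j for x in fold]
        # Fit the alternative on the other folds only.
        mu_hat = sum(train) / len(train)
        # Likelihood ratio on the held-out fold (may overflow for very large ratios).
        log_e = normal_loglik(held_out, mu_hat) - normal_loglik(held_out, mu0)
        e_parts.append(math.exp(log_e))
    # The mean of e-values is again an e-value.
    return sum(e_parts) / k


random.seed(0)
null_sample = [random.gauss(0.0, 1.0) for _ in range(100)]     # H0 holds
shifted_sample = [random.gauss(0.5, 1.0) for _ in range(100)]  # H0 is false
print(kfold_lr_e_value(null_sample))
print(kfold_lr_e_value(shifted_sample))
```

Because the alternative is estimated on data that are independent of the held-out fold, each ratio has expectation 1 under H0 in this toy setting. Large values are evidence against H0, and the resulting e-values could be passed to an e-BH step.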

Main Results:

E-BH demonstrated superior and more stable control of Type I error, FDR, and family-wise error rate (FWER) compared to Benjamini-Hochberg (BH) and Holm procedures.
E-BH maintained lower false-positive rates even under severe model misspecification, with competitive Type II error rates.
Classical p-value methods showed increased Type I error with larger sample sizes, while e-BH maintained stable control due to model-agnostic calibration.
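
For reference, the two classical p-value procedures used as comparators can be sketched in their textbook forms: the BH step-up rule for FDR control and the Holm step-down rule for FWER control. The function names and the example p-values are hypothetical and are not results from the study.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Indices rejected by the BH step-up rule (FDR control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_star = 0
    for k, idx in enumerate(order, start=1):
        # Keep the largest rank k with p_(k) <= alpha * k / m.
        if p_values[idx] <= alpha * k / m:
            k_star = k
    return sorted(order[:k_star])


def holm(p_values: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Indices rejected by the Holm step-down rule (FWER control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected: List[int] = []
    for k, idx in enumerate(order, start=1):
        # Stop at the first p-value above alpha / (m - k + 1).
        if p_values[idx] > alpha / (m - k + 1):
            break
        rejected.append(idx)
    return sorted(rejected)


# Hypothetical p-values for five items (illustrative numbers only).
p = [0.001, 0.012, 0.021, 0.039, 0.30]
print(benjamini_hochberg(p))  # prints [0, 1, 2, 3]
print(holm(p))                # prints [0, 1]
```

On the same p-values Holm rejects fewer hypotheses than BH, as expected: it controls the stricter family-wise error rate.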

Conclusions:

E-value-based FDR control (e-BH) is a robust and effective tool for DIF detection in complex assessment contexts.
E-BH provides more reliable and sustainable DIF flagging than traditional p-value approaches, particularly when model assumptions are violated.
The model-agnostic calibration of e-BH ensures stable error control across varying sample sizes and model misspecifications.