Testing for Differential Item Functioning Under the D-Scoring Method.

Dimiter M Dimitrov¹,², Dimitar V Atanasov³

¹George Mason University, Fairfax, VA, USA.

Educational and Psychological Measurement | January 7, 2022

Summary

This summary is machine-generated.

The P-Z method tests for differential item functioning (DIF) under the D-scoring method (DSM). It reduces DIF testing to a comparison of Z-scale normal deviates between groups, and it applies to both item-level and test-level functioning.

Keywords:

D-scoring method; differential item functioning; test bias

Area of Science:

Psychometrics
Educational Measurement
Statistical Analysis

Background:

Testing for differential item functioning (DIF) is crucial for unbiased measurement.
Existing methods may lack efficiency or applicability in certain frameworks.
The D-scoring method (DSM) offers a novel measurement framework.

Purpose of the Study:

To introduce and evaluate the P-Z method for DIF testing within the DSM framework.
To assess the statistical efficiency and power of the P-Z method.
To extend the P-Z method's applicability to differential test functioning.

Main Methods:

The P-Z method transforms item response probabilities (estimated via DSM) into Z-scale normal deviates.
DIF testing is reduced to comparing variances and means of these Z-deviates between reference and focal groups.
A simulation study was conducted to evaluate method performance.
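
The two steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not give the exact transformation or test statistics, so the sketch assumes that "Z-scale normal deviates" are obtained with the inverse standard normal CDF (a probit transform), and it uses a plain variance ratio and a Welch-type t statistic as stand-ins for the variance and mean comparisons. The function names `to_z_deviates` and `compare_groups`, the clipping constant `eps`, and the example probabilities are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def to_z_deviates(probabilities, eps=1e-6):
    """Map response probabilities to standard-normal deviates.

    Assumption: the Z-scale deviate is the inverse normal CDF of the
    probability. Values are clipped to (eps, 1 - eps) because the inverse
    CDF is undefined at exactly 0 and 1.
    """
    return [_STD_NORMAL.inv_cdf(min(max(p, eps), 1 - eps)) for p in probabilities]


def compare_groups(p_reference, p_focal):
    """Compare variances and means of Z-deviates between two groups.

    Returns a variance ratio (larger over smaller) and a Welch-type t
    statistic. These are illustrative stand-ins, not the published tests.
    Both groups need at least two distinct values so the variances are > 0.
    """
    z_ref = to_z_deviates(p_reference)
    z_foc = to_z_deviates(p_focal)
    var_ref, var_foc = variance(z_ref), variance(z_foc)
    f_ratio = max(var_ref, var_foc) / min(var_ref, var_foc)
    std_error = sqrt(var_ref / len(z_ref) + var_foc / len(z_foc))
    t_welch = (mean(z_ref) - mean(z_foc)) / std_error
    return {"f_ratio": f_ratio, "t_welch": t_welch}


if __name__ == "__main__":
    # Made-up probabilities of a correct response, for illustration only.
    reference = [0.62, 0.71, 0.55, 0.80, 0.67]
    focal = [0.48, 0.59, 0.41, 0.66, 0.52]
    print(compare_groups(reference, focal))
```

In this sketch, a variance ratio near 1 and a t statistic near 0 would indicate that the two groups' Z-deviates look alike for the item; how the published method sets its decision thresholds is not stated in this summary.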

Main Results:

The P-Z method demonstrated high efficiency, characterized by low Type I error rates and high statistical power.
The simulation results support the effectiveness of the P-Z method.
The P-Z method is directly applicable for detecting differential test functioning.
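
For readers unfamiliar with how Type I error rates and power are estimated by simulation, the following is a generic Monte Carlo sketch. It does not reproduce the study's design: the normal data generator, the group size, the number of replications, the Welch-type statistic, and the large-sample critical value of 1.96 are all assumptions chosen for illustration, and `rejection_rate` is a hypothetical helper.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch-type t statistic for the difference in means of two samples."""
    std_error = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / std_error


def rejection_rate(shift, n_per_group=200, n_reps=1000, critical=1.96, seed=0):
    """Proportion of simulated data sets in which the test rejects.

    With shift == 0 the groups have the same mean, so the proportion
    estimates the Type I error rate. With shift != 0 it estimates power.
    Assumption: Z-deviates in both groups are normal with unit variance,
    and 1.96 approximates the two-sided 5% critical value.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_reps):
        reference = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        focal = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        if abs(welch_t(reference, focal)) > critical:
            rejections += 1
    return rejections / n_reps


if __name__ == "__main__":
    print("Estimated Type I error rate:", rejection_rate(shift=0.0))
    print("Estimated power at shift 0.5:", rejection_rate(shift=0.5))
```

A well-behaved test should reject in roughly 5% of the no-difference replications and in a large share of the replications where a true difference is present.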

Conclusions:

The P-Z method provides an efficient and robust approach to DIF testing within the DSM framework.
The method is suitable for both item-level and test-level DIF analysis.
Recommendations for practical implementation and future research, including Item Response Theory (IRT) applications, are discussed.