
Related Concept Videos

Comparing Experimental Results: Student's t-Test (01:09)

The t-test is a statistical method used to compare a sample mean with a population mean or to compare the means of two data sets. The test statistic is calculated from the standard deviation, mean, and number of measurements in the data set and then compared with a table of critical values at the selected confidence level. If the test statistic is smaller than the critical value, the null hypothesis is retained. In this case, we state that the difference between the...
Identifying Statistically Significant Differences: The F-Test (01:14)

The F-test is used to compare two sample variances with each other or to compare a sample variance with the population variance. It is used to decide whether indeterminate error can explain the difference in their values. The F-test assumes that the data set or sets are normally distributed and that the data sets are independent of each other. The test statistic F is calculated by dividing one variance by another. In other words, the square of one standard...
The Anderson-Darling Test (01:16)

The Anderson-Darling test is a statistical method used to determine whether a data sample is likely drawn from a specific theoretical distribution. Unlike parametric tests, it does not require assumptions about specific parameters of the distribution. Instead, it compares the sample's empirical cumulative distribution function (ECDF) with the cumulative distribution function (CDF) of the hypothesized distribution. Critical values for the test are specific to the chosen distribution rather...
Test for Homogeneity (01:23)

The goodness-of-fit test can be used to decide whether a population fits a given distribution, but it will not suffice to decide whether two populations follow the same unknown distribution. A different test, called the test for homogeneity, can be used to conclude whether two populations have the same distribution. To calculate the test statistic for a test for homogeneity, follow the same procedure as with the test of independence. The hypotheses for the test for homogeneity can...
Testing a Claim about Standard Deviation (01:19)

A complete procedure to test a claim about a population standard deviation or population variance is explained here.
Hypothesis testing for a claim about a population standard deviation (or variance) requires the data and samples to be random and unbiased. The population distribution must also be normal. There is no specific requirement on the sample size, as the estimation is based on the chi-square distribution.
As a first step, the hypothesis (null and alternative) concerning the claim about...
Detection of Gross Error: The Q Test (01:00)

When one or more data points appear far from the rest of the data, there is a need to determine whether they are outliers and whether they should be eliminated from the data set to ensure an accurate representation of the measured value. In many cases, outliers arise from gross errors (or human errors) and do not accurately reflect the underlying phenomenon. In some cases, however, these apparent outliers reflect true phenomenological differences. In these cases, we can use statistical methods...

You might also read

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Conditional Reliability of Weighted Test Scores on a Bounded D-Scale.

Educational and psychological measurement·2025
Same author

The Dominant Trait Profile Method of Scoring Multidimensional Forced-Choice Questionnaires.

Educational and psychological measurement·2025
Same author

Latent D-Scoring Modeling: Estimation of Item and Person Parameters.

Educational and psychological measurement·2023
Same author

The Response Vector for Mastery Method of Standard Setting.

Educational and psychological measurement·2022
Same author

Modeling of Item Response Functions Under the D-Scoring Method.

Educational and psychological measurement·2020
Same journal

A Simple Approach for Differential Test Functioning Based on Sum Scores.

Educational and psychological measurement·2026
Same journal

Evaluating Factor Retention in Large Factor Analysis Models: A Simulation Study Comparing 15 Methods.

Educational and psychological measurement·2026
Same journal

Agreement and Alignment in Binary Rating Tasks: Strategic Convergence as an Equilibrium Outcome.

Educational and psychological measurement·2026
Same journal

Interactions Between Termination Criteria and Ability Estimators in Computerized Adaptive Testing.

Educational and psychological measurement·2026
Same journal

Identification and Diagnosis of Misreporting in Surveys.

Educational and psychological measurement·2026
Same journal

The Aggregated Latent Profile Index: Measuring Person Profile Differentiation Within a Bootstrap-Validated Latent Profile Space.

Educational and psychological measurement·2026

Related Experiment Video

Updated: Oct 7, 2025

Author Spotlight: Validation of SICOLE-R for Assessing Cognitive and Reading Skills in Spanish-Speaking Children and Its Role in Personalized Education (09:00)

Published on: August 16, 2024

948

Testing for Differential Item Functioning Under the D-Scoring Method.

Dimiter M Dimitrov (1,2), Dimitar V Atanasov (3)

  • (1) George Mason University, Fairfax, VA, USA.

Educational and Psychological Measurement, January 7, 2022
Summary
This summary is machine-generated.

The P-Z method is an efficient test for differential item functioning (DIF) under the D-scoring method (DSM). It reduces DIF testing to a comparison of Z-scale normal deviates and is effective for detecting both differential item functioning and differential test functioning.

Keywords:
D-scoring method; differential item functioning; test bias

More Related Videos

Computerized Adaptive Testing System of Functional Assessment of Stroke (05:21)

Published on: January 7, 2019

6.0K
Use of a Video Scoring Anchor for Rapid Serial Assessment of Social Communication in Toddlers (09:16)

Published on: March 14, 2018

10.4K


Area of Science:

  • Psychometrics
  • Educational Measurement
  • Statistical Analysis

Background:

  • Testing for differential item functioning (DIF) is crucial for unbiased measurement.
  • Existing methods may lack efficiency or applicability in certain frameworks.
  • The D-scoring method (DSM) offers a novel measurement framework.

Purpose of the Study:

  • To introduce and evaluate the P-Z method for DIF testing within the DSM framework.
  • To assess the statistical efficiency and power of the P-Z method.
  • To extend the P-Z method's applicability to differential test functioning.

Main Methods:

  • The P-Z method transforms item response probabilities (estimated via DSM) into Z-scale normal deviates.
  • DIF testing is reduced to comparing variances and means of these Z-deviates between reference and focal groups.
  • A simulation study was conducted to evaluate method performance.
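
The transformation and comparison described above can be sketched as follows. This is a minimal illustration and not the authors' procedure: the summary does not say how the DSM probabilities are estimated or which test statistics the P-Z method uses. The function names (`to_z_deviates`, `compare_groups`), the clipping bound `eps`, the choice of a variance ratio plus a Welch-type mean statistic, and the example probabilities are all assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

_STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def to_z_deviates(probs, eps=1e-6):
    """Map probabilities to standard-normal deviates (inverse normal CDF).

    Values are clipped to [eps, 1 - eps] because the inverse CDF is
    undefined at exactly 0 and 1. The bound eps is an assumed choice.
    """
    return [_STD_NORMAL.inv_cdf(min(max(p, eps), 1 - eps)) for p in probs]


def compare_groups(ref_probs, focal_probs):
    """Compare variances and means of Z-deviates for two groups.

    Returns a variance ratio (larger over smaller) and a Welch-type
    standardized mean difference (reference minus focal). Both statistics
    are generic stand-ins, not the statistics defined in the article.
    Requires at least two values per group and non-zero variances.
    """
    z_ref = to_z_deviates(ref_probs)
    z_foc = to_z_deviates(focal_probs)
    var_ref, var_foc = variance(z_ref), variance(z_foc)
    variance_ratio = max(var_ref, var_foc) / min(var_ref, var_foc)
    std_err = sqrt(var_ref / len(z_ref) + var_foc / len(z_foc))
    mean_stat = (mean(z_ref) - mean(z_foc)) / std_err
    return {"variance_ratio": variance_ratio, "mean_stat": mean_stat}


# Hypothetical item response probabilities for the two groups.
ref_probs = [0.20, 0.35, 0.50, 0.65, 0.80]
focal_probs = [0.15, 0.30, 0.45, 0.60, 0.75]
result = compare_groups(ref_probs, focal_probs)

if __name__ == "__main__":
    print(result)
```

In practice the two statistics would be referred to appropriate critical values; which reference distributions apply depends on details of the method that the summary does not give.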

Main Results:

  • The P-Z method demonstrated high efficiency, characterized by low Type I error rates and high statistical power.
  • The simulation results support the effectiveness of the P-Z method.
  • The P-Z method is directly applicable for detecting differential test functioning.
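
How a simulation study arrives at a Type I error rate and a power estimate can be illustrated with a toy Monte Carlo. This is not a reproduction of the authors' simulation: it uses a generic two-sample test on means of normal deviates, and the function name `rejection_rate`, the sample size, the shift of 0.5, the number of replications, and the significance level are all assumed values.

```python
import random
from math import sqrt
from statistics import NormalDist


def _mean_and_var(values):
    """Return the sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    v = sum((x - m) ** 2 for x in values) / (n - 1)
    return m, v


def rejection_rate(shift, n=100, reps=1000, alpha=0.05, seed=0):
    """Estimate how often a two-sided test on means rejects.

    shift = 0 gives an empirical Type I error rate (no true difference);
    shift != 0 gives empirical power against that mean difference.
    All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        ref = [rng.gauss(0.0, 1.0) for _ in range(n)]
        focal = [rng.gauss(shift, 1.0) for _ in range(n)]
        m_ref, v_ref = _mean_and_var(ref)
        m_foc, v_foc = _mean_and_var(focal)
        stat = abs(m_ref - m_foc) / sqrt(v_ref / n + v_foc / n)
        if stat > critical:
            rejections += 1
    return rejections / reps


type_one_error = rejection_rate(shift=0.0)  # should be near alpha
power = rejection_rate(shift=0.5)           # should be well above alpha

if __name__ == "__main__":
    print(type_one_error, power)
```

A well-behaved test keeps the first rate close to the nominal significance level while the second rate approaches 1 as the true difference or the sample size grows.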

Conclusions:

  • The P-Z method provides an efficient and robust approach to DIF testing within the DSM framework.
  • The method is suitable for both item-level analysis (DIF) and test-level analysis (differential test functioning).
  • Recommendations for practical implementation and future research, including Item Response Theory (IRT) applications, are discussed.