Detecting uniform differential item functioning for continuous response computerized adaptive testing.

Chun Wang¹, Ruoyi Zhu¹

  • ¹University of Washington, WA, USA.

Applied Psychological Measurement | February 8, 2024
Summary
This summary is machine-generated.

We developed two methods to detect differential item functioning (DIF) in computerized adaptive testing (CAT) with continuous responses and sparse data. Both methods effectively identified uniform DIF, ensuring fair measurement in advanced testing scenarios.

Keywords:
SIBTEST; computerized adaptive test; continuous response; differential item functioning

Area of Science:

  • Psychometrics
  • Educational Measurement
  • Computerized Adaptive Testing (CAT)

Background:

  • Ensuring measurement fairness requires evaluating items for differential item functioning (DIF).
  • Continuous response items offer more information than dichotomous items, particularly in performance-based tasks.
  • Severe data sparsity is common in computerized adaptive testing (CAT) when items are machine-generated.

Purpose of the Study:

  • To propose and evaluate two novel methods for detecting uniform DIF in CAT with continuous responses and severely sparse data.
  • To assess the effectiveness of these methods in identifying DIF under challenging data conditions.

Main Methods:

  • A modified non-parametric CAT-SIBTEST method, independent of item response theory (IRT) model assumptions.
  • A parametric, model-based regularization method.
  • Simulation studies to evaluate the performance of both methods.
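
To make the non-parametric idea above more concrete, here is a minimal sketch of a SIBTEST-style statistic for one continuous-response item. It is not the authors' modified CAT-SIBTEST: it assumes examinees are matched on an ability estimate (`theta_hat`), stratifies them into quantile bins, and takes a weighted average of the within-stratum differences in mean item score between the reference and focal groups. The regression correction used by SIBTEST-type procedures is omitted, and the function name, parameters, and defaults are hypothetical.

```python
import numpy as np

def sibtest_style_beta(theta_hat, response, is_focal, n_strata=10, min_per_group=2):
    """Simplified SIBTEST-style uniform DIF statistic for one continuous item.

    Assumptions (illustrative only, not the published method):
    - theta_hat is the matching variable (e.g., a CAT ability estimate);
    - strata are quantile bins of theta_hat;
    - no regression correction is applied.
    Returns (beta, standard error, z statistic).
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    response = np.asarray(response, dtype=float)
    is_focal = np.asarray(is_focal, dtype=bool)

    # Assign each examinee to a quantile-based stratum index in 0..n_strata-1.
    edges = np.quantile(theta_hat, np.linspace(0.0, 1.0, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, theta_hat, side="right") - 1,
                     0, n_strata - 1)

    diffs, variances, weights = [], [], []
    for k in range(n_strata):
        ref = response[(strata == k) & ~is_focal]
        foc = response[(strata == k) & is_focal]
        # Skip strata that are too sparse to estimate a mean and a variance.
        if len(ref) < min_per_group or len(foc) < min_per_group:
            continue
        diffs.append(ref.mean() - foc.mean())
        variances.append(ref.var(ddof=1) / len(ref) + foc.var(ddof=1) / len(foc))
        weights.append(len(ref) + len(foc))

    if not diffs:
        raise ValueError("No stratum has enough examinees in both groups.")

    w = np.array(weights, dtype=float)
    w /= w.sum()  # weights proportional to stratum size
    beta = float(np.sum(w * np.array(diffs)))
    se = float(np.sqrt(np.sum(w ** 2 * np.array(variances))))
    z = beta / se if se > 0 else float("nan")
    return beta, se, z

# Usage on simulated data: the focal group scores 0.2 lower at equal ability.
rng = np.random.default_rng(0)
n = 4000
theta = rng.normal(size=n)
focal = rng.random(n) < 0.5
score = 0.5 + 0.1 * theta + rng.normal(scale=0.1, size=n) - 0.2 * focal
beta, se, z = sibtest_style_beta(theta, score, focal)
print(f"beta={beta:.3f}, se={se:.4f}, z={z:.1f}")
```

A positive `beta` means the item favors the reference group after matching on ability. Under the null hypothesis of no DIF, `z` is treated as approximately standard normal. With severely sparse data many strata are skipped, which is one reason a dedicated modification is needed in practice.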

Main Results:

  • Both proposed methods accurately identified items exhibiting uniform DIF.
  • The simulation studies confirmed the robustness of the developed techniques in the specified CAT scenario.

Conclusions:

  • The modified CAT-SIBTEST method and the regularization method are suitable for detecting uniform DIF in CAT with continuous responses and severely sparse data.
  • These methods contribute to ensuring measurement fairness in adaptive testing environments where response data are sparse.
  • A real data analysis is presented to illustrate practical application and potential limitations.