Updated: Sep 6, 2025

Polytomous Testlet Response Models for Technology-Enhanced Innovative Items: Implications on Model Fit and Trait

Hyeon-Ah Kang¹, Suhwa Han¹, Doyoung Kim²

¹University of Texas at Austin, Austin, TX, USA.

Educational and Psychological Measurement | June 27, 2022

Summary

This summary is machine-generated.

This study compares four measurement models for polytomous items administered in testlets. The generalized partial credit model (GPCM) and the fixed-effect testlet model (FTM) perform best when testlet effects are absent or fixed.

Keywords:

innovative items; item response theory; polytomous items; technology-enhanced assessment; testlet

Area of Science:

Psychometrics
Educational Measurement
Statistical Modeling

Background:

Innovative items in technology-enhanced assessments often feature polytomous scoring within testlets.
Existing measurement models may not adequately capture the complexities of these polytomous testlet items.
Practical models are needed to accurately describe and score such items.

Purpose of the Study:

To evaluate and compare the performance of four measurement models for polytomous items administered in testlets.
To identify the most suitable model based on criteria such as model fit, parameter recovery, and classification accuracy.
To provide guidelines for selecting appropriate measurement models for polytomous testlet items.

Main Methods:

Comparison of four models: generalized partial credit model (GPCM), testlet-as-a-polytomous-item model (TPIM), random-effect testlet model (RTM), and fixed-effect testlet model (FTM).
Empirical evaluation using data simulated under GPCM, FTM, and RTM conditions.
Assessment of models based on relative and absolute fit, testlet effect significance, parameter recovery, and classification accuracy.
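As a rough illustration of the simplest model in this comparison, the sketch below computes GPCM category probabilities for a single polytomous item. The function name, argument names, and the optional `gamma` term are assumptions made for illustration. `gamma` stands in for a person-specific testlet effect of the kind used in random-effect testlet models, and its sign convention is chosen arbitrarily here. This summary does not give the exact parameterization of the RTM or FTM, so this should not be read as the authors' specification.

```python
import math
from typing import List, Sequence


def gpcm_probs(theta: float, a: float, steps: Sequence[float],
               gamma: float = 0.0) -> List[float]:
    """Category probabilities for one item under the GPCM.

    theta : latent trait of the examinee
    a     : item discrimination
    steps : step parameters b_1..b_m (the item has m + 1 categories, 0..m)
    gamma : hypothetical person-by-testlet effect; 0.0 gives the plain GPCM
    """
    # Assumed convention: the testlet effect shifts the trait downward.
    eta = theta - gamma

    # Cumulative sums of a * (eta - b_v); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for b in steps:
        cumulative.append(cumulative[-1] + a * (eta - b))

    # Subtract the maximum before exponentiating for numerical stability.
    top = max(cumulative)
    weights = [math.exp(c - top) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]
```

With `gamma` left at zero, every item depends on the trait alone, which is the no-testlet-effect condition. A nonzero `gamma` shared by all items in a testlet induces the local dependence that testlet models are meant to capture.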

Main Results:

Model performance varied significantly based on testlet effect type, size, and trait estimator.
GPCM and FTM yielded optimal results when testlets had no or fixed effects.
RTM showed the best model fit for random interaction effects, but its performance varied with the trait estimation method.
The effectiveness of RTM was most apparent with strong random effects and Bayesian estimation.
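Parameter recovery and classification accuracy, two of the criteria named above, are commonly summarized with bias, root mean squared error, and the share of examinees placed on the same side of a cut score by their true and estimated traits. The sketch below shows one generic way to compute them. The function names and the single cut-score setup are assumptions for illustration, not the study's actual procedure.

```python
import math
from typing import Sequence


def bias(true_vals: Sequence[float], est_vals: Sequence[float]) -> float:
    """Mean signed error of the estimates (estimate minus true value)."""
    return sum(e - t for t, e in zip(true_vals, est_vals)) / len(true_vals)


def rmse(true_vals: Sequence[float], est_vals: Sequence[float]) -> float:
    """Root mean squared error of the estimates."""
    mse = sum((e - t) ** 2 for t, e in zip(true_vals, est_vals)) / len(true_vals)
    return math.sqrt(mse)


def classification_accuracy(true_theta: Sequence[float],
                            est_theta: Sequence[float],
                            cut: float) -> float:
    """Proportion of examinees whose true and estimated traits fall on the
    same side of a (hypothetical) pass/fail cut score."""
    agree = sum((t >= cut) == (e >= cut) for t, e in zip(true_theta, est_theta))
    return agree / len(true_theta)
```

In a simulation study the true values are known by construction, so these functions can be applied to each fitted model's estimates and compared across conditions.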

Conclusions:

The choice of measurement model for polytomous testlet items depends critically on the nature and magnitude of testlet effects.
Simpler models like GPCM and FTM are often sufficient and perform comparably or better in many scenarios.
Polytomous scoring of testlet items has limited practical utility.
Guidelines are provided for selecting the most appropriate measurement model for polytomous innovative items in testlet-based assessments.