Polytomous Testlet Response Models for Technology-Enhanced Innovative Items: Implications on Model Fit and Trait

Hyeon-Ah Kang¹, Suhwa Han¹, Doyoung Kim²

  • ¹University of Texas at Austin, Austin, TX, USA.

Educational and Psychological Measurement | June 27, 2022
Summary
This summary is machine-generated.

This study compares four measurement models for polytomous testlet items. The generalized partial credit model (GPCM) and the fixed-effect testlet model (FTM) perform best when testlet effects are absent or fixed.

Keywords:
innovative items; item response theory; polytomous items; technology-enhanced assessment; testlet

Area of Science:

  • Psychometrics
  • Educational Measurement
  • Statistical Modeling

Background:

  • Innovative items in technology-enhanced assessments often feature polytomous scoring within testlets.
  • Existing measurement models may not adequately capture the complexities of these polytomous testlet items.
  • Practical models are needed to accurately describe and score such items.

Purpose of the Study:

  • To evaluate and compare the performance of four measurement models for polytomous items administered in testlets.
  • To identify the most suitable model based on criteria such as model fit, parameter recovery, and classification accuracy.
  • To provide guidelines for selecting appropriate measurement models for polytomous testlet items.

Main Methods:

  • Comparison of four models: generalized partial credit model (GPCM), testlet-as-a-polytomous-item model (TPIM), random-effect testlet model (RTM), and fixed-effect testlet model (FTM).
  • Empirical evaluation using data simulated under GPCM, FTM, and RTM conditions.
  • Assessment of models based on relative and absolute fit, testlet effect significance, parameter recovery, and classification accuracy.
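
As a rough illustration of the simplest of the four models, the sketch below computes GPCM category probabilities for one polytomous item. The function name, the parameter names, and the optional `testlet_effect` shift are assumptions made for illustration; the summary does not give the parameterization used in the study, and adding a testlet effect to the trait is only one common way such models are written.

```python
import math


def gpcm_probs(theta, a, steps, testlet_effect=0.0):
    """Category probabilities for scores 0..len(steps) under a GPCM.

    theta          : latent trait value
    a              : item discrimination
    steps          : step (threshold) parameters b_1..b_m
    testlet_effect : hypothetical additive shift on theta; 0.0 gives the plain GPCM
    """
    eta = theta + testlet_effect
    # Cumulative logits: 0 for category 0, then running sums of a * (eta - b_v).
    logits = [0.0]
    for b in steps:
        logits.append(logits[-1] + a * (eta - b))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    expd = [math.exp(z - peak) for z in logits]
    total = sum(expd)
    return [e / total for e in expd]


if __name__ == "__main__":
    # A three-category item answered by a person at theta = 0.
    print(gpcm_probs(theta=0.0, a=1.0, steps=[-1.0, 1.0]))
```

Setting `testlet_effect` to a nonzero value shows how a shared testlet component would shift every category probability for items in the same testlet.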

Main Results:

  • Model performance varied significantly based on testlet effect type, size, and trait estimator.
  • GPCM and FTM yielded optimal results when testlets had no or fixed effects.
  • RTM showed the best model fit under random interaction effects, but its performance varied with the trait estimation method.
  • The effectiveness of RTM was most apparent with strong random effects and Bayesian estimation.
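
To make the role of the trait estimator concrete, the sketch below computes a Bayesian expected a posteriori (EAP) trait estimate on a grid under a plain GPCM with a standard normal prior. The summary does not say which Bayesian estimator the study used, so EAP, the grid settings, and all function and parameter names here are assumptions for illustration only.

```python
import math


def gpcm_probs(theta, a, steps):
    """GPCM category probabilities for scores 0..len(steps)."""
    logits = [0.0]
    for b in steps:
        logits.append(logits[-1] + a * (theta - b))
    peak = max(logits)
    expd = [math.exp(z - peak) for z in logits]
    total = sum(expd)
    return [e / total for e in expd]


def eap_estimate(responses, items, n_points=61, lo=-4.0, hi=4.0):
    """Posterior mean of theta on an evenly spaced grid.

    responses : observed category scores, one per item
    items     : list of (a, steps) tuples, one per item
    The prior is standard normal; its normalizing constant cancels out.
    """
    grid = [lo + i * (hi - lo) / (n_points - 1) for i in range(n_points)]
    weights = []
    for t in grid:
        prior = math.exp(-0.5 * t * t)
        likelihood = 1.0
        for x, (a, steps) in zip(responses, items):
            likelihood *= gpcm_probs(t, a, steps)[x]
        weights.append(prior * likelihood)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total


if __name__ == "__main__":
    items = [(1.0, [-1.0, 1.0]), (1.2, [-0.5, 0.5])]
    print(eap_estimate([2, 1], items))
```

Because the prior pulls estimates toward zero, an estimator of this kind behaves differently from maximum likelihood when few items are available, which is one reason model comparisons can depend on the estimator chosen.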

Conclusions:

  • The choice of measurement model for polytomous testlet items depends critically on the nature and magnitude of testlet effects.
  • Simpler models such as GPCM and FTM are often sufficient, performing comparably to or better than more complex alternatives in many scenarios.
  • Polytomous scoring of testlet items has limited practical utility.
  • Guidelines are provided for selecting the most appropriate measurement model for polytomous innovative items in testlet-based assessments.