



Reliability for Tests With Items Having Different Numbers of Ordered Categories.

Seohyun Kim, Zhenqiu Lu, Allan S. Cohen

  • University of Georgia, Athens, USA.

Applied Psychological Measurement | February 21, 2020
Summary
This summary is machine-generated.

A new structural equation modeling (SEM) approach estimates reliability for tests whose items have different numbers of ordered categories. In simulations, the resulting coefficient closely matched population reliability across most conditions.

Keywords:
categorical data, reliability, structural equation modeling


Area of Science:

  • Psychometrics
  • Statistical Modeling
  • Educational Measurement

Background:

  • Traditional reliability coefficients such as coefficient alpha have limitations when the items on a test differ in their number of ordered categories.
  • Assessing the reliability of tests with mixed-category items requires advanced statistical approaches.
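
For reference, coefficient alpha (the traditional coefficient named above) can be computed from an examinee-by-item score matrix as k/(k-1) × (1 − sum of item variances / variance of total scores). The following is a minimal sketch using only the Python standard library; the function name and the small response matrix are hypothetical illustrations, not data or code from the study.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Coefficient alpha for a list of rows (one row per examinee,
    one column per item), using sample variances."""
    n_items = len(scores[0])
    # Variance of each item's scores across examinees.
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    # Variance of the total (sum) score across examinees.
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: item 1 is scored 0-1, item 2 is scored 0-2,
# item 3 is scored 0-3, so the items differ in their number of categories.
responses = [
    [0, 1, 2],
    [1, 1, 2],
    [0, 0, 1],
    [1, 2, 3],
    [1, 1, 3],
    [0, 0, 0],
]
alpha = coefficient_alpha(responses)
print(f"coefficient alpha = {alpha:.3f}")
```

Because alpha works on raw item variances, items with more categories tend to contribute more variance to the total score. This is one reason a single summary coefficient can behave differently on mixed-category tests.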

Purpose of the Study:

  • To introduce and evaluate a novel structural equation modeling (SEM) approach for estimating reliability in tests with items having different numbers of ordered categories.
  • To compare the performance of the proposed SEM reliability coefficient against coefficient alpha and population reliability.

Main Methods:

  • Structural Equation Modeling (SEM) was employed to develop a new reliability coefficient.
  • A simulation study was conducted to compare reliability coefficients under various conditions, including different numbers of ordered categories, one-factor and bifactor structures, and score skewness.
  • An empirical example using a test with dichotomous and trichotomous items was analyzed.
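
The kind of data such a simulation works with can be sketched as follows. This is an illustrative sketch only, not the authors' simulation design: it assumes a one-factor model with standardized continuous latent responses that are cut into ordered categories at fixed thresholds. The loadings, thresholds, sample size, and function names are all hypothetical. The sketch computes coefficient alpha on the generated scores; it does not implement the proposed SEM coefficient.

```python
import random
from statistics import variance


def simulate_responses(n_persons, loadings, thresholds, seed=0):
    """Generate ordered-category item scores from a one-factor model.

    Each item's latent response is loading * factor + unique part, with
    unit total variance, then cut at that item's thresholds. An item
    with m thresholds has m + 1 ordered categories (scores 0..m).
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        factor = rng.gauss(0.0, 1.0)
        row = []
        for loading, cuts in zip(loadings, thresholds):
            unique_sd = (1.0 - loading ** 2) ** 0.5
            latent = loading * factor + unique_sd * rng.gauss(0.0, 1.0)
            # Score = number of thresholds the latent response exceeds.
            row.append(sum(latent > c for c in cuts))
        data.append(row)
    return data


def coefficient_alpha(scores):
    """Coefficient alpha from a list of examinee rows of item scores."""
    n_items = len(scores[0])
    item_vars = [variance([row[j] for row in scores]) for j in range(n_items)]
    total_var = variance([sum(row) for row in scores])
    return n_items / (n_items - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical test: two dichotomous items and two trichotomous items.
loadings = [0.7, 0.7, 0.6, 0.6]
thresholds = [[0.0], [0.5], [-0.5, 0.5], [0.0, 1.0]]
data = simulate_responses(2000, loadings, thresholds, seed=1)
alpha = coefficient_alpha(data)
print(f"alpha on simulated mixed-category data = {alpha:.3f}")
```

Shifting the thresholds away from zero produces skewed item scores, and changing the number of thresholds per item changes the number of ordered categories. These are two of the factors the simulation study varied.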

Main Results:

  • The proposed SEM reliability coefficient demonstrated strong performance, closely approximating population reliability across most simulated conditions.
  • The simulation results provided insights into the behavior of different reliability coefficients under varying psychometric properties.
  • The empirical example highlighted the practical application and performance differences of the coefficients.

Conclusions:

  • The proposed SEM-based reliability approach is a viable and accurate method for tests with items having different numbers of ordered categories.
  • This study offers a valuable tool for researchers and practitioners needing to assess reliability in complex measurement instruments.
  • The findings underscore the importance of using appropriate reliability estimation methods tailored to item characteristics.