Identifying Problematic Item Characteristics With Small Samples Using Mokken Scale Analysis.

Stefanie A Wind¹

  • ¹The University of Alabama, Tuscaloosa, AL, USA.

Educational and Psychological Measurement | June 27, 2022
Summary
This summary is machine-generated.

Mokken scale analysis (MSA) is suitable for assessing item quality with small samples (around 100 examinees). Researchers should consider multiple indicators of item quality to ensure reliable measurement, especially when item problems occur within limited ranges of the latent variable.

Keywords:
Mokken scale analysis; nonparametric IRT; small samples; survey research

Area of Science:

  • Psychometrics
  • Statistical analysis
  • Item response theory

Background:

  • Mokken scale analysis (MSA) is a nonparametric item response theory approach often used with small sample sizes.
  • Existing guidance on minimum sample size for MSA has not addressed item-level issues like monotonicity or invariant item ordering (IIO) within specific latent variable ranges.
  • Previous research focused on problems that affect the whole sample, rather than item-specific problems confined to part of the latent variable range.
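
For reference, the two item-level properties named in the list above are usually stated as follows. This is a sketch in standard nonparametric IRT notation, not taken from the summary itself: X_i is the score on item i, θ is the latent variable, and the item numbering in the second statement is an assumption made for illustration.

```latex
% Monotonicity for item i: the item response function is nondecreasing in \theta
P(X_i = 1 \mid \theta_a) \le P(X_i = 1 \mid \theta_b)
\quad \text{for all } \theta_a \le \theta_b .

% Invariant item ordering (IIO) for k items, numbered from hardest to easiest:
% the same ordering of expected item scores holds at every value of \theta
E(X_1 \mid \theta) \le E(X_2 \mid \theta) \le \dots \le E(X_k \mid \theta)
\quad \text{for all } \theta .
```

A violation "within a specific latent variable range" means that one of these inequalities fails only for some interval of θ while holding elsewhere.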

Purpose of the Study:

  • To investigate the sensitivity of Mokken scale analysis (MSA) item analysis procedures to problematic item characteristics within restricted latent variable ranges.
  • To evaluate the performance of MSA under conditions relevant to small sample sizes and localized item issues.
  • To provide empirical evidence on the reliability of MSA item analysis in specific measurement scenarios.

Main Methods:

  • A simulation study was employed to systematically generate data reflecting various item characteristics and sample sizes.
  • The study focused on item-level measurement properties, including potential violations of monotonicity and invariant item ordering (IIO).
  • The sensitivity of MSA procedures was analyzed across different simulated conditions, particularly those involving limited ranges of the latent variable.
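
The kind of simulation described above can be illustrated with a minimal Python sketch. It is not the author's design: the Rasch-type generating model, the item difficulties, the location and size of the dip, the number of rest-score groups, and the 0.03 tolerance are all assumptions chosen for illustration, and the function names are hypothetical. The sketch generates N = 100 dichotomous response vectors in which one item has a lowered success probability inside a restricted θ range, then inspects that item's proportion correct across ordered rest-score groups.

```python
import numpy as np


def simulate_responses(n_persons=100, n_items=10, flawed_item=0,
                       dip_range=(0.0, 1.0), dip_size=0.4, seed=0):
    """Simulate dichotomous responses with one locally non-monotone item.

    Assumed setup: Rasch-type items with evenly spaced difficulties; the
    flawed item's success probability is lowered by dip_size for persons
    whose theta falls inside dip_range.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, 1.0, size=n_persons)
    difficulty = np.linspace(-1.5, 1.5, n_items)
    prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulty[None, :])))
    in_dip = (theta >= dip_range[0]) & (theta <= dip_range[1])
    prob[in_dip, flawed_item] = np.clip(
        prob[in_dip, flawed_item] - dip_size, 0.0, 1.0)
    responses = (rng.random(prob.shape) < prob).astype(int)
    return theta, responses


def restscore_proportions(responses, item, n_groups=4):
    """Proportion correct on `item` within ordered rest-score groups.

    Simplification: persons are sorted by rest score and cut into groups of
    near-equal size, so persons with tied rest scores may be split.
    """
    rest = responses.sum(axis=1) - responses[:, item]
    order = np.argsort(rest, kind="stable")
    groups = np.array_split(order, n_groups)
    return np.array([responses[g, item].mean() for g in groups])


def count_violations(props, min_drop=0.03):
    """Count adjacent decreases in `props` that exceed the tolerance."""
    drops = props[:-1] - props[1:]
    return int(np.sum(drops > min_drop))


theta, responses = simulate_responses()
props = restscore_proportions(responses, item=0)
print("rest-score group proportions:", np.round(props, 2))
print("decreases larger than tolerance:", count_violations(props))
```

With only 100 simulated examinees and a dip confined to part of the θ range, a coarse rest-score check like this one may or may not flag the flawed item in a given replication, which is the kind of sensitivity question the study examines.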

Main Results:

  • Mokken scale analysis (MSA) procedures were generally robust with small sample sizes, specifically around N ≈ 100 examinees.
  • The analysis indicated that considering multiple indicators of item quality is crucial for reliable results, even with small samples.
  • Problematic item characteristics occurring within limited latent variable ranges showed varying degrees of impact on MSA outcomes.
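
One widely used indicator of item quality in MSA is Loevinger's item scalability coefficient H_i. The summary does not say which indicators the study evaluated, so the sketch below is only an illustration of one such indicator for dichotomous items; the function name `item_scalability` is hypothetical. H_i is the sum of an item's covariances with all other items, divided by the sum of the maximum covariances possible given the item proportions.

```python
import numpy as np


def item_scalability(responses):
    """Loevinger's H_i for each dichotomous item (one column per item).

    For 0/1 items with proportions p_i and p_j, the largest attainable
    covariance is min(p_i, p_j) - p_i * p_j.
    """
    x = np.asarray(responses, dtype=float)
    p = x.mean(axis=0)
    cov = (x.T @ x) / x.shape[0] - np.outer(p, p)      # observed covariances
    cov_max = np.minimum.outer(p, p) - np.outer(p, p)  # maximum covariances
    np.fill_diagonal(cov, 0.0)      # exclude each item's pairing with itself
    np.fill_diagonal(cov_max, 0.0)
    return cov.sum(axis=1) / cov_max.sum(axis=1)


# A perfect Guttman pattern: every item reaches its maximum covariance.
guttman = np.array([[0, 0, 0],
                    [1, 0, 0],
                    [1, 1, 0],
                    [1, 1, 1]])
print(item_scalability(guttman))
```

The function assumes that no item is answered identically by everyone, since a constant item has zero maximum covariance and the ratio is then undefined. Because H_i summarizes an item over the whole sample, it can remain acceptable when a problem is confined to a narrow part of the latent variable range, which is one reason to examine it alongside monotonicity and IIO checks.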

Conclusions:

  • Mokken scale analysis (MSA) can be reliably used with small sample sizes (N ≈ 100) when appropriate item quality indicators are evaluated.
  • Researchers should exercise caution and use multiple item fit statistics when applying MSA, especially when problematic item characteristics are confined to limited ranges of the latent variable.
  • The findings support the practical application of MSA in research settings with limited participant numbers, provided a thorough assessment of item characteristics is conducted.