Can High-Dimensional Questionnaires Resolve the Ipsativity Issue of Forced-Choice Response Formats?

Niklas Schulte¹, Heinz Holling¹, Paul-Christian Bürkner²

¹University of Münster, Germany.

Educational and Psychological Measurement

November 6, 2023

Summary

This summary is machine-generated.

Forced-choice questionnaires, while reducing response biases, often yield unreliable scores. Simulations show that even with many traits, both classical and Thurstonian IRT methods struggle with reliability and ipsativity in realistic scenarios.

Keywords:

Thurstonian IRT model; forced-choice format; ipsative data; multidimensional IRT

Area of Science:

Psychometrics
Psychological Measurement
Quantitative Psychology

Background:

Forced-choice questionnaires mitigate response biases common in rating scales.
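However, their scores are often unreliable and ipsative (interpretable only relative to a person's other trait scores), which hinders interindividual comparisons.
Measuring a large number of traits (high dimensionality) has been proposed as a way to resolve these issues.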
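To make the ipsativity problem concrete, the following is a minimal sketch of classical scoring for fully ranked forced-choice blocks. It is not the authors' simulation design: the block layout (one item per trait in every block), the noise model, and all function names and default values are assumptions chosen for illustration. It shows that every respondent's trait scores sum to the same constant, which in turn pushes the correlations between trait scores toward negative values.

```python
import random
from math import sqrt


def simulate_ranking_scores(n_people=200, n_traits=5, n_blocks=20, seed=0):
    """Classical scoring of fully ranked forced-choice blocks (illustrative).

    Assumed setup: each block holds one item per trait. A respondent ranks
    the items by latent utility (trait level plus noise). The lowest-ranked
    item adds 0 points to its trait, the next adds 1, and so on.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_people):
        traits = [rng.gauss(0, 1) for _ in range(n_traits)]
        totals = [0] * n_traits
        for _ in range(n_blocks):
            utilities = [t + rng.gauss(0, 1) for t in traits]
            order = sorted(range(n_traits), key=lambda k: utilities[k])
            for rank, k in enumerate(order):
                totals[k] += rank
        scores.append(totals)
    return scores


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_offdiag_correlation(scores):
    """Average correlation over all distinct pairs of trait scores."""
    cols = list(zip(*scores))
    d = len(cols)
    pairs = [(i, j) for i in range(d) for j in range(i + 1, d)]
    return sum(pearson(cols[i], cols[j]) for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    scores = simulate_ranking_scores()
    # Every respondent has the same total, whatever their true trait levels.
    print({sum(row) for row in scores})
    # The simulated traits are uncorrelated, yet the scores correlate negatively.
    print(mean_offdiag_correlation(scores))
```

Because the total is fixed, a high score on one trait can only come at the expense of the others. When the trait scores have equal variances, the average correlation between them is forced to -1/(d - 1) for d traits. That constraint weakens as d grows, which is the intuition behind the high-dimensionality proposal.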

Purpose of the Study:

To investigate the number of traits needed to overcome reliability and ipsativity issues in forced-choice questionnaires.
To compare classical scoring and Thurstonian item response theory (IRT) models under varying conditions.

Main Methods:

Computer simulations were conducted.
Varying parameters included sample size, factor loadings, and intertrait correlations.
Two scoring methods were examined: classical (ipsative) and Thurstonian IRT.
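For readers unfamiliar with the second scoring method, the sketch below shows the response model for a single pairwise comparison as it is commonly written in the Thurstonian IRT literature: the probability of preferring item i (measuring trait a) over item k (measuring trait b) is a normal-ogive function of the difference in loading-weighted trait levels. The function and parameter names, and the default values, are assumptions for illustration and are not taken from the study.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def prefer_probability(eta_a, eta_b, loading_i, loading_k,
                       threshold=0.0, uniq_var_i=1.0, uniq_var_k=1.0):
    """Probability that item i is preferred over item k.

    eta_a, eta_b   : respondent's latent levels on the two traits
    loading_i/_k   : factor loadings of the two items
    threshold      : pair-specific threshold (hypothetical default of 0)
    uniq_var_i/_k  : unique (error) variances of the item utilities
    """
    z = (-threshold + loading_i * eta_a - loading_k * eta_b) / sqrt(
        uniq_var_i + uniq_var_k
    )
    return normal_cdf(z)


if __name__ == "__main__":
    # A respondent one unit higher on trait a than on trait b,
    # with equal loadings on both items.
    print(prefer_probability(eta_a=1.0, eta_b=0.0, loading_i=0.8, loading_k=0.8))
```

Each response depends only on the difference between two weighted trait levels, which is why factor loadings and intertrait correlations matter for how well absolute trait levels can be recovered.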

Main Results:

Thurstonian IRT models performed well under ideal conditions.
Both methods showed insufficient reliability in most realistic applied contexts.
Even with 30 traits, both classical and Thurstonian IRT scores remained partially ipsative.

Conclusions:

The assumption that high dimensionality resolves ipsativity in forced-choice questionnaires is questioned.
Results cast doubt on the interpretability of validation studies using ipsative scores from Thurstonian IRT models.
Reliability and comparability issues persist in practical applications of forced-choice questionnaires.