Evaluating Robust Scale Transformation Methods With Multiple Outlying Common Items Under IRT True Score Equating.

Yong He¹, Zhongmin Cui¹

  • ¹ ACT, Inc., Iowa City, IA, USA.

Applied Psychological Measurement | June 16, 2020
Summary
This summary is machine-generated.

This study shows that robust scale transformation methods effectively handle multiple outlying common items in test equating. These methods reduce the impact of outliers while maintaining content balance, improving equating accuracy.

Keywords: equating; item response theory; multiple outliers; robust scale transformation

Area of Science:

  • Educational Measurement
  • Psychometrics
  • Statistics

Background:

  • Common item parameter estimates can change abnormally due to item overexposure or curriculum shifts.
  • Outlier common items deviate from the expected pattern of normally behaving common items.
  • Eliminating outliers improves equating accuracy but can disrupt content balance.

Purpose of the Study:

  • To examine the performance of robust scale transformation methods with multiple outlier common items.
  • To assess the effectiveness of these methods in reducing outlier influence on scale transformation and equating.
  • To compare robust methods against traditional outlier detection and elimination techniques.

Main Methods:

  • Simulation study design.
  • Application of robust scale transformation methods.
  • Analysis of multiple outlying common items.
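
To make the idea of a robust scale transformation concrete, here is a minimal sketch on invented data. It is not the procedure from the study: the summary does not name the robust methods that were evaluated, so a Theil-Sen line fit (median of pairwise slopes) is used here purely as a generic robust stand-in, compared against the classical mean/sigma method. The function names, the item difficulty values, and the generating coefficients (A = 1.2, B = 0.5) are all assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import mean, median, pstdev


def mean_sigma(b_new, b_old):
    """Classical mean/sigma linking coefficients (slope A, intercept B)."""
    slope = pstdev(b_old) / pstdev(b_new)
    intercept = mean(b_old) - slope * mean(b_new)
    return slope, intercept


def theil_sen(b_new, b_old):
    """Robust line fit: median of pairwise slopes, then median offset."""
    pairs = list(zip(b_new, b_old))
    slopes = [
        (y2 - y1) / (x2 - x1)
        for (x1, y1), (x2, y2) in combinations(pairs, 2)
        if x2 != x1
    ]
    slope = median(slopes)
    intercept = median([y - slope * x for x, y in pairs])
    return slope, intercept


# Hypothetical difficulty estimates for nine common items on the new form.
b_new = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
# Old-form values follow b_old = 1.2 * b_new + 0.5, except the last two items,
# which are shifted down by 1.5 to mimic outlying common items.
b_old = [-1.9, -1.3, -0.7, -0.1, 0.5, 1.1, 1.7, 0.8, 1.4]

a_ms, k_ms = mean_sigma(b_new, b_old)
a_ts, k_ts = theil_sen(b_new, b_old)
print(f"mean/sigma: A={a_ms:.3f}, B={k_ms:.3f}")
print(f"Theil-Sen:  A={a_ts:.3f}, B={k_ts:.3f}")
```

On this toy data the mean/sigma coefficients are pulled away from the generating values by the two shifted items, while the Theil-Sen fit recovers them without dropping any item from the common-item set. That is the trade-off the study examines: limiting outlier influence while keeping every common item, and therefore the content balance, intact.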

Main Results:

  • Robust scale transformation methods successfully reduced the influence of multiple outliers on scale transformation and equating.
  • The robust methods demonstrated comparable performance to traditional outlier detection and elimination methods.
  • Adequate content balance was maintained when using robust methods.

Conclusions:

  • Robust scale transformation methods are effective for addressing multiple outlier common items in test equating.
  • These methods offer a viable alternative to traditional outlier handling, balancing accuracy and content integrity.