


Updated: Aug 23, 2025


Evaluating CloudResearch's Approved Group as a solution for problematic data quality on MTurk.

David J Hauser¹, Aaron J Moss², Cheskie Rosenzweig²,³

¹Department of Psychology, Queen's University, Kingston, ON, Canada. david.hauser@queensu.ca.

Behavior Research Methods

November 3, 2022

Summary

This summary is machine-generated.

CloudResearch's vetting system effectively identifies high-quality Amazon Mechanical Turk (MTurk) workers. The "Approved" group provides significantly better data quality than the "Blocked" group or standard MTurk samples.

Keywords:

Data quality; Participant recruitment; Response bias; Test validity


Area of Science:

Psychological Science
Human-Computer Interaction
Data Science

Background:

Maintaining data quality on Amazon Mechanical Turk (MTurk) is a persistent challenge for researchers.
Recent issues, including the 2018 bot crisis, have highlighted the inadequacy of traditional quality control measures like approval ratings.
CloudResearch has developed a vetting system to categorize MTurk workers based on data quality.

Purpose of the Study:

To evaluate the predictive validity of CloudResearch's worker vetting system.
To compare the data quality of MTurk workers categorized as "Approved" and "Blocked" by CloudResearch.
To assess the effectiveness of CloudResearch's vetting against standard MTurk samples.

Main Methods:

A pre-registered study involving 900 participants from CloudResearch's "Approved" and "Blocked" groups, plus a standard MTurk sample.
Participants completed various data-quality measures, including image identification, reading comprehension, attention checks, and consistency in responding to reversed items.
Performance was also assessed on easily searchable questions, replication of psychological effects, and AI-challenging questions.
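
One of the measures listed above, consistency in responding to reversed items, can be illustrated with a minimal sketch. The summary does not describe how the study scored this measure, so everything below is an assumption made for illustration: the 1-5 response scale, the function names `reverse_score` and `inconsistency`, and the example responses are all hypothetical.

```python
from statistics import mean

def reverse_score(response, scale_min=1, scale_max=5):
    """Map a response on a reverse-keyed item back onto the direct item's scale."""
    return scale_min + scale_max - response

def inconsistency(pairs, scale_min=1, scale_max=5):
    """Mean absolute gap between each direct item and its re-scored reversed twin.

    `pairs` is a list of (direct_response, reversed_response) tuples.
    0 means perfectly consistent answers; larger values mean less consistency.
    """
    return mean(
        abs(direct - reverse_score(rev, scale_min, scale_max))
        for direct, rev in pairs
    )

# Hypothetical respondents (not data from the study).
careful = [(5, 1), (4, 2), (2, 4)]    # agrees with an item, disagrees with its reversal
careless = [(5, 5), (4, 4), (1, 1)]   # gives the same answer regardless of wording

print(inconsistency(careful))
print(inconsistency(careless))
```

A respondent who reads each item should score near zero on this index, while someone who answers without reading tends to give the same response to an item and its reversal and scores higher. Any cut-off for flagging a respondent would be a separate design decision that this summary does not specify.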

Main Results:

"Approved" MTurk workers demonstrated superior performance across multiple data-quality indices compared to "Blocked" workers.
"Blocked" participants often performed at chance levels on several measures.
Standard MTurk samples generally fell between the "Approved" and "Blocked" groups in data quality.
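
To make "performing at chance levels" concrete, the sketch below computes a one-sided exact binomial tail probability for a score on a multiple-choice measure. It illustrates the general idea only and is not the analysis reported in the paper: the number of questions, the 25% guessing rate, the example scores, and the function name `binom_upper_tail` are all assumed for the example.

```python
from math import comb

def binom_upper_tail(successes, trials, chance):
    """P(X >= successes) when each of `trials` answers is correct with probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical setup: 20 four-option questions, so pure guessing succeeds 25% of the time.
TRIALS, CHANCE = 20, 0.25

p_low = binom_upper_tail(6, TRIALS, CHANCE)    # 6 of 20 correct
p_high = binom_upper_tail(18, TRIALS, CHANCE)  # 18 of 20 correct

print(f"6/20 correct:  p = {p_low:.3f}")
print(f"18/20 correct: p = {p_high:.2e}")
```

A large tail probability means the score is compatible with guessing, which is what chance-level performance amounts to. A very small one indicates the respondent was doing better than guessing.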

Conclusions:

MTurk's standard approval rating is an unreliable indicator of data quality.
CloudResearch's "Approved" group offers a more dependable source of high-quality data for research.
Utilizing vetted worker pools can significantly enhance the reliability and validity of findings from MTurk studies.