Evaluating CloudResearch's Approved Group as a solution for problematic data quality on MTurk.

David J Hauser (1), Aaron J Moss (2), Cheskie Rosenzweig (2, 3)

  • (1) Department of Psychology, Queen's University, Kingston, ON, Canada. david.hauser@queensu.ca.

Behavior Research Methods | November 3, 2022
Summary
This summary is machine-generated.

CloudResearch's vetting system effectively identifies high-quality Amazon Mechanical Turk (MTurk) workers. The "Approved" group provides significantly better data quality than the "Blocked" group or standard MTurk samples.

Keywords:
Data quality, Participant recruitment, Response bias, Test validity

Area of Science:

  • Psychological Science
  • Human-Computer Interaction
  • Data Science

Background:

  • Maintaining data quality on Amazon Mechanical Turk (MTurk) is a persistent challenge for researchers.
  • Recent issues, including the 2018 bot crisis, have highlighted the inadequacy of traditional quality control measures like approval ratings.
  • CloudResearch has developed a vetting system to categorize MTurk workers based on data quality.

Purpose of the Study:

  • To evaluate the predictive validity of CloudResearch's worker vetting system.
  • To compare the data quality of MTurk workers categorized as "Approved" and "Blocked" by CloudResearch.
  • To assess the effectiveness of CloudResearch's vetting against standard MTurk samples.

Main Methods:

  • A pre-registered study involving 900 participants from CloudResearch's "Approved" and "Blocked" groups, plus a standard MTurk sample.
  • Participants completed various data-quality measures, including image identification, reading comprehension, attention checks, and consistency in responding to reversed items.
  • Performance was also assessed on easily searchable questions, replication of psychological effects, and AI-challenging questions.
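
One of the measures listed above, consistency in responding to reversed items, can be illustrated with a minimal sketch. The summary does not describe how the study scored this, so the 1-5 scale, the one-point tolerance, and the function names `reverse_score` and `is_consistent` are all assumptions made for illustration.

```python
def reverse_score(response: int, scale_min: int = 1, scale_max: int = 5) -> int:
    """Map a response on a reverse-keyed item back onto the forward-keyed direction."""
    return scale_max + scale_min - response


def is_consistent(forward: int, reversed_item: int, tolerance: int = 1,
                  scale_min: int = 1, scale_max: int = 5) -> bool:
    """True when the forward item and the re-scored reversed item agree within `tolerance`."""
    rescored = reverse_score(reversed_item, scale_min, scale_max)
    return abs(forward - rescored) <= tolerance


# Hypothetical responses on a 1-5 agreement scale (not data from the study).
attentive = is_consistent(forward=5, reversed_item=1)      # agrees, then rejects the reversed wording
straightliner = is_consistent(forward=5, reversed_item=5)  # agrees with both contradictory statements
```

A participant who reads the items should flip their answer on the reversed wording, so `attentive` is flagged as consistent while `straightliner` is not. The tolerance is a judgment call; a stricter or looser cutoff may suit other scales.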

Main Results:

  • "Approved" MTurk workers demonstrated superior performance across multiple data-quality indices compared to "Blocked" workers.
  • "Blocked" participants often performed at chance levels on several measures.
  • Standard MTurk samples generally fell between the "Approved" and "Blocked" groups in data quality.
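
To make "chance levels" concrete, the sketch below shows one common way to check whether a score is distinguishable from guessing: a one-sided exact binomial test. This is not necessarily the analysis the authors ran; the function name `binomial_p_above_chance`, the four-option task, and the counts are hypothetical.

```python
from math import comb


def binomial_p_above_chance(correct: int, trials: int, chance: float) -> float:
    """One-sided exact binomial p-value: P(X >= correct) if every answer is a guess."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Hypothetical counts on a four-option task (chance = 0.25); not the study's data.
p_near_chance = binomial_p_above_chance(correct=3, trials=10, chance=0.25)
p_above_chance = binomial_p_above_chance(correct=9, trials=10, chance=0.25)
```

With these illustrative counts, 3 of 10 correct yields a large p-value, so the score cannot be told apart from guessing, whereas 9 of 10 correct yields a very small one. In practice the test would be applied to group-level accuracy on each measure.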

Conclusions:

  • MTurk's standard approval rating is an unreliable indicator of data quality.
  • CloudResearch's "Approved" group offers a more dependable source of high-quality data for research.
  • Utilizing vetted worker pools can significantly enhance the reliability and validity of findings from MTurk studies.