Data quality in crowdsourcing and spamming behavior detection.

Yang Ba¹, Michelle V Mancenido², Erin K Chiou³

¹Ira A. Fulton Schools of Engineering, School of Computing and Augmented Intelligence, Data Science, Analytics and Engineering, Arizona State University, Suite 342AE, 3rd floor 699 S. Mill Avenue, 85281, Tempe, AZ, USA. yangba@asu.edu.

Behavior Research Methods, August 8, 2025

Summary

This summary is machine-generated.

This study introduces a novel method for evaluating the quality of crowdsourced data and detecting spammers. By assessing annotator consistency and credibility, it supports more reliable labeled datasets for machine learning.

Keywords:

Crowdsourcing platform; Data quality; Generalized random effects models; Metrics; Spamming behaviors; Statistical hypothesis testing

Area of Science:

Machine Learning
Data Science
Artificial Intelligence

Background:

Crowdsourcing is vital for labeling machine learning datasets efficiently.
Assessing crowd-provided data quality is essential to reduce bias and improve AI performance.
Traditional quality metrics are insufficient for complex online crowdsourcing scenarios.

Purpose of the Study:

To develop a systematic method for evaluating data quality in crowdsourcing.
To detect and classify spamming threats from crowd workers.
To measure annotator consistency and credibility without ground truth.

Main Methods:

Variance decomposition for data quality evaluation and spammer detection.
Classification of spammers into three behavioral categories.
Development of a spammer index for overall data consistency.
Use of Markov chain and generalized random effects models to derive worker credibility metrics.
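
This summary does not give the formulas behind the variance decomposition, so the following is only a minimal sketch of the general idea, not the authors' spammer index. It assumes numeric labels (for example, 1 = "same person" and 0 = "different person" in a face verification task) and splits the total variation into a between-task part and a within-task part. The function names, the `ratings` layout, and the "inconsistency share" are all hypothetical and chosen for illustration.

```python
from statistics import mean


def variance_decomposition(ratings):
    """Split the total sum of squares of labels into between-task and within-task parts.

    `ratings` is a hypothetical layout: a dict mapping each task id to the list
    of numeric labels that different workers gave for that task.
    Returns (between_ss, within_ss, total_ss).
    """
    all_labels = [x for labels in ratings.values() for x in labels]
    grand_mean = mean(all_labels)

    # Variation explained by real differences between tasks.
    between_ss = sum(
        len(labels) * (mean(labels) - grand_mean) ** 2
        for labels in ratings.values()
    )
    # Variation among workers labeling the same task (disagreement).
    within_ss = sum(
        (x - mean(labels)) ** 2
        for labels in ratings.values()
        for x in labels
    )
    total_ss = sum((x - grand_mean) ** 2 for x in all_labels)
    return between_ss, within_ss, total_ss


def inconsistency_share(ratings):
    """Fraction of total variation due to within-task disagreement (illustrative only)."""
    _, within_ss, total_ss = variance_decomposition(ratings)
    return within_ss / total_ss if total_ss > 0 else 0.0


if __name__ == "__main__":
    # Three face pairs, each labeled by three workers; one worker disagrees on pair2.
    example = {"pair1": [1, 1, 1], "pair2": [0, 0, 1], "pair3": [0, 0, 0]}
    print(inconsistency_share(example))
```

In this sketch, a share near 0 means workers agree once task differences are accounted for, while a share near 1 means most of the variation is disagreement on the same tasks. No ground-truth labels are needed, which matches the stated purpose, but the paper's actual index and spammer categories may be defined differently.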

Main Results:

A practical framework for assessing crowdsourced data quality was demonstrated.
The proposed methods effectively identified and categorized spammers.
The techniques proved advantageous in a face verification task using real and simulated data.

Conclusions:

The developed systematic method enhances the reliability of crowdsourced data for machine learning.
Accurate assessment of annotator consistency and credibility is achievable even without ground truth.
This approach is crucial for mitigating biases and improving the performance of AI models trained on crowdsourced data.