Data quality in crowdsourcing and spamming behavior detection.

Yang Ba (1), Michelle V Mancenido (2), Erin K Chiou (3)

  • (1) Ira A. Fulton Schools of Engineering, School of Computing and Augmented Intelligence, Data Science, Analytics and Engineering, Arizona State University, Suite 342AE, 3rd floor, 699 S. Mill Avenue, Tempe, AZ 85281, USA. yangba@asu.edu

Behavior Research Methods | August 8, 2025 | PubMed
Summary
This summary is machine-generated.

This study introduces a systematic method for evaluating the quality of crowdsourced data and detecting spammers. By measuring annotator consistency and credibility without ground truth, it supports more reliable labeled data for machine learning.

Keywords:
Crowdsourcing platform; Data quality; Generalized random effects models; Metrics; Spamming behaviors; Statistical hypothesis testing

Area of Science:

  • Machine Learning
  • Data Science
  • Artificial Intelligence

Background:

  • Crowdsourcing is vital for labeling machine learning datasets efficiently.
  • Assessing crowd-provided data quality is essential to reduce bias and improve AI performance.
  • Traditional quality metrics are insufficient for complex online crowdsourcing scenarios.

Purpose of the Study:

  • To develop a systematic method for evaluating data quality in crowdsourcing.
  • To detect and classify spamming threats from crowd workers.
  • To measure annotator consistency and credibility without ground truth.

Main Methods:

  • Variance decomposition for data quality evaluation and spammer detection.
  • Classification of spammers into three behavioral categories.
  • Development of a spammer index for overall data consistency.
  • Use of Markov chain and generalized random effects models to derive worker credibility metrics.
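
The variance-decomposition idea can be pictured with a small sketch. This is not the authors' spammer index or model, whose definitions are not given in this summary. It assumes a complete item-by-worker label matrix and splits the total sum of squares into a between-item part (signal shared across workers) and a within-item part (disagreement among workers). The function names `variance_shares` and `leave_one_out_shares` and the toy data are hypothetical.

```python
from statistics import mean


def variance_shares(ratings):
    """Split the total sum of squares of a label matrix into two shares.

    ratings: list of rows, one row per item, one label per worker.
    Returns (between_item_share, within_item_share).
    """
    all_values = [v for row in ratings for v in row]
    grand_mean = mean(all_values)
    # Between-item part: how far each item's mean label is from the grand mean.
    ss_between = sum(len(row) * (mean(row) - grand_mean) ** 2 for row in ratings)
    # Within-item part: disagreement among workers labeling the same item.
    ss_within = sum((v - mean(row)) ** 2 for row in ratings for v in row)
    ss_total = ss_between + ss_within
    if ss_total == 0:
        return 0.0, 0.0
    return ss_between / ss_total, ss_within / ss_total


def leave_one_out_shares(ratings):
    """Between-item share recomputed with each worker (column) dropped in turn."""
    n_workers = len(ratings[0])
    shares = []
    for w in range(n_workers):
        reduced = [[v for j, v in enumerate(row) if j != w] for row in ratings]
        shares.append(variance_shares(reduced)[0])
    return shares


# Toy data (assumed): 4 items, 4 workers. Workers 0-2 agree on every item;
# worker 3 answers 1 regardless of the item.
ratings = [
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
]

between_share, within_share = variance_shares(ratings)
loo = leave_one_out_shares(ratings)
```

In this toy case, dropping worker 3 raises the between-item share more than dropping any other worker, which flags that worker as the main source of disagreement without reference to any ground-truth labels.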

Main Results:

  • A practical framework for assessing crowdsourced data quality was demonstrated.
  • The proposed methods effectively identified and categorized spammers.
  • The techniques proved advantageous in a face verification task using real and simulated data.

Conclusions:

  • The developed systematic method enhances the reliability of crowdsourced data for machine learning.
  • Accurate assessment of annotator consistency and credibility is achievable even without ground truth.
  • The approach helps mitigate bias and improve the performance of AI models trained on crowdsourced data.
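
The summary mentions Markov chain models for worker credibility but does not say how they are used. As a rough illustration only, the sketch below estimates a first-order transition matrix from one worker's sequence of binary labels. A worker who repeats the same answer regardless of the item shows a self-transition probability of 1. The function name `transition_matrix` and the example sequences are assumptions, not taken from the paper.

```python
from collections import Counter


def transition_matrix(labels, states=(0, 1)):
    """Estimate P(next label | current label) from one worker's label sequence."""
    # Count each (current, next) pair of consecutive labels.
    counts = Counter(zip(labels, labels[1:]))
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        # A state that never occurs as "current" gets an all-zero row.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0) for b in states
        }
    return matrix


# Hypothetical sequences: one worker always answers 1, the other varies.
repeated = transition_matrix([1, 1, 1, 1, 1])
mixed = transition_matrix([0, 1, 1, 0, 1, 0, 0, 1])
```

Transition estimates like these need no ground truth, but on their own they cannot separate a careless worker from a task whose correct answers happen to repeat, so they would be one input to a credibility metric rather than a verdict.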