You might also read

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Beyond the hype: A simulation study evaluating the predictive performance of machine learning models in psychology.

Psychological methods·2026
Same author

Comparing Different Approaches of (Not) Accounting for Rapid Guessing in Plausible Values Estimation.

Educational and psychological measurement·2026
Same author

Predicting Juvenile Delinquency and Criminal Behavior in Adulthood Using Machine Learning.

International journal of behavioral development·2025
Same author

Revisiting the structure of Diagnostic and Statistical Manual of Mental Disorders, fifth edition, Section II personality disorder criteria using individual participant data meta-analysis.

Personality disorders·2025
Same author

Data from the National Educational Panel Study (NEPS) in Germany: Educational Pathways of Students in Grade 5 and Higher.

Journal of open psychology data·2025
Same author

Data for Psychological Research in the Educational Field: Spotlights, Data Infrastructures, and Findings from Research.

Journal of open psychology data·2025
Same journal

A Simple Approach for Differential Test Functioning Based on Sum Scores.

Educational and psychological measurement·2026
Same journal

Evaluating Factor Retention in Large Factor Analysis Models: A Simulation Study Comparing 15 Methods.

Educational and psychological measurement·2026
Same journal

Agreement and Alignment in Binary Rating Tasks: Strategic Convergence as an Equilibrium Outcome.

Educational and psychological measurement·2026
Same journal

Interactions Between Termination Criteria and Ability Estimators in Computerized Adaptive Testing.

Educational and psychological measurement·2026
Same journal

Identification and Diagnosis of Misreporting in Surveys.

Educational and psychological measurement·2026
Same journal

The Aggregated Latent Profile Index: Measuring Person Profile Differentiation Within a Bootstrap-Validated Latent Profile Space.

Educational and psychological measurement·2026

Related Experiment Video

Updated: Oct 7, 2025

A Machine Learning Approach to Design an Efficient Selective Screening of Mild Cognitive Impairment (12:18)

Published on: January 11, 2020

7.7K

Detecting Careless Responding in Survey Data Using Stochastic Gradient Boosting.

Ulrich Schroeders¹, Christoph Schmidt², Timo Gnambs³

  • ¹University of Kassel, Kassel, Germany.

Educational and Psychological Measurement | January 7, 2022
Summary
This summary is machine-generated.

Gradient boosted trees, a machine learning method, were tested for detecting careless survey responses. While effective in simulations, this approach did not outperform traditional methods in real-world studies.

Keywords:
careless responding, data cleaning, gradient boosted trees, outlier detection, response times

More Related Videos

Dual-Task Stroop Paradigm for Detecting Cognitive Deficits in High-Functioning Stroke Patients (07:42)

Published on: December 16, 2022

3.2K
Lexical Decision Task for Studying Written Word Recognition in Adults with and without Dementia or Mild Cognitive Impairment (06:48)

Published on: June 25, 2019

9.3K

Area of Science:

  • Psychological Measurement
  • Survey Methodology
  • Machine Learning Applications

Background:

  • Careless responding, in which respondents answer without regard to item content, poses a significant threat to the reliability and validity of psychological measurements.
  • Existing methods for detecting aberrant responses include probing questions, paradata (e.g., response times), and statistical techniques (e.g., Mahalanobis distance).
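As an illustration of the statistical techniques mentioned above, the following is a minimal sketch of Mahalanobis-distance screening using only the Python standard library. The response matrix, the two-item design, and the helper names `solve` and `mahalanobis_sq` are assumptions made for this example; they are not taken from the study.

```python
from statistics import mean

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.
    Assumes a is square and non-singular (no check is performed)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis_sq(rows):
    """Squared Mahalanobis distance of each respondent from the sample centroid."""
    n, k = len(rows), len(rows[0])
    mu = [mean(r[j] for r in rows) for j in range(k)]
    centered = [[r[j] - mu[j] for j in range(k)] for r in rows]
    # Sample covariance matrix (denominator n - 1).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(k)]
           for i in range(k)]
    # d^2 = c^T S^-1 c, computed by solving S z = c instead of inverting S.
    return [sum(d * z for d, z in zip(c, solve(cov, c))) for c in centered]

# Hypothetical answers of eight respondents to two positively related items
# (1-5 scale). The last respondent answers the two items inconsistently.
responses = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5], [2, 3], [4, 3], [1, 5]]

d2 = mahalanobis_sq(responses)
flagged = max(range(len(d2)), key=d2.__getitem__)
print("largest distance: respondent", flagged + 1)
```

A large distance only marks a response pattern as unusual relative to the sample; whether it reflects careless responding still has to be judged against a cutoff chosen by the analyst.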

Purpose of the Study:

  • To introduce gradient boosted trees, a machine learning technique, for identifying careless respondents in survey data.
  • To compare the performance of gradient boosting machines against established detection methods using simulated and empirical data.

Main Methods:

  • Gradient boosted trees were employed as a novel machine learning approach to detect careless responding.
  • Performance was evaluated against traditional methods (outlier methods, consistency analyses, response pattern functions).
  • Both simulated data and empirical data from an experimentally induced careless responding study were utilized.
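The general idea behind stochastic gradient boosting can be sketched in a few lines. The code below is a simplified illustration, not the authors' implementation: it uses one-split trees (stumps) as base learners, fits each stump to the log-loss residuals of a random half of the rows, and uses mean residuals as leaf values. The two features (longest run of identical answers and mean seconds per item), the deliberately well-separated toy data, and all function names are assumptions for this example.

```python
import math
import random

def fit_stump(X, resid, idx):
    """Fit a one-split regression tree to the residuals of the rows in idx."""
    best = None  # (sse, feature, threshold, left_value, right_value)
    for j in range(len(X[0])):
        values = sorted({X[i][j] for i in idx})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between neighbouring values
            left = [resid[i] for i in idx if X[i][j] <= t]
            right = [resid[i] for i in idx if X[i][j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - lv) ** 2 for v in left)
                   + sum((v - rv) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]

def sigmoid(f):
    return 1 / (1 + math.exp(-f))

def fit_gbm(X, y, rounds=50, lr=0.1, subsample=0.5, seed=0):
    """Boost stumps on the gradient of the log loss, using row subsampling."""
    rng = random.Random(seed)
    base = sum(y) / len(y)
    f0 = math.log(base / (1 - base))  # initial score: log-odds of the base rate
    F = [f0] * len(y)
    stumps = []
    for _ in range(rounds):
        resid = [yi - sigmoid(fi) for yi, fi in zip(y, F)]  # negative gradient
        idx = rng.sample(range(len(y)), int(subsample * len(y)))  # "stochastic" step
        j, t, lv, rv = fit_stump(X, resid, idx)
        stumps.append((j, t, lv, rv))
        F = [fi + lr * (lv if x[j] <= t else rv) for fi, x in zip(F, X)]
    return f0, lr, stumps

def predict_proba(model, x):
    f0, lr, stumps = model
    f = f0 + sum(lr * (lv if x[j] <= t else rv) for j, t, lv, rv in stumps)
    return sigmoid(f)

# Hypothetical toy data: [longest run of identical answers, mean seconds per item].
data_rng = random.Random(1)
attentive = [[data_rng.randint(2, 4), data_rng.uniform(4.0, 6.0)] for _ in range(20)]
careless = [[data_rng.randint(10, 12), data_rng.uniform(0.5, 1.5)] for _ in range(20)]
X = attentive + careless
y = [0] * 20 + [1] * 20  # 1 = careless

model = fit_gbm(X, y)
predicted = [int(predict_proba(model, x) > 0.5) for x in X]
print(sum(p == t for p, t in zip(predicted, y)), "of", len(y), "training rows classified correctly")
```

The toy groups are separated by a wide gap, so this sketch says nothing about performance on real survey data, where the study found responses to be more erratic than simulations suggested.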

Main Results:

  • In simulation studies, gradient boosting machines demonstrated superior performance in flagging aberrant responses compared to traditional methods.
  • This performance advantage did not translate to the empirical study; precision was unsatisfactory for both novel and traditional methods.
  • Real-world survey responses appeared more erratic than the simulations had assumed, which reduced detection accuracy.
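Precision, the metric singled out above, is the share of flagged respondents who were actually careless. The short sketch below computes it together with recall. The counts are hypothetical and chosen only to show how precision can be low even when most careless respondents are caught; they are not results from the study.

```python
def precision_recall(flags, truth):
    """Precision and recall of boolean detector flags against known labels."""
    tp = sum(f and t for f, t in zip(flags, truth))          # careless, flagged
    fp = sum(f and not t for f, t in zip(flags, truth))      # attentive, flagged
    fn = sum((not f) and t for f, t in zip(flags, truth))    # careless, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical sample: 10 careless and 90 attentive respondents.
truth = [True] * 10 + [False] * 90
# Hypothetical detector: flags 8 of the careless and 24 of the attentive respondents.
flags = [True] * 8 + [False] * 2 + [True] * 24 + [False] * 66

precision, recall = precision_recall(flags, truth)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

In this made-up case the detector finds 8 of 10 careless respondents, yet three out of four flagged cases are attentive respondents, so removing every flagged case would discard mostly valid data.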

Conclusions:

  • Gradient boosting machines show promise for detecting careless responding in simulations, but they require further validation in real-world settings.
  • Current detection methods, both traditional and novel, exhibit limitations in precision for identifying aberrant response patterns.
  • Future research should focus on improving the generalizability and accuracy of detection methods for real-world survey data.