Estimating classification consistency of machine learning models for screening measures.

Oscar Gonzalez¹, A R Georgeson², William E Pelham³

¹Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill.

Psychological Assessment

June 3, 2024

Summary

This summary is machine-generated.

This study introduces new quantitative methods to assess classification consistency in machine learning screening models. These methods help ensure reliable diagnostic classifications by addressing sampling and measurement errors.

Area of Science:

Psychometrics
Machine Learning
Health Informatics

Background:

Screening measures in psychology and medicine classify individuals for diagnoses.
High classification consistency is crucial for reliable screening, alongside accuracy.
Methods to quantify the classification consistency of machine learning models have been lacking.

Purpose of the Study:

To address the gap in methods for estimating classification consistency in machine learning screening models.
To introduce novel quantitative techniques for assessing classification consistency.
To guide applied researchers in evaluating machine learning diagnostic assessments.

Main Methods:

Utilizes data resampling techniques, including bootstrap and Monte Carlo sampling.
Estimates classification inconsistency stemming from sampling error during model fitting.
Estimates classification inconsistency arising from measurement error in item responses.
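
The two sources of inconsistency listed above can be illustrated with a minimal Python sketch. This is not the authors' R code or their exact procedure. The toy "model" (a single score cutoff chosen to maximize training accuracy), the synthetic data, the normal error model, and the names `fit_cutoff`, `bootstrap_consistency`, `measurement_error_consistency`, and `sem` are all assumptions made for illustration only.

```python
import random


def fit_cutoff(scores, labels):
    """Toy stand-in for a fitted model: the cutoff with the best training accuracy."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(scores)):
        hits = sum((s >= cut) == bool(y) for s, y in zip(scores, labels))
        acc = hits / len(scores)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut


def bootstrap_consistency(scores, labels, n_boot=200, seed=0):
    """Sampling error: refit on bootstrap resamples and count how often each
    person receives the same classification as under the original fit."""
    rng = random.Random(seed)
    n = len(scores)
    base_cut = fit_cutoff(scores, labels)
    base_class = [s >= base_cut for s in scores]
    agree = [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        cut = fit_cutoff([scores[i] for i in idx], [labels[i] for i in idx])
        for j, s in enumerate(scores):
            agree[j] += (s >= cut) == base_class[j]
    per_person = [a / n_boot for a in agree]
    return per_person, sum(per_person) / n


def measurement_error_consistency(scores, cut, sem, n_draws=200, seed=0):
    """Measurement error: redraw each score with normal noise (assumed standard
    error of measurement `sem`) and count how often the classification is unchanged."""
    rng = random.Random(seed)
    per_person = []
    for s in scores:
        base = s >= cut
        same = sum((rng.gauss(s, sem) >= cut) == base for _ in range(n_draws))
        per_person.append(same / n_draws)
    return per_person, sum(per_person) / len(scores)


# Synthetic screening data (hypothetical): about 30% true cases, higher scores for cases.
data_rng = random.Random(1)
labels = [1 if data_rng.random() < 0.3 else 0 for _ in range(200)]
scores = [round(data_rng.gauss(14 if y else 8, 3)) for y in labels]

boot_person, boot_overall = bootstrap_consistency(scores, labels)
cut = fit_cutoff(scores, labels)
me_person, me_overall = measurement_error_consistency(scores, cut, sem=1.5)
print("consistency under sampling error:", boot_overall)
print("consistency under measurement error:", me_overall)
```

In this sketch, the per-person values show which individuals are most likely to be reclassified, typically those scoring near the cutoff, while the overall value is their average across the sample.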

Main Results:

Demonstrates methods for quantifying classification consistency in machine learning screening.
Illustrates the application of these methods using three empirical examples.
Provides R code to facilitate the implementation of the proposed techniques.

Conclusions:

Highlights the importance of classification consistency in screening measures, complementing accuracy.
Offers practical tools for researchers to obtain classification consistency indices.
Enhances the reliability assessment of machine learning models in diagnostic screening.