Estimating classification consistency of machine learning models for screening measures.

Oscar Gonzalez¹, A R Georgeson², William E Pelham³

  • ¹ Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill.

Psychological Assessment | June 3, 2024
Summary
This summary is machine-generated.

This study introduces new quantitative methods to assess classification consistency in machine learning screening models. These methods help ensure reliable diagnostic classifications by addressing sampling and measurement errors.

Area of Science:

  • Psychometrics
  • Machine Learning
  • Health Informatics

Background:

  • Screening measures in psychology and medicine classify individuals into diagnostic categories.
  • High classification consistency is crucial for reliable screening, alongside accuracy.
  • Methods for quantifying the classification consistency of machine learning screening models have been lacking.
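
To make the idea of classification consistency concrete, the following is a minimal Python sketch of two commonly used agreement indices: the raw proportion of identical decisions and Cohen's kappa. This page does not say which indices the study reports, so the choice of indices, the function names, and the example decisions are all assumptions made for illustration.

```python
from collections import Counter

def proportion_agreement(first, second):
    """Share of individuals who receive the same decision on both occasions."""
    if len(first) != len(second) or not first:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(first, second)) / len(first)

def cohens_kappa(first, second):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(first)
    observed = proportion_agreement(first, second)
    counts_first, counts_second = Counter(first), Counter(second)
    expected = sum(
        counts_first[label] * counts_second[label]
        for label in set(first) | set(second)
    ) / (n * n)
    if expected == 1.0:
        return 1.0  # both occasions assign one identical label to everyone
    return (observed - expected) / (1.0 - expected)

# Hypothetical screening decisions (1 = flagged, 0 = not flagged)
occasion_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
occasion_2 = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
agreement = proportion_agreement(occasion_1, occasion_2)
kappa = cohens_kappa(occasion_1, occasion_2)
```

A screener can be accurate on average and still flip individual decisions between occasions, which is why an agreement index of this kind is reported alongside accuracy.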

Purpose of the Study:

  • To fill the gap in methods for estimating the classification consistency of machine learning screening models.
  • To introduce novel quantitative techniques for assessing classification consistency.
  • To guide applied researchers in evaluating machine learning diagnostic assessments.

Main Methods:

  • Utilizes data resampling techniques, including bootstrap and Monte Carlo sampling.
  • Estimates classification inconsistency stemming from sampling error during model fitting.
  • Estimates classification inconsistency arising from measurement error in item responses.
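
The two resampling ideas above can be sketched in Python roughly as follows. This is not the authors' R code: the sum-score cutoff rule standing in for a machine learning model, the simulated data, the function names, and the use of normal noise with a hypothetical standard error of measurement (`sem`) at the score level rather than at the item level are all simplifying assumptions.

```python
import random

def fit_cutoff(scores, labels):
    """Pick the score cutoff that maximizes sensitivity + specificity - 1."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut

def classify(scores, cutoff):
    return [1 if s >= cutoff else 0 for s in scores]

def bootstrap_consistency(scores, labels, n_boot=200, seed=0):
    """Per-person share of bootstrap refits that reproduce the original decision."""
    rng = random.Random(seed)
    n = len(scores)
    original = classify(scores, fit_cutoff(scores, labels))
    same = [0] * n
    done = 0
    while done < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        boot_labels = [labels[i] for i in idx]
        if len(set(boot_labels)) < 2:
            continue  # both classes are needed to refit the cutoff
        cutoff = fit_cutoff([scores[i] for i in idx], boot_labels)
        for i, decision in enumerate(classify(scores, cutoff)):
            same[i] += decision == original[i]
        done += 1
    return [count / n_boot for count in same]

def measurement_error_consistency(scores, cutoff, sem, n_draws=200, seed=0):
    """Per-person share of noisy score replicates that keep the original decision."""
    rng = random.Random(seed)
    result = []
    for s, decision in zip(scores, classify(scores, cutoff)):
        keep = sum((s + rng.gauss(0, sem) >= cutoff) == bool(decision)
                   for _ in range(n_draws))
        result.append(keep / n_draws)
    return result

# Simulated (hypothetical) screening data: 1 = has the condition
data_rng = random.Random(1)
labels = [1 if data_rng.random() < 0.3 else 0 for _ in range(150)]
scores = [round(data_rng.gauss(14 if y else 9, 3)) for y in labels]

cutoff = fit_cutoff(scores, labels)
sampling = bootstrap_consistency(scores, labels)            # sampling error
measurement = measurement_error_consistency(scores, cutoff, sem=1.5)
```

Each returned list holds one value per person between 0 and 1. Values near 1 mean the decision is stable under that source of error, and people whose scores sit close to the cutoff will typically show the lowest values.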

Main Results:

  • Demonstrates methods for quantifying classification consistency in machine learning screening.
  • Illustrates the application of these methods using three empirical examples.
  • Provides R code to facilitate the implementation of the proposed techniques.

Conclusions:

  • Highlights classification consistency as an important complement to accuracy when evaluating screening measures.
  • Offers practical tools for researchers to obtain classification consistency indices.
  • Enhances the reliability assessment of machine learning models in diagnostic screening.