Using ROC Analysis to Refine Cut Scores Following a Standard Setting Process.

Dongwei Wang¹, Lisa A Keller¹

  • ¹University of Massachusetts Amherst, USA.

Educational and Psychological Measurement | November 18, 2024
Summary
This summary is machine-generated.

Optimizing educational assessment cut scores involves considering sample distribution, prevalence, and cost ratios. Adjusting cut scores based on these factors improves classification accuracy, especially in low-prevalence scenarios.

Keywords:
ROC analysis, cut score, refine, standard setting

Area of Science:

  • Educational Measurement
  • Psychometrics
  • Statistical Analysis

Background:

  • Standard setting relies on the judgments of subject matter experts to define cut scores in educational assessment.
  • Refining cut scores requires statistical and theoretical evidence for improved classification accuracy.

Purpose of the Study:

  • Investigate the impact of sample distribution, prevalence, and cost ratio on classification accuracy.
  • Provide statistical evidence for refining cut scores in educational assessments.
  • Examine how receiver operating characteristic (ROC) analysis can inform cut score adjustments.

Main Methods:

  • Simulated responses to 40 items under four sample distributions.
  • Manipulated prevalence of positive events and cost ratios (false negatives vs. false positives).
  • Utilized receiver operating characteristic (ROC) analysis and the Youden Index (J) to identify optimal cut scores.
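
As an illustration of the last step, the sketch below scans candidate cut scores and keeps the one that maximizes the Youden Index, J = sensitivity + specificity - 1. The function name, the toy scores, and the proficiency labels are hypothetical and are not taken from the study; the code shows the general technique, not the authors' simulation.

```python
def youden_optimal_cut(scores, is_proficient):
    """Return (cut, J) for the cut score that maximizes Youden's J.

    An examinee is classified as proficient when score >= cut.
    J = sensitivity + specificity - 1.
    """
    n_pos = sum(is_proficient)
    n_neg = len(is_proficient) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")

    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(scores)):
        # True positives: proficient examinees at or above the cut.
        tp = sum(1 for s, y in zip(scores, is_proficient) if y and s >= cut)
        # True negatives: non-proficient examinees below the cut.
        tn = sum(1 for s, y in zip(scores, is_proficient) if not y and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # ties keep the lowest cut score
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical toy data: raw scores on a 40-item test and true proficiency labels.
scores = [12, 15, 18, 20, 22, 24, 25, 27, 30, 33]
is_proficient = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
cut, j = youden_optimal_cut(scores, is_proficient)
print(cut, round(j, 2))
```

Because J weights sensitivity and specificity equally, it does not account for prevalence or for unequal misclassification costs, which is why those factors are manipulated separately.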

Main Results:

  • Optimal cut scores shift towards the mode of the proficiency distribution.
  • Cut score adjustments are influenced by prevalence and cost ratio.
  • Raising the cut score improves classification when the positive event has low prevalence; lowering it improves classification when prevalence is high.
  • Higher cost ratios (false negatives costlier relative to false positives) lead to lower optimal cut scores.
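
One common way to see why prevalence and cost ratio move the optimal cut score is to minimize an expected misclassification cost rather than maximize J. The sketch below is an assumed formulation for illustration only: the two normal score distributions, the function name, and the parameter values are hypothetical, and the study's exact criterion may differ.

```python
from statistics import NormalDist

# Assumed score distributions on a 0-40 scale (hypothetical, for illustration only).
NOT_PROFICIENT = NormalDist(mu=18, sigma=4)
PROFICIENT = NormalDist(mu=26, sigma=4)


def min_cost_cut(prevalence, cost_ratio, cuts=range(0, 41)):
    """Return the cut score with the lowest expected misclassification cost.

    prevalence: proportion of truly proficient examinees (the positive event).
    cost_ratio: cost of a false negative divided by the cost of a false positive.
    Examinees scoring at or above the cut are classified as proficient.
    """
    def expected_cost(cut):
        false_negative_rate = PROFICIENT.cdf(cut)          # proficient, below the cut
        false_positive_rate = 1 - NOT_PROFICIENT.cdf(cut)  # not proficient, at or above the cut
        return (prevalence * cost_ratio * false_negative_rate
                + (1 - prevalence) * false_positive_rate)

    return min(cuts, key=expected_cost)


base = min_cost_cut(prevalence=0.5, cost_ratio=1)
low_prev = min_cost_cut(prevalence=0.1, cost_ratio=1)
high_prev = min_cost_cut(prevalence=0.9, cost_ratio=1)
high_cost = min_cost_cut(prevalence=0.5, cost_ratio=4)
print(base, low_prev, high_prev, high_cost)
```

Under these assumptions the minimum-cost cut score rises when the positive event is rare, falls when it is common, and falls when false negatives are weighted more heavily, matching the direction of the results listed above.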

Conclusions:

  • Cut score refinement is essential for accurate educational assessment.
  • Statistical evidence supports adjusting cut scores based on prevalence and cost ratios.
  • Findings offer guidance for policy decisions regarding cut score optimization.