Classification accuracy and consistency of computerized adaptive testing.

Ying Cheng¹, Deanna L Morgan

¹University of Notre Dame, Notre Dame, IN, USA. ycheng4@nd.edu

Behavior Research Methods

September 8, 2012

Summary

This summary is machine-generated.

The Maximum Priority Index (MPI) method performs well in computerized adaptive testing: it classifies examinees accurately while managing constraints and producing parallel test forms. Its performance improves with test length.

Area of Science:

Psychometrics
Educational Measurement
Computerized Adaptive Testing (CAT)

Background:

Accurate and consistent examinee classification is crucial in educational and psychological assessments.
Computerized adaptive testing (CAT) offers efficient and precise measurement but relies heavily on effective item selection strategies.
Existing item selection methods face challenges in managing constraints and ensuring test parallelism.

Purpose of the Study:

To evaluate and compare four distinct item selection methods in CAT.
To assess methods based on classification accuracy, consistency, and constraint management.
To identify the most effective method for generating parallel test forms and ensuring classification consistency.

Main Methods:

Comparison of Maximum Priority Index (MPI), weighted deviation modeling, maximum Fisher information, and randomized item selection.
Evaluation of classification accuracy and consistency across different test lengths and parameters.
Analysis of constraint management capabilities, including test overlap and content coverage.
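Of the four methods compared, maximum Fisher information is the simplest to illustrate. The sketch below is not the study's implementation. It assumes a two-parameter logistic (2PL) item response model, which this summary does not specify, and the item bank values and function names are hypothetical. At each step it picks the unadministered item with the largest information at the current ability estimate, ignoring content and exposure constraints.

```python
import math


def prob_correct(theta, a, b):
    """2PL probability of a correct response (assumed model, not from the source)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta, a, b):
    """Item information under the 2PL model: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def select_max_info_item(theta, item_bank, administered):
    """Return the index of the unused item with the largest information at theta."""
    best_index, best_info = None, -1.0
    for index, (a, b) in enumerate(item_bank):
        if index in administered:
            continue  # each item is given at most once per examinee
        info = fisher_information(theta, a, b)
        if info > best_info:
            best_index, best_info = index, info
    return best_index


# Hypothetical item bank: (discrimination a, difficulty b) pairs.
ITEM_BANK = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0), (1.2, 0.2)]

first_item = select_max_info_item(0.0, ITEM_BANK, set())
second_item = select_max_info_item(0.0, ITEM_BANK, {first_item})
```

Because this rule considers only information, it tends to reuse the same highly discriminating items across examinees, which is the kind of test overlap that constraint-managing methods such as MPI are meant to limit.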

Main Results:

The MPI method effectively manages constraints and minimizes test overlap.
MPI is the only method capable of producing parallel forms with consistent content coverage.
MPI demonstrates strong examinee classification accuracy and consistency, even with short tests (12 items).
Performance of MPI improves with longer tests.
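The two outcome measures named above can be illustrated with a small sketch. It assumes common simulation-study definitions, which may differ from the exact ones used in the study: accuracy is the proportion of examinees whose classification from the estimated ability matches the classification from the true ability, and consistency is the proportion classified the same way on two parallel administrations. The single cut score, the ability values, and the function names are hypothetical.

```python
def classify(theta, cut_score):
    """Return True (for example, 'pass') when ability is at or above the cut score."""
    return theta >= cut_score


def agreement_rate(first, second, cut_score):
    """Proportion of examinees placed in the same category by both ability lists."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(
        classify(x, cut_score) == classify(y, cut_score)
        for x, y in zip(first, second)
    )
    return matches / len(first)


def classification_accuracy(true_thetas, estimated_thetas, cut_score):
    """Agreement between true-ability and estimated-ability classifications."""
    return agreement_rate(true_thetas, estimated_thetas, cut_score)


def classification_consistency(form_a_thetas, form_b_thetas, cut_score):
    """Agreement between classifications from two parallel administrations."""
    return agreement_rate(form_a_thetas, form_b_thetas, cut_score)


# Hypothetical abilities for five simulated examinees (illustrative values only).
TRUE_THETAS = [-1.2, -0.3, 0.1, 0.8, 1.5]
FORM_A = [-1.0, 0.2, 0.3, 0.6, 1.4]
FORM_B = [-1.1, -0.1, -0.2, 0.9, 1.6]

accuracy = classification_accuracy(TRUE_THETAS, FORM_A, cut_score=0.0)
consistency = classification_consistency(FORM_A, FORM_B, cut_score=0.0)
```

Accuracy can only be computed where the true ability is known, as in a simulation, whereas consistency needs only two sets of estimates.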

Conclusions:

The MPI method is highly recommended for computerized adaptive testing due to its superior performance in managing constraints and ensuring classification consistency.
MPI's ability to create parallel forms makes it uniquely suitable for applications requiring consistent measurement across different test versions.
Further research should explore the impact of decision categories and cut score locations on MPI's effectiveness.