

A Natural-Language-Processing-Based Procedure for Generating Distractors for Multiple-Choice Questions.

Peter Baldwin¹, Janet Mee¹, Victoria Yaneva¹

  • ¹National Board of Medical Examiners, Philadelphia, PA, USA.

Evaluation & the Health Professions | November 10, 2021
Summary
This summary is machine-generated.

This study introduces an automated method for generating multiple-choice test question distractors using natural language processing. The system successfully identified plausible distractors, aiding human item writers in test development.

Keywords:
automatic item generation; item writing; large-scale testing; natural language processing; test development


Area of Science:

  • Medical Education
  • Natural Language Processing
  • Psychometrics

Background:

  • Developing effective multiple-choice questions (MCQs) is challenging, particularly in creating plausible incorrect response options (distractors).
  • Existing item banks represent a valuable resource for generating new assessment items.

Purpose of the Study:

  • To introduce and evaluate a procedure for automatically mining item banks to generate potential distractors for new MCQs.
  • To assess the utility of system-generated distractors for human item writers.

Main Methods:

  • Utilized natural language processing (NLP) to measure semantic similarity between new item stems/answers and existing item bank content.
  • Developed a distractor generation model that mines existing items, so it requires a substantial item pool to draw from.
  • Evaluated system-produced distractors against human-produced distractors using United States Medical Licensing Examination (USMLE) data.
  • Assessed the quality and relevance of system-generated distractors with experienced item writers.
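
The similarity-based mining step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the summary does not say which NLP similarity measure was used, so a plain bag-of-words cosine similarity stands in for it here. The function names, the `stem` and `options` dictionary keys, and the sample bank are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_distractors(new_stem, new_answer, bank, top_k=3):
    """Rank options from similar bank items as candidate distractors.

    Assumption: each bank item is a dict with hypothetical keys
    "stem" (question text) and "options" (all answer options).
    """
    query = Counter(tokenize(new_stem + " " + new_answer))
    scores = {}
    for item in bank:
        sim = cosine(query, Counter(tokenize(item["stem"])))
        for option in item["options"]:
            if option.lower() == new_answer.lower():
                continue  # never propose the correct answer as a distractor
            # An option inherits the best similarity of any item it appears in.
            scores[option] = max(scores.get(option, 0.0), sim)
    # Highest similarity first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [option for option, score in ranked[:top_k] if score > 0]


# Hypothetical two-item bank, for illustration only.
SAMPLE_BANK = [
    {"stem": "A patient has chest pain radiating to the left arm",
     "options": ["Myocardial infarction", "Pericarditis", "Aortic dissection"]},
    {"stem": "A child has a barking cough and stridor",
     "options": ["Croup", "Epiglottitis"]},
]

candidates = rank_distractors(
    "A 60-year-old man has crushing chest pain",
    "Myocardial infarction",
    SAMPLE_BANK,
)
```

A real system would likely replace the bag-of-words vectors with a stronger semantic representation and filter common function words; the ranking logic around it would stay much the same.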

Main Results:

  • For approximately 50% of items, at least one of the top-ranked system-generated distractors matched a human-produced distractor.
  • For about 25% of items, two of the top three system-generated distractors matched human-produced distractors.
  • Item writers rated 81% of system-generated distractors as on-topic and 56% as helpful for distractor development.
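
Match rates of the kind reported above could be computed along these lines. This is a sketch under stated assumptions: the summary does not say how a "match" between a system-generated and a human-produced distractor was determined, so exact string equality after case and whitespace normalization is assumed here, and the function names and data layout are hypothetical.

```python
def normalize(option):
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(option.lower().split())


def top_k_match_rate(system_ranked, human_distractors, k=3, min_matches=1):
    """Share of items where at least `min_matches` of the top-k system
    candidates match a human-written distractor.

    system_ranked: dict mapping item id -> ranked list of candidate strings.
    human_distractors: dict mapping item id -> list of human distractors.
    Assumption: a match is exact equality after normalization.
    """
    if not system_ranked:
        return 0.0
    hits = 0
    for item_id, candidates in system_ranked.items():
        human = {normalize(d) for d in human_distractors.get(item_id, [])}
        top = {normalize(c) for c in candidates[:k]}
        if len(top & human) >= min_matches:
            hits += 1
    return hits / len(system_ranked)
```

With this helper, the "at least one match" figure corresponds to `min_matches=1` and the "two of the top three" figure to `k=3, min_matches=2`.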

Conclusions:

  • Automated distractor generation using NLP is a feasible approach to support MCQ development.
  • The proposed method shows promise in identifying relevant and plausible distractors, assisting item writers in medical education and other fields.
  • Further refinement of NLP techniques could make the creation of high-quality assessment items more efficient and effective.