Fine-Tuned Large Language Models for Generating Multiple-Choice Questions in Anesthesiology: Psychometric Comparison

Carlos Ramon Hölzing¹, Charlotte Meynhardt¹, Patrick Meybohm¹

¹Department of Anaesthesiology, Intensive Care, Emergency and Pain Medicine, University Hospital Würzburg, Oberdürrbacher Str. 6, Würzburg, 97080, Germany.

JMIR Formative Research

February 18, 2026

Summary

This summary is machine-generated.

Fine-tuned large language models (LLMs) can create multiple-choice questions (MCQs) in anesthesiology with similar psychometric properties to those written by faculty experts. Automated item generation can complement, not replace, traditional methods for developing high-quality medical education assessments.

Keywords:

anesthesiology; artificial intelligence; assessment; fine-tuning; large language models; medical education; multiple-choice questions; psychometrics

Area of Science:

Medical Education
Artificial Intelligence in Assessment
Psychometrics

Background:

Multiple-choice questions (MCQs) are crucial for standardized medical assessment.
Developing high-quality MCQs requires subject expertise and rigorous methodology.
Large language models (LLMs) present opportunities for automated MCQ generation, but evaluations are limited.

Purpose of the Study:

To assess whether a fine-tuned LLM can generate anesthesiology MCQs with psychometric properties comparable to those of faculty-written items.

Main Methods:

A GPT-4 model was fine-tuned on anesthesiology materials.
The model generated 15 MCQs, which were analyzed alongside 15 faculty-written MCQs.
Item analysis followed psychometric standards, comparing difficulty, point-biserial correlation, and discrimination index.
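The three item statistics named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes dichotomous (0/1) item scores, an uncorrected total score that includes the item itself, and the conventional upper/lower 27% groups for the discrimination index. The function names and the `responses` matrix are hypothetical and for illustration only.

```python
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of examinees who answered the item correctly (0/1 scores)."""
    return mean(item_scores)


def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item score and the total test score."""
    p = mean(item_scores)
    sd = pstdev(total_scores)
    if p in (0, 1) or sd == 0:
        return float("nan")  # undefined when there is no variance
    mean_correct = mean(t for s, t in zip(item_scores, total_scores) if s == 1)
    mean_wrong = mean(t for s, t in zip(item_scores, total_scores) if s == 0)
    return (mean_correct - mean_wrong) / sd * (p * (1 - p)) ** 0.5


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Item difficulty in the top-scoring group minus the bottom-scoring group."""
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    k = max(1, round(len(order) * fraction))  # group size (assumed 27% rule)
    low = [item_scores[i] for i in order[:k]]
    high = [item_scores[i] for i in order[-k:]]
    return mean(high) - mean(low)


# Hypothetical response matrix: rows are examinees, columns are items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
totals = [sum(row) for row in responses]
first_item = [row[0] for row in responses]

print(item_difficulty(first_item))
print(point_biserial(first_item, totals))
print(discrimination_index(first_item, totals))
```

Higher difficulty values mean easier items, since the statistic is the proportion correct. With small samples, ties in the total score affect which examinees fall into the upper and lower groups, so the discrimination index is only a rough indicator there.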

Main Results:

No significant differences were found in difficulty, point-biserial correlation, or discrimination index between LLM-generated and faculty-written MCQs.
Both sets of MCQs demonstrated modest overall psychometric quality.
LLM-generated items (mean difficulty 0.79, point-biserial 0.17, discrimination 0.08) were comparable to expert items (mean difficulty 0.81, point-biserial 0.19, discrimination 0.09).
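The summary does not state which statistical test was used to compare the two item sets. As one possible approach, a two-sided permutation test on the difference in means can be sketched as follows. The function name, the resampling settings, and the per-item values are hypothetical and are not data from the study.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)  # seeded for reproducibility
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # randomly reassign items to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up per-item discrimination indices, for illustration only.
llm_items = [0.05, 0.12, 0.02, 0.10, 0.08]
faculty_items = [0.07, 0.11, 0.04, 0.13, 0.09]

print(permutation_p_value(llm_items, faculty_items))
```

A large p-value from such a test only indicates that no difference was detected. With 15 items per group, as in the study, a comparison of this kind has limited power to detect small differences.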

Conclusions:

Supervised fine-tuned LLMs can produce MCQs with psychometric quality similar to that of items written by expert faculty.
Automated item generation should supplement, not replace, manual MCQ development.
Further research is needed for generalizability and optimizing LLM integration in assessment.