
Updated: May 27, 2026


Evaluating Injection Laryngoplasty Skills Using a Foundation Model: A Feasibility Study.

Alex T Cheng¹, Abdulla Elkhadrawy¹, Sean A Setzen¹

¹Department of Otolaryngology-Head and Neck Surgery, Weill Cornell Medicine, New York, New York, USA.

The Laryngoscope | May 26, 2026

Summary

This summary is machine-generated.


Few-shot prompting with Google Gemini 2.5 Pro successfully assessed surgical skill in simulated injection laryngoplasty, distinguishing expert from trainee performance. Averaging repeat evaluations can mitigate model variability for this promising assessment tool.

Area of Science:

Artificial Intelligence in Medicine
Surgical Skill Assessment
Medical Simulation

Background:

Assessing surgical proficiency is critical for patient safety and training effectiveness.
Objective and reliable methods for evaluating procedural skills are needed.
Multimodal foundation models offer potential for automated performance analysis.

Purpose of the Study:

To evaluate the construct validity of Google Gemini 2.5 Pro for assessing simulated injection laryngoplasty.
To compare zero-shot versus few-shot prompting strategies for skill assessment.
To determine model reliability and stability in performance evaluation.

Main Methods:

Thirty simulated injection laryngoplasty videos were stratified by operator experience (novice, intermediate, expert).
Google Gemini 2.5 Pro evaluated videos using zero-shot and few-shot prompting strategies.
Model performance was compared against operator experience, with reliability assessed via 90 repeated trials.

Keywords:

artificial intelligence, injection laryngoplasty, laryngology, skill assessment, surgical education

Main Results:

Zero-shot prompting failed to discriminate between skill levels (Spearman's ρ = 0.12, p = 0.52).
Few-shot prompting showed strong correlation with experience (Spearman's ρ = 0.66, p = 0.0002) and stratified skill levels.
The few-shot model significantly differentiated experts from both novices and intermediates, with improved precision and reduced error.
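
The rank correlations reported above can be illustrated with a minimal sketch of Spearman's ρ, computed as the Pearson correlation of rank-transformed data with tied values sharing an average rank. The experience coding (0 = novice, 1 = intermediate, 2 = expert), the example scores, and the function names are assumptions made for illustration only; they are not the study's data or code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: experience level per video and a model-assigned score.
experience = [0, 0, 0, 1, 1, 1, 2, 2, 2]
model_scores = [2.0, 3.0, 2.5, 3.0, 3.5, 4.0, 4.5, 4.0, 5.0]

rho = spearman_rho(experience, model_scores)
print(f"Spearman's rho = {rho:.2f}")
```

A value of ρ near 0 (as with zero-shot prompting) means the model's scores carry little information about experience level, while a value closer to 1 (as with few-shot prompting) means higher scores tend to go with more experienced operators. The sketch does not compute a p-value.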

Conclusions:

General-purpose multimodal models require calibration (e.g., few-shot prompting) for surgical judgment.
Few-shot prompting effectively calibrated Gemini 2.5 Pro to distinguish expert from trainee performance.
Model variability necessitates mitigation strategies, such as averaging repeated evaluations, for scalable assessment.
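
Why averaging repeated evaluations helps can be shown with a small simulation. This is a sketch under stated assumptions: the model's score for one video is treated as a fixed underlying value plus independent Gaussian noise, and the score value, noise level, repeat count, and function names are all hypothetical rather than taken from the study.

```python
import random
from statistics import mean, stdev


def averaged_score(score_once, n_repeats):
    """Call a noisy scorer n_repeats times and return the mean score."""
    return mean(score_once() for _ in range(n_repeats))


def spread_of_estimates(score_once, n_repeats, n_simulations=2000):
    """Standard deviation of the averaged score across simulated reruns."""
    return stdev(
        averaged_score(score_once, n_repeats) for _ in range(n_simulations)
    )


rng = random.Random(0)  # fixed seed so the simulation is repeatable
TRUE_SCORE = 3.5        # hypothetical underlying score for one video
NOISE_SD = 0.8          # hypothetical run-to-run variability of the model


def noisy_model_score():
    """Stand-in for one model evaluation of the same video (assumed noise model)."""
    return rng.gauss(TRUE_SCORE, NOISE_SD)


single = spread_of_estimates(noisy_model_score, n_repeats=1)
triple = spread_of_estimates(noisy_model_score, n_repeats=3)
print(f"spread with 1 evaluation:  {single:.2f}")
print(f"spread with 3 evaluations: {triple:.2f}")
```

Under this independence assumption, the spread of the averaged score shrinks roughly in proportion to the square root of the number of repeats, so each additional repeat buys less. Real model outputs may not be independent or Gaussian, in which case the benefit of averaging could be smaller.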