Preparing for Vascular Surgery Board Certification: A Comparative Study Using Large Language Models

  • Vascular Surgery, Ross University School of Medicine, Miramar, USA.
Cureus

Abstract

Introduction and aim: Large language models (LLMs) are transforming medical education by offering new ways to support teaching and learning. Despite their demonstrated potential, research on their use in vascular surgery is limited. This study aimed to evaluate and compare the effectiveness of LLMs in preparing for vascular surgery board certification exams and to explore their potential as educational supplements.

Methods: We selected 269 text-only multiple-choice questions of 642 from the Vascular Education and Self-Assessment Program (VESAP) version 6. We excluded 143 image-based questions. One independent reviewer entered the questions into the following four AI tools: ChatGPT 3.5 (San Francisco, CA: OpenAI), Google Gemini (London, UK: Google DeepMind), Microsoft Bing (Redmond, WA: Microsoft), and Claude 3.5 (San Francisco, CA: Anthropic Inc.). Each question, with its answer choices, was entered into an incognito window of each AI tool without any additional context. A chi-square test was used to assess whether the percentage of correct answers varied by question difficulty and discipline, with significance set at p<0.05. Data analysis was conducted using Stata 18.5 (StataCorp LLC: College Station, TX).

Results: Claude 3.5 achieved the highest overall accuracy, with 65.7% correct responses, outperforming Google Gemini (55.3%), ChatGPT (55.0%), and Microsoft Bing (53.9%). ChatGPT, Google Gemini, and Microsoft Bing showed no significant variation in accuracy by discipline (p=0.548, p=0.145, and p=0.797, respectively), whereas Claude 3.5 showed significant differences across disciplines (p=0.001), performing best in lower extremity (86%), dialysis access (80%), cerebrovascular (77%), venous lymph (70%), and vascular medicine (68.9%).

Conclusion: Claude 3.5 outperformed the other LLMs in solving Vascular Surgery Qualifying Examination version 6 (VSQE6) questions and shows promise as a supplementary tool in vascular surgery education. The LLMs were strongest on lower extremity vascular issues, dialysis access, and cerebrovascular conditions. At this time, however, LLM capabilities do not fully meet the evolving needs of vascular surgery education. While traditional methods remain essential for vascular surgery education, updated LLMs may provide more substantial benefits in the future.
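
The chi-square comparison described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' analysis, which was run in Stata; the function names and the counts in the example table are hypothetical and chosen only to show the calculation. It assumes a table of correct/incorrect counts per discipline in which every row and column total is non-zero.

```python
import math


def chi_square_statistic(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for an r x c table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column factors.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def chi_square_p_value(statistic, dof):
    """Upper-tail probability of the chi-square distribution,
    using the closed forms available for integer degrees of freedom."""
    y = statistic / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-y) * sum_{i < dof/2} y^i / i!
        terms = sum(y ** i / math.factorial(i) for i in range(dof // 2))
        return math.exp(-y) * terms
    # Odd dof: erfc(sqrt(y)) plus a finite series of half-integer powers.
    series = sum(
        y ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (dof - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(y)) + math.exp(-y) * series


# Hypothetical counts (NOT study data): rows are correct / incorrect answers,
# columns are three made-up disciplines.
hypothetical_counts = [
    [30, 22, 18],  # correct
    [10, 18, 22],  # incorrect
]

statistic, dof = chi_square_statistic(hypothetical_counts)
p_value = chi_square_p_value(statistic, dof)
print(f"chi2 = {statistic:.3f}, dof = {dof}, p = {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

In this framing, a p-value below 0.05 for a given tool would indicate that its proportion of correct answers differs across disciplines, which is the kind of result reported for Claude 3.5.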
