Edoardo Leo1, Francesco Baglivo1, Federico Starace1

  • 1Dipartimento di Ricerca Traslazionale e delle Nuove Tecnologie in Medicina e Chirurgia, Università di Pisa.

Recenti Progressi in Medicina
|October 2, 2025
Summary
This summary is machine-generated.

Retrieval Augmented Generation (RAG) enhanced large language model (LLM) accuracy on Sleep Medicine certification questions. This study demonstrates RAG

Area of Science:

  • Medical Informatics
  • Artificial Intelligence in Medicine
  • Sleep Medicine

Background:

  • Large language models (LLMs) show promise in medical education.
  • Evaluating LLM performance in specialized domains like Sleep Medicine is crucial.
  • Current LLM accuracy may be insufficient for high-stakes medical certification.

Purpose of the Study:

  • To assess the performance of four LLMs on Sleep Medicine certification questions.
  • To compare baseline LLM performance with Retrieval Augmented Generation (RAG) enhanced performance.
  • To evaluate the impact of RAG on LLM reliability in a specialized medical context.

Main Methods:

  • Used Sleep Medicine guidelines and a textbook as the knowledge base.
  • Evaluated four LLMs: Llama 3.2 3B, Llama 3.3 70B, GPT 4o mini, and Gemini 2.0 Flash.

  • Compared baseline performance against RAG-enhanced performance on AIMS certification questions.
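
The baseline-versus-RAG comparison described in the methods above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the two-passage knowledge base, the word-overlap retriever, the `toy_model` stand-in for an LLM call, and the two placeholder questions are all hypothetical and carry no medical content.

```python
from collections import Counter

# Hypothetical passages standing in for guideline and textbook text.
KNOWLEDGE_BASE = [
    "Passage A: the toy condition alpha is scored with rule one.",
    "Passage B: the toy condition beta is scored with rule two.",
]

# Hypothetical multiple-choice items (placeholders, not real exam questions).
QUESTIONS = [
    {"question": "Which rule scores toy condition alpha?",
     "options": {"A": "rule two", "B": "rule one"}, "answer": "B"},
    {"question": "Which rule scores toy condition beta?",
     "options": {"A": "rule two", "B": "rule one"}, "answer": "A"},
]


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,:?").lower() for word in text.split()]


def retrieve(question, passages, k=1):
    """Return the k passages sharing the most words with the question.

    Word overlap is an assumed stand-in for a real retriever.
    """
    query = Counter(tokenize(question))

    def overlap(passage):
        return sum((query & Counter(tokenize(passage))).values())

    return sorted(passages, key=overlap, reverse=True)[:k]


def toy_model(question, options, context):
    """Placeholder for an LLM call.

    It picks the option whose text appears in the retrieved context and
    falls back to guessing "A" when no context supports any option.
    """
    for label, text in options.items():
        if any(text in passage for passage in context):
            return label
    return "A"


def accuracy(model, items, use_rag):
    """Percentage of items answered correctly, with or without retrieval."""
    correct = 0
    for item in items:
        context = retrieve(item["question"], KNOWLEDGE_BASE) if use_rag else []
        if model(item["question"], item["options"], context) == item["answer"]:
            correct += 1
    return 100.0 * correct / len(items)


baseline = accuracy(toy_model, QUESTIONS, use_rag=False)
with_rag = accuracy(toy_model, QUESTIONS, use_rag=True)
gain = with_rag - baseline  # difference in accuracy points
print(f"baseline={baseline:.1f} rag={with_rag:.1f} gain={gain:+.1f} points")
```

The numbers this toy run prints only demonstrate the bookkeeping for an accuracy-point gain; they say nothing about the models evaluated in the study.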

Main Results:

  • RAG significantly improved accuracy across all tested LLMs.
  • Llama 3.2 showed a +9.6 point increase in accuracy with RAG.
  • Gemini 2.0 demonstrated a +4.0 point increase in accuracy with RAG.

Conclusions:

  • RAG is effective in enhancing LLM accuracy for specialized medical knowledge.
  • LLM performance in Sleep Medicine certification can be improved using RAG.
  • RAG integration is key to increasing LLM reliability in medical domains.