Fine-Tuning, Retrieval-Augmented Generation, and Hybrid Large Language Models for Postoperative Decision Support: A

Srinivasagam Prabha¹, Bernardo Gabriele Collaco¹, Cesar Abraham Gomez-Cabello¹

  • ¹Division of Plastic Surgery, Mayo Clinic in Florida, 4500 San Pablo Rd S, Jacksonville, US.

Journal of Medical Internet Research, June 19, 2026
Summary

This summary is machine-generated.

Knowledge-enhanced large language models (LLMs) significantly improve postoperative decision support accuracy. The hybrid fine-tuning (FT) plus retrieval-augmented generation (RAG) approach demonstrated the highest performance, showing promise for patient education.

Area of Science:

  • Artificial Intelligence in Medicine
  • Clinical Decision Support Systems
  • Natural Language Processing

Background:

  • Large language models (LLMs) show promise for clinical decision support but struggle to integrate domain-specific medical knowledge for tasks such as postoperative patient education.
  • Challenges include maintaining accuracy, safety, and interpretability when adapting LLMs for healthcare applications.
  • Fine-tuning (FT), retrieval-augmented generation (RAG), and hybrid FT+RAG are key strategies for knowledge integration, yet their comparative efficacy in postoperative care has not been evaluated.
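
To make the RAG idea above concrete, here is a minimal sketch of the retrieval step: passages are ranked by similarity to the query, and the best match is prepended to the prompt. Everything in it is an illustrative assumption rather than the study's pipeline: the bag-of-words `embed` function stands in for a real embedding model, and the passages are placeholders, not clinical content.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, passages: list[str], k: int = 1) -> str:
    """Assemble the augmented prompt that would be sent to the (base or fine-tuned) model."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Placeholder knowledge base (hypothetical, not from the study).
PASSAGES = [
    "placeholder passage about wound dressing changes",
    "placeholder passage about pain medication schedule",
    "placeholder passage about follow-up appointment timing",
]

prompt = build_prompt("how often are dressing changes needed", PASSAGES)
print(prompt)
```

In this picture, a hybrid FT+RAG setup would send the same augmented prompt to a fine-tuned model instead of the base model.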

Purpose of the Study:

  • To compare the performance, reliability, and safety of baseline, FT, RAG, and hybrid FT+RAG LLM configurations for postoperative decision support.
  • To evaluate LLM accuracy in routine and emergency postoperative scenarios.
  • To assess the impact of knowledge integration strategies on LLM performance in a clinical context.

Main Methods:

  • A comparative evaluation of four LLM configurations (baseline, FT, RAG, FT+RAG) using Google Gemini 2.5 Flash.
  • Model adaptation and validation using 600 postoperative question-answer pairs, with final evaluation on 150 queries including routine, emergency, and out-of-scope prompts.
  • Independent assessment by 3 blinded clinical experts for accuracy, safety, completeness, and relevance, supplemented by automated metrics for readability and hallucination.
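
The summary does not say how the three experts' ratings were combined into a single score. One common option, shown here purely as an assumption, is a per-query majority vote followed by the fraction of queries voted accurate. The label strings and the function names `majority_label` and `accuracy` are hypothetical.

```python
from collections import Counter

def majority_label(ratings: list[str]) -> str:
    """Return the label chosen by most raters (no ties with 3 raters and 2 labels)."""
    label, _count = Counter(ratings).most_common(1)[0]
    return label

def accuracy(per_query_ratings: list[list[str]]) -> float:
    """Fraction of queries whose majority label is 'accurate'."""
    votes = [majority_label(r) for r in per_query_ratings]
    return sum(v == "accurate" for v in votes) / len(votes)

# Hypothetical ratings for four queries, three raters each.
example = [
    ["accurate", "accurate", "inaccurate"],
    ["inaccurate", "inaccurate", "accurate"],
    ["accurate", "accurate", "accurate"],
    ["accurate", "inaccurate", "accurate"],
]
print(accuracy(example))
```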

Main Results:

  • All knowledge-enhanced LLMs significantly outperformed the baseline model in overall accuracy (FT: 92.7%, RAG: 91.3%, FT+RAG: 97.3% vs. baseline: 68.0%).
  • The FT+RAG configuration achieved the highest clinical medical accuracy (96.7%) for in-scope queries and demonstrated superior composite classification performance (100% precision, 96.7% recall, 98.3% F1 score).
  • Knowledge-enhanced models showed improved safety/refusal accuracy, although comparisons with the baseline were influenced by differing safety instructions. Readability was generally lower because of safety boilerplate in the responses.
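
The reported figures can be cross-checked with a little arithmetic. The sketch below recomputes the F1 score from the stated precision and recall using the standard harmonic-mean formula. It also derives the number of correct answers each overall accuracy would imply if it was computed over all 150 evaluation queries. That denominator is an assumption, since the summary does not state it.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# FT+RAG composite classification figures from the summary.
f1 = f1_score(1.0, 0.967)
print(round(f1 * 100, 1))

# Overall accuracies from the summary, in percent.
REPORTED = {"baseline": 68.0, "FT": 92.7, "RAG": 91.3, "FT+RAG": 97.3}
N_QUERIES = 150  # assumption: overall accuracy is over all 150 evaluation queries

def implied_correct(pct: float, n: int = N_QUERIES) -> int:
    """Whole number of correct answers closest to the reported percentage."""
    return round(pct / 100 * n)

implied = {name: implied_correct(pct) for name, pct in REPORTED.items()}
print(implied)
```

Under that assumption, each reported percentage corresponds to a whole number of correct answers out of 150, which is consistent with the figures being computed on the full evaluation set.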

Conclusions:

  • Incorporating domain-specific knowledge via FT, RAG, or FT+RAG substantially improves LLM performance for postoperative decision support compared to baseline models.
  • The hybrid FT+RAG approach yielded the most favorable outcomes, indicating its potential for enhancing postoperative patient education and decision-making.
  • Further validation, readability optimization, and robust governance are essential before widespread patient-facing deployment of knowledge-enhanced LLMs in healthcare.