Fine-Tuning, Retrieval-Augmented Generation, and Hybrid Large Language Models for Postoperative Decision Support: A

Srinivasagam Prabha¹, Bernardo Gabriele Collaco¹, Cesar Abraham Gomez-Cabello¹

¹Division of Plastic Surgery, Mayo Clinic in Florida, 4500 San Pablo Rd S, Jacksonville, US.

Journal of Medical Internet Research

June 19, 2026

Summary

This summary is machine-generated.

Knowledge-enhanced large language models (LLMs) significantly improve postoperative decision support accuracy. The hybrid fine-tuning (FT) plus retrieval-augmented generation (RAG) approach demonstrated the highest performance, showing promise for patient education.

Area of Science:

Artificial Intelligence in Medicine
Clinical Decision Support Systems
Natural Language Processing

Background:

Large language models (LLMs) offer potential for clinical decision support but struggle with integrating domain-specific medical knowledge for tasks like postoperative patient education.
Challenges include maintaining accuracy, safety, and interpretability when adapting LLMs for healthcare applications.
Fine-tuning (FT), retrieval-augmented generation (RAG), and hybrid FT+RAG are key strategies for knowledge integration, yet their comparative efficacy in postoperative care has not been evaluated.

Purpose of the Study:

To compare the performance, reliability, and safety of baseline, FT, RAG, and hybrid FT+RAG LLM configurations for postoperative decision support.
To evaluate LLM accuracy in routine and emergency postoperative scenarios.
To assess the impact of knowledge integration strategies on LLM performance in a clinical context.

Main Methods:

A comparative evaluation of four LLM configurations (baseline, FT, RAG, FT+RAG) using Google Gemini 2.5 Flash.
Model adaptation and validation using 600 postoperative question-answer pairs, with final evaluation on 150 queries including routine, emergency, and out-of-scope prompts.
Independent assessment by 3 blinded clinical experts for accuracy, safety, completeness, and relevance, supplemented by automated metrics for readability and hallucination.
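
To make the RAG configuration concrete, the following is a minimal sketch of the general retrieve-then-prompt pattern, not the authors' pipeline. Everything in it is an assumption for illustration: the toy bag-of-words `embed` function stands in for a learned embedding model, the `PASSAGES` list holds hypothetical placeholder text rather than study data, and `build_prompt` only assembles the text that would be sent to a model (no model is called).

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': lowercase token counts (assumption; real systems use learned vectors)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages, k=1):
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical placeholder passages, not taken from the study.
PASSAGES = [
    "Keep the incision clean and dry for the first 48 hours.",
    "Call the clinic if you develop a fever or spreading redness.",
    "Avoid heavy lifting until cleared at the follow-up visit.",
]

print(build_prompt("when can I resume heavy lifting", PASSAGES))
```

In a hybrid FT+RAG setup, a prompt assembled this way would be passed to the fine-tuned model rather than the baseline one; the summary does not describe how the study implemented retrieval.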

Main Results:

All knowledge-enhanced LLMs significantly outperformed the baseline model in overall accuracy (FT: 92.7%, RAG: 91.3%, FT+RAG: 97.3% vs. baseline: 68.0%).
The FT+RAG configuration achieved the highest clinical medical accuracy (96.7%) for in-scope queries and demonstrated superior composite classification performance (100% precision, 96.7% recall, 98.3% F1 score).
Knowledge-enhanced models showed improved safety/refusal accuracy, although the comparison with baseline was influenced by differing safety instructions; readability scores were generally lower because of safety boilerplate in the responses.
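
The reported figures can be checked with simple arithmetic. The sketch below recomputes F1 as the harmonic mean of the stated precision and recall, and shows which integer counts out of 150 would be consistent with the overall accuracy percentages. The counts are an inference, since the summary does not report raw counts, and the helper names are hypothetical.

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def implied_count(accuracy, n_queries):
    """Nearest integer number of correct answers (assumes all n_queries were scored)."""
    return round(accuracy * n_queries)

# FT+RAG composite classification figures as reported in the summary.
print(f"F1 = {f1_score(1.000, 0.967):.1%}")  # prints F1 = 98.3%

# Overall accuracy as reported; 150 is the stated size of the final evaluation set.
reported = {"baseline": 0.680, "FT": 0.927, "RAG": 0.913, "FT+RAG": 0.973}
for name, acc in reported.items():
    k = implied_count(acc, 150)
    print(f"{name}: about {k}/150 = {k / 150:.1%}")
```

Under that assumption, each reported percentage rounds back to itself (for example, 146/150 is 97.3%), so the figures are internally consistent with a 150-query evaluation set.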

Conclusions:

Incorporating domain-specific knowledge via FT, RAG, or FT+RAG substantially improves LLM performance for postoperative decision support compared to baseline models.
The hybrid FT+RAG approach yielded the most favorable outcomes, indicating its potential for enhancing postoperative patient education and decision-making.
Further validation, readability optimization, and robust governance are essential before widespread patient-facing deployment of knowledge-enhanced LLMs in healthcare.