Related Experiment Video

Updated: Mar 27, 2026

Augmenting Large Language Models via Vector Embeddings to Improve Domain-Specific Responsiveness

Published on: December 6, 2024

1.3K

Improving Retrieval Augmented Generation for Health Care by Fine-Tuning Clinical Embedding Models: Development and

Kamyar Arzideh1,2, Henning Schäfer1,3, Ahmad Idrissi-Yaghir1,4

  • 1Institute for Artificial Intelligence in Medicine, University Hospital Essen, Essen, Germany.

Journal of Medical Internet Research | March 25, 2026
PubMed
Summary
This summary is machine-generated.

Domain-specific embedding models were developed using real-world clinical data to enhance medical information retrieval (IR) and Retrieval Augmented Generation (RAG) systems. These models improve context retrieval accuracy in healthcare settings, outperforming existing general-purpose models.

Keywords:
LLM, NLP, RAG, Retrieval Augmented Generation, information retrieval, large language models, natural language processing

More Related Videos

Evidence-based Knowledge Synthesis and Hypothesis Validation: Navigating Biomedical Knowledge Bases via Explainable AI and Agentic Systems

Published on: June 13, 2025

1.8K


Area of Science:

  • Medical Informatics
  • Natural Language Processing
  • Artificial Intelligence in Healthcare

Background:

  • Existing embedding models for Retrieval Augmented Generation (RAG) are primarily trained on English data, limiting their use in non-English healthcare settings.
  • These models often lack training on real-world clinical documents, leading to inaccurate context retrieval in specialized medical settings.
  • Domain-specific terminology, abbreviations, and nuanced language in clinical documents pose challenges for general embedding models.
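
To make the retrieval step concrete, here is a minimal sketch of how a RAG system might rank documents against a query by embedding similarity. It is an illustration only, not the study's pipeline: the three-dimensional vectors, the document ids, and the helper names `cosine_similarity` and `retrieve_top_k` are hypothetical, and a real system would obtain its vectors from an embedding model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_top_k(query_vec, doc_vecs, k=3):
    """Return the k (doc_id, score) pairs most similar to the query, best first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings standing in for encoded clinical notes.
toy_docs = {
    "note_a": [0.9, 0.1, 0.0],
    "note_b": [0.1, 0.9, 0.0],
    "note_c": [0.0, 0.2, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

top_two = retrieve_top_k(toy_query, toy_docs, k=2)
```

If the embedding model places a clinical abbreviation far from its expansion, the correct note scores low in this ranking and never reaches the language model, which is the failure mode the bullets above describe.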

Purpose of the Study:

  • To develop and validate embedding models specifically trained on real-world clinical documents.
  • To improve medical information retrieval (IR) and RAG system performance in both German and English contexts.
  • To address limitations of general embedding models in specialized healthcare documentation.

Main Methods:

  • Fine-tuned sentence transformers using the multilingual-e5-large architecture.
  • Generated ~11 million synthetic question-answer pairs from 400,000 clinical documents.
  • Used the SauerkrautLM-SOLAR-Instruct LLM to generate the question-answer pairs, then translated the data into English.
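
The summary does not state which training objective was used. As an assumption, the sketch below shows one common choice for fine-tuning an embedding model on question-answer pairs: an in-batch contrastive loss, where each question should score highest against its own answer passage and the other passages in the batch act as negatives. The function names, the toy vectors, and the temperature of 0.05 are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def in_batch_contrastive_loss(question_vecs, passage_vecs, temperature=0.05):
    """Mean cross-entropy where passage i is the positive for question i.

    All other passages in the batch serve as negatives. The temperature
    value is an illustrative assumption, not a figure from the study.
    """
    losses = []
    for i, q in enumerate(question_vecs):
        logits = [cosine(q, p) / temperature for p in passage_vecs]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy batch: each question vector points the same way as its own passage.
questions = [[1.0, 0.0], [0.0, 1.0]]
passages = [[1.0, 0.0], [0.0, 1.0]]

aligned_loss = in_batch_contrastive_loss(questions, passages)
# Swapping the passages pairs each question with the wrong answer.
shuffled_loss = in_batch_contrastive_loss(questions, passages[::-1])
```

The loss is near zero when questions and their passages already align and large when they do not, so minimizing it pulls matching pairs together in the embedding space.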

Main Results:

  • The fine-tuned model achieved an mAP@100 of 0.27 in IR tasks, outperforming the baselines (multilingual-e5-large: 0.14; bge-m3: 0.11).
  • In RAG evaluation, performance was comparable to the baselines in patient-centered scenarios and moderately better in cross-patient settings.
  • Models trained on pseudonymized data showed strong retrieval performance and high contextual precision.
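
For readers unfamiliar with the metric, the following sketch shows one common way to compute mean average precision at a cutoff k (mAP@k). It assumes binary relevance, unique document ids in each ranking, and normalization by min(number of relevant documents, k); the study's exact variant is not given in this summary, and the query and document ids are hypothetical.

```python
def average_precision_at_k(ranked_ids, relevant_ids, k=100):
    """Average precision over the top-k results for a single query."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant:
            hits += 1
            # Precision at this rank, counted only where a relevant doc appears.
            precision_sum += hits / rank
    return precision_sum / min(len(relevant), k)


def mean_average_precision_at_k(rankings, relevance, k=100):
    """Mean of per-query average precision; both arguments are keyed by query id."""
    scores = [
        average_precision_at_k(rankings[q], relevance.get(q, ()), k)
        for q in rankings
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy example with two queries.
toy_rankings = {
    "q1": ["d3", "d1", "d2"],  # the one relevant doc is found at rank 2
    "q2": ["d5", "d4"],        # one of two relevant docs is found at rank 1
}
toy_relevance = {"q1": {"d1"}, "q2": {"d5", "d9"}}

toy_map = mean_average_precision_at_k(toy_rankings, toy_relevance, k=100)
```

Because the metric rewards placing relevant documents near the top of the ranking, a higher value means correct context is found earlier and more completely within the top 100 results.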

Conclusions:

  • Developed and validated domain-specific embedding models using real-world clinical data and LLM-generated synthetic data.
  • These models enhance medical IR and RAG applications, particularly in specialized healthcare contexts.
  • Published models offer a reproducible framework for improving medical data retrieval in diverse healthcare institutions.