Updated: Mar 27, 2026

Improving Retrieval Augmented Generation for Health Care by Fine-Tuning Clinical Embedding Models: Development and

Kamyar Arzideh¹,², Henning Schäfer¹,³, Ahmad Idrissi-Yaghir¹,⁴

¹Institute for Artificial Intelligence in Medicine, University Hospital Essen, Essen, Germany.

Journal of Medical Internet Research

March 25, 2026

Summary

This summary is machine-generated.

Domain-specific embedding models were developed using real-world clinical data to enhance medical information retrieval (IR) and Retrieval Augmented Generation (RAG) systems. These models improve context retrieval accuracy in healthcare settings, outperforming existing general-purpose models.

Keywords:

LLM, NLP, RAG, Retrieval Augmented Generation, information retrieval, large language models, natural language processing

Area of Science:

Medical Informatics
Natural Language Processing
Artificial Intelligence in Healthcare

Background:

Existing embedding models for Retrieval Augmented Generation (RAG) are primarily trained on English data, which limits their use in non-English healthcare settings.
These models often lack training on real-world clinical documents, leading to inaccurate context retrieval in specialized medical settings.
Domain-specific terminology, abbreviations, and nuanced language in clinical documents pose challenges for general embedding models.
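
The retrieval step that these limitations affect can be illustrated with a minimal sketch: documents and the query are mapped to vectors, and the documents closest to the query by cosine similarity are passed to the LLM as context. The document names, the 3-dimensional vectors, and the function names below are hypothetical; a real embedding model produces vectors with far more dimensions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy embeddings, for illustration only.
doc_vecs = {
    "discharge_letter": [0.9, 0.1, 0.0],
    "lab_report": [0.1, 0.9, 0.1],
    "radiology_note": [0.7, 0.2, 0.6],
}
query_vec = [1.0, 0.0, 0.1]

top = retrieve(query_vec, doc_vecs, k=2)
print(top)  # ['discharge_letter', 'radiology_note']
```

If the embedding model places a clinical abbreviation far from its expanded form, the relevant document never reaches the top of this ranking, which is the failure mode the background describes.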

Purpose of the Study:

To develop and validate embedding models specifically trained on real-world clinical documents.
To improve medical information retrieval (IR) and RAG system performance in both German and English contexts.
To address limitations of general embedding models in specialized healthcare documentation.

Main Methods:

Fine-tuned sentence transformers based on the multilingual-e5-large model.
Generated ~11 million synthetic question-answer pairs from 400,000 clinical documents.
Used the SauerkrautLM-SOLAR-Instruct LLM to generate the question-answer pairs, then translated the data to English.
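
The summary does not state the training objective. One common way to fine-tune an embedding model on question-answer pairs is a contrastive loss with in-batch negatives, and the sketch below illustrates that idea on toy vectors. It is an assumption for illustration, not the authors' confirmed method. The `query: ` and `passage: ` prefixes follow the input convention documented for E5-family models; the function names, the temperature value, and the toy vectors are hypothetical.

```python
import math

def format_pair(question, passage):
    """Prefix both texts the way E5-family models expect their inputs."""
    return "query: " + question, "passage: " + passage

def in_batch_contrastive_loss(q_vecs, p_vecs, temperature=0.05):
    """Mean cross-entropy where passage i is the positive for question i
    and every other passage in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(q_vecs):
        # Scaled dot-product score of question i against every passage.
        scores = [sum(a * b for a, b in zip(q, p)) / temperature for p in p_vecs]
        peak = max(scores)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[i]
    return total / len(q_vecs)

# Toy unit vectors standing in for model outputs (hypothetical).
questions = [[1.0, 0.0], [0.0, 1.0]]
aligned_passages = [[1.0, 0.0], [0.0, 1.0]]   # each question matches its own passage
swapped_passages = [[0.0, 1.0], [1.0, 0.0]]   # each question matches the wrong passage

aligned_loss = in_batch_contrastive_loss(questions, aligned_passages)
swapped_loss = in_batch_contrastive_loss(questions, swapped_passages)
print(aligned_loss < swapped_loss)  # True
```

Training lowers this loss by pulling each question toward its own passage and away from the others in the batch, which is why a large pool of generated pairs is useful.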

Main Results:

The fine-tuned model achieved a mAP@100 of 0.27 in IR tasks, outperforming baselines (multilingual-e5-large: 0.14, bge-m3: 0.11).
Demonstrated robust RAG performance: comparable to baselines in patient-centered scenarios, with moderate improvements in cross-patient settings.
Models trained on pseudonymized data showed strong retrieval performance and high contextual precision.
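
For readers unfamiliar with the metric, mAP@100 averages, over all queries, the precision observed at each rank (up to 100) where a relevant document appears. The sketch below shows one common formulation. The summary does not say which normalization the authors used, so dividing by `min(k, number of relevant documents)` is an assumption, and the document ids are hypothetical.

```python
def average_precision_at_k(ranked_ids, relevant_ids, k=100):
    """AP@k for one query: sum of precision values at the ranks of relevant
    hits within the top k, divided by min(k, number of relevant documents)."""
    if not relevant_ids:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / min(k, len(relevant_ids))

def mean_average_precision_at_k(runs, k=100):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision_at_k(r, rel, k) for r, rel in runs) / len(runs)

runs = [
    (["d1", "d7", "d3"], {"d1", "d3"}),  # relevant hits at ranks 1 and 3
    (["d9", "d2", "d5"], {"d2"}),        # relevant hit at rank 2
]
score = mean_average_precision_at_k(runs, k=100)
print(round(score, 4))  # 0.6667
```

Because relevant documents ranked lower contribute less, a rise from 0.14 to 0.27 indicates that relevant documents appear noticeably earlier in the ranking.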

Conclusions:

Developed and validated domain-specific embedding models using real-world clinical data and LLM-generated synthetic data.
These models enhance medical IR and RAG applications, particularly in specialized healthcare contexts.
Published models offer a reproducible framework for improving medical data retrieval in diverse healthcare institutions.