Federated knowledge retrieval elevates large language model performance on biomedical benchmarks.

Janet Joy¹, Andrew I Su¹

¹Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA.

January 19, 2026

Summary

This summary is machine-generated.

Retrieval-augmented generation using BioThings Explorer (BTE-RAG) enhances large language model (LLM) accuracy in biomedical research. This framework improves factual correctness and mechanistic exploration for drug discovery and translational science.

Keywords:

BioThings Explorer; DrugMechDB benchmarking; Retrieval-augmented generation (RAG); biomedical knowledge graphs; federated knowledge retrieval; hallucination mitigation; large language models (LLMs); mechanistic reasoning

Area of Science:

Biomedical research
Artificial intelligence
Knowledge representation

Background:

Large language models (LLMs) offer advanced natural language processing for biomedical research.
LLMs can produce factual inaccuracies (hallucinations) because they rely on implicit knowledge learned during training rather than on explicit evidence.
These inaccuracies pose risks in critical biomedical applications.

Purpose of the Study:

To develop a framework that improves LLM accuracy in biomedical research.
To integrate explicit mechanistic evidence with LLM reasoning.
To enhance factual accuracy and reduce hallucinations in LLM outputs.

Main Methods:

Developed BTE-RAG, a retrieval-augmented generation framework.
Integrated LLM reasoning with explicit evidence from BioThings Explorer (API federation).
Evaluated BTE-RAG against LLM-only methods on three custom benchmark datasets (gene mechanisms, metabolite effects, and drug-biological process relationships).
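
The retrieval-augmented pattern described above can be sketched roughly as follows. This is a minimal illustration, not the BTE-RAG implementation: the `Evidence` record, the `merge_evidence` and `build_prompt` helpers, the prompt wording, and the placeholder entities (`DRUG_X`, `GENE_Y`, `PROCESS_Z`, `source_a`, `source_b`) are all assumptions made for the example. No real BioThings Explorer call is shown; the sketch starts from statements that a federated query is assumed to have already returned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One retrieved statement (hypothetical record layout)."""
    subject: str
    predicate: str
    obj: str
    source: str  # which knowledge source reported the statement


def merge_evidence(records):
    """Collapse duplicate statements, keeping every source that reported them."""
    merged = {}
    for r in records:
        key = (r.subject, r.predicate, r.obj)
        sources = merged.setdefault(key, [])
        if r.source not in sources:
            sources.append(r.source)
    return merged


def build_prompt(question, records):
    """Place the merged evidence ahead of the question so the model can cite it."""
    merged = merge_evidence(records)
    if not merged:
        # No evidence retrieved: fall back to an LLM-only prompt.
        return f"Question: {question}\nAnswer:"
    lines = [
        f"- {s} {p} {o} (sources: {', '.join(srcs)})"
        for (s, p, o), srcs in merged.items()
    ]
    return "Evidence:\n" + "\n".join(lines) + f"\n\nQuestion: {question}\nAnswer:"


# Placeholder statements standing in for federated retrieval results.
records = [
    Evidence("DRUG_X", "inhibits", "GENE_Y", "source_a"),
    Evidence("DRUG_X", "inhibits", "GENE_Y", "source_b"),
    Evidence("GENE_Y", "participates_in", "PROCESS_Z", "source_a"),
]
print(build_prompt("Which process does DRUG_X affect?", records))
```

Keeping the source names next to each statement is one way to make the added context inspectable, which fits the summary's emphasis on transparent improvements; the actual framework may present evidence differently.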

Main Results:

BTE-RAG significantly improved accuracy on gene-centric tasks (e.g., GPT-4o accuracy increased from 69.8% to 78.6%).
Enhanced response quality for metabolite effects (e.g., an 82% increase in the proportion of responses with high cosine similarity for GPT-4o mini).
Improved answer concordance for drug-biological process relationships and outperformed alternative models on gene-disease association benchmarks.
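
To make the reported figures concrete, the sketch below works through the accuracy arithmetic for GPT-4o (69.8% to 78.6%) and shows how a cosine-similarity score and a "share of high-similarity responses" could be computed. The function names, the toy vectors, and the 0.9 cutoff are illustrative assumptions; this summary does not state which embedding model or threshold the study used.

```python
import math


def point_gain(before, after):
    """Absolute change in percentage points."""
    return after - before


def relative_gain(before, after):
    """Change relative to the starting value, in percent."""
    return (after - before) / before * 100


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def share_at_or_above(scores, threshold):
    """Fraction of scores that meet the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# Accuracy figures taken from the results above.
print(round(point_gain(69.8, 78.6), 1))     # 8.8 percentage points
print(round(relative_gain(69.8, 78.6), 1))  # 12.6 percent relative to baseline

# Toy embedding vectors and scores; the 0.9 cutoff is an assumed example value.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))       # 1.0
print(share_at_or_above([0.95, 0.5, 0.9, 0.2], 0.9))   # 0.5
```

The distinction between the two gain functions matters when reading results like these: a rise from 69.8% to 78.6% is 8.8 percentage points, while a figure such as "82% increase" is most naturally read as a relative change in a proportion.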

Conclusions:

Federated knowledge retrieval via BTE-RAG offers transparent accuracy improvements for LLMs.
BTE-RAG is a practical tool for mechanistic exploration in biomedical research.
The framework supports translational biomedical research by enhancing LLM reliability.