



Federated knowledge retrieval elevates large language model performance on biomedical benchmarks.

Janet Joy (1), Andrew I Su (1)

  • (1) Department of Integrative Structural and Computational Biology, Scripps Research, 10550 N Torrey Pines Rd, La Jolla, CA, 92037, USA.

GigaScience | January 19, 2026
Summary
This summary is machine-generated.

Retrieval-augmented generation using BioThings Explorer (BTE-RAG) enhances large language model (LLM) accuracy in biomedical research. This framework improves factual correctness and mechanistic exploration for drug discovery and translational science.

Keywords:
BioThings Explorer; DrugMechDB benchmarking; Retrieval-augmented generation (RAG); biomedical knowledge graphs; federated knowledge retrieval; hallucination mitigation; large language models (LLMs); mechanistic reasoning


Area of Science:

  • Biomedical research
  • Artificial intelligence
  • Knowledge representation

Background:

  • Large language models (LLMs) offer advanced natural language processing for biomedical research.
  • LLMs can produce factual inaccuracies (hallucinations) because they rely on knowledge encoded implicitly in the model rather than on explicit, retrievable evidence.
  • These inaccuracies pose risks in critical biomedical applications.

Purpose of the Study:

  • To develop a framework that improves LLM accuracy in biomedical research.
  • To integrate explicit mechanistic evidence with LLM reasoning.
  • To enhance factual accuracy and reduce hallucinations in LLM outputs.

Main Methods:

  • Developed BTE-RAG, a retrieval-augmented generation framework.
  • Integrated LLM reasoning with explicit evidence retrieved from BioThings Explorer, an API federation of biomedical knowledge sources.
  • Evaluated BTE-RAG against LLM-only methods on three custom benchmark datasets (gene-centric mechanisms, metabolite effects, and drug-biological process relationships).
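
The retrieval-augmented pattern described above can be illustrated with a minimal sketch. This is not the BTE-RAG implementation: the `Triple` type, the `TOY_KNOWLEDGE` list, and the `retrieve_evidence` and `build_prompt` helpers are hypothetical names introduced here, and an in-memory list stands in for a federated query to a knowledge source. The sketch only shows the general idea of placing explicit evidence ahead of the question before it is sent to a model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement, as a knowledge graph might return."""
    subject: str
    predicate: str
    obj: str


# Hypothetical toy evidence; a real system would query a knowledge source instead.
TOY_KNOWLEDGE = [
    Triple("imatinib", "inhibits", "ABL1"),
    Triple("ABL1", "participates_in", "cell proliferation"),
    Triple("metformin", "activates", "AMPK"),
]


def retrieve_evidence(question: str, knowledge: list[Triple]) -> list[Triple]:
    """Return triples whose subject or object is mentioned in the question."""
    q = question.lower()
    return [t for t in knowledge if t.subject.lower() in q or t.obj.lower() in q]


def build_prompt(question: str, evidence: list[Triple]) -> str:
    """Prepend retrieved evidence to the question so the model can ground its answer."""
    if not evidence:
        # No evidence found: fall back to the plain, LLM-only prompt.
        return f"Question: {question}\nAnswer:"
    lines = [f"- {t.subject} {t.predicate} {t.obj}" for t in evidence]
    return "Evidence:\n" + "\n".join(lines) + f"\n\nQuestion: {question}\nAnswer:"


question = "Which gene does imatinib target?"
evidence = retrieve_evidence(question, TOY_KNOWLEDGE)
prompt = build_prompt(question, evidence)
print(prompt)
```

Comparing the model's answer to the evidence-bearing prompt with its answer to the plain prompt is the kind of LLM-only versus retrieval-augmented comparison the evaluation describes.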

Main Results:

  • BTE-RAG significantly improved accuracy on gene-centric tasks (e.g., GPT-4o accuracy increased from 69.8% to 78.6%).
  • Enhanced response quality for metabolite effects (e.g., an 82% increase in the proportion of responses with high cosine similarity to the reference answer for GPT-4o mini).
  • Improved answer concordance for drug-biological process relationships and outperformed alternative models on gene-disease association benchmarks.
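
To make the reported metrics concrete, the sketch below shows one way such figures could be computed. It is an illustration only, not the authors' evaluation code: the helper names, the toy similarity scores, and the 0.9 threshold are assumptions introduced here. The only values taken from the summary are the GPT-4o accuracies of 69.8% and 78.6%.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def share_above(scores, threshold):
    """Fraction of scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def relative_increase(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# GPT-4o accuracy on the gene-centric task, as reported in the summary.
baseline_acc, augmented_acc = 69.8, 78.6
point_gain = augmented_acc - baseline_acc
rel_gain = relative_increase(baseline_acc, augmented_acc)
print(f"{point_gain:.1f} percentage points, {rel_gain:.1f}% relative")

# Hypothetical similarity scores and threshold, for illustration only.
toy_scores = [0.95, 0.50, 0.92, 0.70]
print(share_above(toy_scores, 0.9))
```

Note the distinction the sketch makes explicit: the accuracy gain is an absolute difference in percentage points, while a statement such as "82% increase" is a relative change in a proportion.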

Conclusions:

  • Federated knowledge retrieval via BTE-RAG offers transparent accuracy improvements for LLMs.
  • BTE-RAG is a practical tool for mechanistic exploration in biomedical research.
  • The framework supports translational biomedical research by enhancing LLM reliability.