


Evaluating Large Language Models for Translating Multimodal Phenotype Documentations into Executable EHR Phenotyping Algorithms

Chao Yan1, Yi Xin2, Wu-Chen Su1

  • 1Vanderbilt University Medical Center.

Research Square | June 5, 2026
Summary
This summary is machine-generated.

Translating clinical definitions into electronic health record (EHR) database queries is complex and time-consuming. Large language models show promise on structured text but struggle with diagram-only input, which points to documentation, rather than model capability, as the main bottleneck.


Area of Science:

  • Health Informatics
  • Artificial Intelligence in Medicine
  • Clinical Data Management

Background:

  • Electronic Health Record (EHR) phenotypes are crucial for research.
  • Translating clinical definitions into EHR database queries is a complex and time-consuming task.
  • Large language models (LLMs) offer potential solutions for automating this process.
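
To make the translation task concrete, here is a minimal sketch of what turning a clinical definition into an executable query can look like. Everything in it is an illustrative assumption rather than material from the study: the `diagnosis` table, its columns, the sample rows, and the rule itself ("at least two type 2 diabetes diagnosis codes, ICD-10 E11.x, on distinct dates").

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a hypothetical diagnosis table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE diagnosis (patient_id INTEGER, code TEXT, dx_date TEXT)"
    )
    rows = [
        (1, "E11.9", "2024-01-05"),   # patient 1: two E11 codes on two dates
        (1, "E11.65", "2024-03-10"),
        (2, "E11.9", "2024-02-01"),   # patient 2: only one E11 code
        (3, "I10", "2024-02-01"),     # patient 3: hypertension codes only
        (3, "I10", "2024-04-01"),
    ]
    conn.executemany("INSERT INTO diagnosis VALUES (?, ?, ?)", rows)
    return conn

# Hypothetical phenotype rule expressed as SQL:
# "at least two E11.x diagnosis codes recorded on distinct dates".
PHENOTYPE_SQL = """
SELECT patient_id
FROM diagnosis
WHERE code LIKE 'E11%'
GROUP BY patient_id
HAVING COUNT(DISTINCT dx_date) >= 2
ORDER BY patient_id
"""

def run_phenotype(conn):
    """Return the patient IDs that satisfy the phenotype rule."""
    return [row[0] for row in conn.execute(PHENOTYPE_SQL)]

if __name__ == "__main__":
    print(run_phenotype(build_demo_db()))
```

Even in this toy case, the prose definition leaves choices open (which codes count, whether dates must be distinct), which is the kind of ambiguity that makes manual translation slow.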

Purpose of the Study:

  • To evaluate the performance of two advanced large language models in translating clinical definitions into executable EHR database queries.
  • To assess the impact of different documentation modalities (structured text, diagrams) on LLM performance.
  • To identify failure categories and bottlenecks in the automated query generation process.

Main Methods:

  • Two frontier large language models were tested.
  • Five distinct clinical phenotypes were used for evaluation.
  • Three different documentation modalities were employed as input for the models.
  • A detailed error analysis was conducted to categorize model failures.
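
The design described above (two models, five phenotypes, three modalities, followed by error categorization) can be sketched as an evaluation grid plus a tally of labeled failures. This is only an outline under stated assumptions: the model and phenotype names are placeholders, the third modality is not named in this summary, and the error labels in the usage example are invented for illustration.

```python
from collections import Counter
from itertools import product

# Placeholder names; the summary does not identify the models or phenotypes.
MODELS = ["model_a", "model_b"]
PHENOTYPES = [f"phenotype_{i}" for i in range(1, 6)]
# Only structured text and diagram-only input are named in the summary;
# the third entry is a placeholder.
MODALITIES = ["structured_text", "diagram_only", "unspecified_third_modality"]

def build_grid():
    """Enumerate every (model, phenotype, modality) evaluation condition."""
    return list(product(MODELS, PHENOTYPES, MODALITIES))

def tally_errors(records):
    """Count error labels per modality.

    `records` is an iterable of (modality, error_label) pairs, for example
    produced by reviewers annotating each generated query.
    """
    counts = {}
    for modality, label in records:
        counts.setdefault(modality, Counter())[label] += 1
    return counts

if __name__ == "__main__":
    grid = build_grid()
    print(len(grid), "evaluation conditions")

    # Hypothetical annotations, not results from the study.
    demo_records = [
        ("diagram_only", "missed_branch"),
        ("diagram_only", "missed_branch"),
        ("structured_text", "wrong_code_set"),
    ]
    for modality, counter in tally_errors(demo_records).items():
        print(modality, dict(counter))
```

Crossing all three factors gives 2 × 5 × 3 = 30 conditions, and grouping error labels by modality is one simple way to see whether failures concentrate in a particular input format.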

Main Results:

  • Both evaluated LLMs demonstrated capability in capturing high-level logic from structured text-based documentation.
  • Model performance significantly degraded when presented with diagram-only input.
  • Seven distinct categories of errors were identified during the analysis.
  • The quality and format of documentation, rather than the LLMs' core capability, emerged as the primary limitation.

Conclusions:

  • While LLMs show potential for EHR phenotype query generation, their current effectiveness is limited by input modality.
  • Documentation standardization and the inclusion of expert oversight are critical for successful implementation.
  • Future research should focus on improving LLM performance with visual data and standardizing clinical documentation practices.