Updated: May 8, 2026

Disease Risk Prediction Using Structured EHR Data: Can Generalist Large Language Models Match Specialized Clinical

Bingyu Mao1, Made K Prasadha1, Ziqian Xie1

  • 1McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.

medRxiv: the Preprint Server for Health Sciences | May 7, 2026
Summary
This summary is machine-generated.

LLM-generated embeddings paired with simple classifiers outperformed both fine-tuned specialized clinical foundation models (CFMs) and fine-tuned generalist large language models (LLMs) in disease risk prediction, offering a promising, cost-effective approach.

Keywords:
Clinical Foundation Models, Disease Risk Prediction, Electronic Health Records, Large Language Models

Area of Science:

  • Artificial Intelligence in Healthcare
  • Clinical Informatics
  • Biomedical Data Science

Background:

  • Electronic health records (EHRs) are widely used with clinical decision support tools.
  • Clinical foundation models (CFMs) excel in predictive tasks using structured EHR data.
  • Generalist large language models (LLMs) are increasingly applied to healthcare, but their efficacy against specialized CFMs for disease prediction is unclear.

Purpose of the Study:

  • To compare the performance of CFMs against fine-tuned generalist LLMs and LLM-generated embeddings for disease risk prediction.
  • To evaluate model performance on diverse datasets including multi-site EHR, claims data, and an open-source benchmark.
  • To determine the optimal approach for leveraging AI in structured clinical data for predictive tasks.

Main Methods:

  • Compared specialized CFMs (Med-BERT, CLMBR) with fine-tuned generalist LLMs (Mistral, LLaMA-2/3/3.1) and a clinical LLM (Me-LLaMA).
  • Evaluated LLM-generated embeddings combined with simple classifiers (logistic regression, MLP) using models like DeepSeek, Qwen3, and GPT-OSS.
  • Assessed performance on heart failure risk (DHF) and pancreatic cancer diagnosis (PaCa) using AUROC and AUPRC metrics across multiple data sources.
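The embedding-plus-classifier approach and the two metrics listed above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the embeddings are synthetic Gaussian vectors standing in for LLM outputs, and every name here (`synthetic_embeddings`, `fit_logistic`, `run_demo`, the dimensions, prevalence, and learning rate) is a hypothetical choice made for the example.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient descent on the mean log-loss; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    """Predicted risk for each embedding vector."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]

def auroc(y_true, scores):
    """Chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Step-wise area under the precision-recall curve; ties are not specially handled."""
    ranked = sorted(zip(scores, y_true), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def synthetic_embeddings(n, dim=8, prevalence=0.2, shift=1.0, seed=0):
    """Hypothetical stand-in for LLM embeddings: Gaussian vectors whose
    first half is shifted by `shift` for positive cases."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < prevalence else 0
        X.append([rng.gauss(shift * label if j < dim // 2 else 0.0, 1.0)
                  for j in range(dim)])
        y.append(label)
    return X, y

def run_demo():
    """Train on one synthetic cohort, score a held-out one, return (AUROC, AUPRC)."""
    X_train, y_train = synthetic_embeddings(400, seed=1)
    X_test, y_test = synthetic_embeddings(200, seed=2)
    w, b = fit_logistic(X_train, y_train)
    scores = predict(X_test, w, b)
    return auroc(y_test, scores), average_precision(y_test, scores)
```

In practice the synthetic vectors would be replaced by embeddings exported from an LLM, and the hand-written classifier and metrics by a tested library; the sketch only shows how little machinery sits between a fixed embedding and an AUROC/AUPRC score.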

Main Results:

  • Fine-tuned CFMs showed a small but statistically significant AUROC advantage (<1%) over fine-tuned LLMs on the larger EHR and claims datasets.
  • LLM-generated embeddings with lightweight classifiers achieved superior AUROC (>90%) and AUPRC (66%) compared to both fine-tuned CFMs and LLMs.
  • On the PaCa cohort, LLMs had higher AUROCs, but CFMs achieved significantly higher AUPRC.
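That one model family can lead on AUROC while another leads on AUPRC is a general property of the two metrics on imbalanced cohorts, not a contradiction. The toy example below uses two invented rankings of ten patients (two positives, eight negatives); it does not use the study's data, and the function names are hypothetical.

```python
def auroc_from_ranking(labels):
    """labels: 1/0 outcomes ordered from highest to lowest predicted risk (no ties)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    neg_seen, pairs_won = 0, 0
    for y in labels:
        if y == 1:
            # This positive outranks every negative that appears below it.
            pairs_won += n_neg - neg_seen
        else:
            neg_seen += 1
    return pairs_won / (n_pos * n_neg)

def average_precision_from_ranking(labels):
    """Mean of the precision values measured at each positive in the ranking."""
    hits, total = 0, 0.0
    for rank, y in enumerate(labels, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Ranking A places both positives near the top, just below one negative.
ranking_a = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Ranking B places one positive first and the other in the middle.
ranking_b = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]

for name, ranking in (("A", ranking_a), ("B", ranking_b)):
    print(name,
          round(auroc_from_ranking(ranking), 3),
          round(average_precision_from_ranking(ranking), 3))
# A 0.875 0.583
# B 0.75 0.667
```

Ranking A wins on AUROC because, averaged over all positive-negative pairs, its positives sit higher. Ranking B wins on average precision because that metric rewards getting the very top of the list right, which matters most when positives are rare.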

Conclusions:

  • LLM-generated embeddings with simple classifiers represent a highly effective strategy for disease risk prediction, outperforming fine-tuned specialized and generalist models.
  • While generalist LLMs show potential, their computational cost and variable performance require careful consideration.
  • The study provides a reproducible framework for evaluating AI models in clinical settings, highlighting the efficacy of embedding-based approaches.