Multimodal Training to Unimodal Deployment: Leveraging Unstructured Data During Training to Optimize Structured Data

Zigui Wang1, Minghui Sun1, Jiang Shu1

  • 1Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA.

AMIA Joint Summits on Translational Science Proceedings | June 19, 2026
Summary

This summary is machine-generated.

This study introduces a multimodal learning framework that uses unstructured Electronic Health Record (EHR) data, such as clinical notes, during training to improve clinical prediction. The model is trained on both notes and structured data but can be deployed using structured EHR data alone, and it outperforms a structured-only baseline (AUROC 0.705 vs. 0.656).

Area of Science:

  • Medical Informatics
  • Machine Learning in Healthcare
  • Electronic Health Records

Background:

  • Unstructured Electronic Health Record (EHR) data, like clinical notes, offer valuable contextual information often missed in structured data.
  • Integrating unstructured EHR data into predictive models can significantly improve performance but presents deployment challenges.
  • Existing models often rely solely on structured EHR data, limiting their predictive power.

Purpose of the Study:

  • To develop a multimodal learning framework that utilizes unstructured EHR data during training.
  • To create a model deployable using only structured EHR data, retaining the benefits of unstructured information.
  • To enhance the performance of phenotype models by effectively incorporating diverse EHR data types.

Main Methods:

  • A multimodal learning framework was designed, combining unstructured (clinical notes) and structured (demographics, medical codes) EHR data.
  • BioClinicalBERT was used to generate embeddings of the clinical notes, and the structured data were encoded into embeddings separately.
  • A teacher-student model approach with contrastive learning and knowledge distillation was employed for joint training.
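
The joint training described above can be sketched as a single student objective. This is a minimal illustration, not the authors' implementation: the summary does not specify the loss functions, so the InfoNCE-style contrastive term, the temperature-softened KL distillation term, the weights `alpha` and `beta`, and all function names are assumptions. Random arrays stand in for BioClinicalBERT note embeddings and encoded structured data.

```python
import numpy as np

def log_softmax(z):
    # Numerically stable row-wise log-softmax.
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))

def l2_normalize(x, eps=1e-12):
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def contrastive_loss(struct_emb, note_emb, temperature=0.1):
    # Assumed InfoNCE form: row i of each matrix belongs to the same patient,
    # so the diagonal of the similarity matrix holds the positive pairs.
    s, n = l2_normalize(struct_emb), l2_normalize(note_emb)
    logits = (s @ n.T) / temperature
    return float(-np.mean(np.diag(log_softmax(logits))))

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Assumed form: KL(teacher || student) on temperature-softened outputs.
    log_t = log_softmax(teacher_logits / temperature)
    log_s = log_softmax(student_logits / temperature)
    return float(np.mean(np.sum(np.exp(log_t) * (log_t - log_s), axis=1)))

def student_objective(struct_emb, note_emb, student_logits, teacher_logits,
                      labels, alpha=0.5, beta=0.5):
    # Supervised cross-entropy plus the two auxiliary terms.
    # alpha and beta are hypothetical weights, not values from the study.
    ce = -np.mean(log_softmax(student_logits)[np.arange(len(labels)), labels])
    return float(ce
                 + alpha * contrastive_loss(struct_emb, note_emb)
                 + beta * distillation_loss(student_logits, teacher_logits))

# Synthetic stand-ins for one training batch (hypothetical data).
rng = np.random.default_rng(0)
batch, dim = 8, 16
struct_emb = rng.normal(size=(batch, dim))                   # structured-data encoder output
note_emb = struct_emb + 0.1 * rng.normal(size=(batch, dim))  # note embeddings (teacher side)
teacher_logits = rng.normal(size=(batch, 2))
student_logits = rng.normal(size=(batch, 2))
labels = rng.integers(0, 2, size=batch)

loss = student_objective(struct_emb, note_emb, student_logits, teacher_logits, labels)
print(f"student training loss: {loss:.4f}")
```

In this sketch the note embeddings and teacher logits appear only in the training loss. At deployment the student would need only `struct_emb`, which matches the structured-only deployment goal stated in the summary.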

Main Results:

  • The proposed model achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.705, outperforming the structured-only baseline (AUROC = 0.656).
  • A teacher model trained on notes achieved a high AUROC of 0.985.
  • The framework enabled deployment of a model that uses only structured EHR data while still outperforming the structured-only baseline.
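
For readers less familiar with the metric reported above, AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. The function name and the toy labels and scores are hypothetical and are not data from the study.

```python
import numpy as np

def auroc(labels, scores):
    # Fraction of (positive, negative) pairs in which the positive case is
    # scored higher; ties count as half a correct ordering.
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # prints 0.75
```

On this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to a perfect ranking of positives above negatives.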

Conclusions:

  • Leveraging unstructured EHR data during training improves the model's ability to extract relevant information from structured EHR data (AUROC 0.705 vs. 0.656 for the structured-only baseline).
  • The developed multimodal framework allows for the creation of deployable, structured-only phenotype models with improved accuracy.
  • This approach offers a practical solution for integrating rich clinical insights from unstructured notes into machine learning models for healthcare.