Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Two-Stage Decoupling Framework for Variable-Length Glaucoma Prognosis.

Learning with longitudinal medical images and data : first International Workshop, LMID 2025, held in conjunction with MICCAI 2025, Daejeon, South Korea, September 27, 2025, Proceedings. International Workshop on Learning with Longitudi...·2026
Same author

BIPEFT: Budget-Guided Iterative Search for Parameter Efficient Fine-Tuning of Large Pretrained Language Models.

Findings of ACL. EMNLP. Conference on Empirical Methods in Natural Language Processing·2025
Same author

Toward precision stroke rehabilitation: an integrated causal machine learning and clinician feedback approach.

JAMIA open·2025
Same author

A Novel Approach for Perceptions of Physician Decision-Making and Latent Topic Refinement in Large Language Model-Enhanced Medical Dialog Generation.

IEEE transactions on neural networks and learning systems·2025
Same author

Bridging Model Heterogeneity in Federated Learning via Uncertainty-based Asymmetrical Reciprocity Learning.

Proceedings of machine learning research·2025
Same author

Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources.

Proceedings of the conference. Association for Computational Linguistics. Meeting·2025
Same journal

AnchorDrug: A system for drug-induced gene expression prediction in new contexts through active learning.

Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining·2026
Same journal

Domain-Adaptive Continual Meta-Learning for Modeling Dynamical Systems: An Application in Environmental Ecosystems.

Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining·2025
Same journal

Automated Fusion of Multimodal Electronic Health Records for Better Medical Predictions.

Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining·2024
Same journal

FAME: Fragment-based Conditional Molecular Generation for Phenotypic Drug Discovery.

Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining·2022
Same journal

Harmonic Alignment.

Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining·2021
Same journal

GRIA: Graphical Regularization for Integrative Analysis.

Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining·2020

Related Experiment Video

Updated: Jun 10, 2025

Author Spotlight: Impact of Intergenic Interactions on Disease-Identifying Dark Biomarkers
03:37

Published on: March 1, 2024

647

MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data Augmentation.

Yuan Zhong (1), Suhan Cui (1), Jiaqi Wang (1)

  • (1) The Pennsylvania State University.

Proceedings of the ... SIAM International Conference on Data Mining. SIAM International Conference on Data Mining | October 14, 2024
Summary
This summary is machine-generated.

MedDiffusion, a novel diffusion-based model, enhances health risk prediction by generating synthetic patient data from Electronic Health Records (EHR). This approach overcomes data insufficiency, improving prediction accuracy and outperforming existing methods.

Keywords:
EHR data augmentation, diffusion model, health risk prediction

More Related Videos

Augmenting Large Language Models via Vector Embeddings to Improve Domain-Specific Responsiveness
03:14

Published on: December 6, 2024

508
Implementation of a Real-Time Psychosis Risk Detection and Alerting System Based on Electronic Health Records using CogStack
07:31

Published on: May 15, 2020

7.0K

Area of Science:

  • Medical Informatics
  • Artificial Intelligence in Healthcare
  • Machine Learning for Clinical Prediction

Background:

  • Health risk prediction using Electronic Health Records (EHR) is crucial but often hindered by data insufficiency.
  • Existing data augmentation methods are designed independently of the prediction task, which limits their effectiveness.
  • Novel approaches are needed to generate high-quality synthetic patient data for improved risk prediction.

Purpose of the Study:

  • To introduce MedDiffusion, a novel end-to-end diffusion-based model for health risk prediction.
  • To enhance risk prediction performance by generating synthetic patient data to enlarge the training sample space.
  • To discern hidden relationships between patient visits for high-quality synthetic data generation.

Main Methods:

  • Developed a diffusion-based model (MedDiffusion) for end-to-end health risk prediction.
  • Utilized a step-wise attention mechanism to identify and retain vital information from patient visit sequences.
  • Generated synthetic patient data during training to augment the dataset and improve model generalization.
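
To make the diffusion idea above concrete, here is a minimal sketch of the forward (noising) step used in standard denoising diffusion models. It is an illustration only: the summary does not give MedDiffusion's exact formulation, so the linear noise schedule, the function names, and the four-dimensional visit embedding are all assumptions made for this example.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise levels beta_1..beta_T (assumed defaults)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t: running product of (1 - beta_s) for s up to t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_diffuse(x0, t, alpha_bars, rng):
    """Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a_bar = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
        for x, e in zip(x0, noise)
    ]
    return xt, noise


rng = random.Random(0)  # fixed seed so the sketch is reproducible
betas = linear_beta_schedule(100)
alpha_bars = cumulative_alphas(betas)

# Hypothetical embedding of a single patient visit (not from the paper).
visit_embedding = [0.5, -1.2, 0.3, 0.8]
noisy_visit, eps = forward_diffuse(visit_embedding, t=50, alpha_bars=alpha_bars, rng=rng)
```

In a full model, a learned denoiser would be trained to predict `eps` from `noisy_visit`, and running the reverse process would yield synthetic samples. How MedDiffusion combines this with its step-wise attention is not described in this summary.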

Main Results:

  • MedDiffusion significantly outperformed 14 baseline models on four real-world medical datasets.
  • Achieved superior performance in Precision-Recall Area Under the Curve (PR-AUC), F1-score, and Cohen's Kappa.
  • Ablation studies and comparisons with Generative Adversarial Network (GAN)-based models validated MedDiffusion's effectiveness and adaptability.
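
For readers unfamiliar with the three metrics named above, the following sketch shows how each can be computed for a binary risk label. The labels, scores, and the 0.5 decision threshold are made up for illustration and are not results from the paper. PR-AUC is estimated here as average precision, which is one common estimator; the summary does not say which one the authors used.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels in {0, 1}."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance: (p_o - p_e) / (1 - p_e)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected) if p_expected != 1 else 0.0


def average_precision(y_true, scores):
    """Mean of precision at each positive, ranked by score (assumes no ties)."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    total_pos = sum(y_true)
    if total_pos == 0:
        return 0.0
    tp, ap = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            ap += tp / rank
    return ap / total_pos


# Made-up labels and predicted risk scores for eight patients.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

print(f1_score(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
print(average_precision(y_true, scores))
```

F1 and Cohen's Kappa depend on the chosen threshold, while average precision summarizes ranking quality across all thresholds, which is why the three are often reported together on imbalanced clinical data.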

Conclusions:

  • MedDiffusion offers a powerful and adaptable solution for health risk prediction, effectively addressing data insufficiency.
  • The model's ability to discern patient visit relationships enhances synthetic data quality and interpretability.
  • This diffusion-based approach represents a significant advancement in leveraging EHR data for proactive patient care.