Synthesis and quality assessment of combined time-series and static medical data using a real-world time-series generative adversarial network
Summary
This summary is machine-generated. This study synthesizes medical data using a real-world time-series generative adversarial network (RTSGAN) to protect patient privacy. The generated synthetic data closely reflects real patient information and maintains predictive performance for colorectal cancer survival.
Area Of Science
- Medical Informatics
- Artificial Intelligence
- Data Science
Background
- Privacy concerns hinder the use of sensitive medical data.
- Developing methods for secure utilization of patient information is crucial.
- Real-world time-series generative adversarial networks (RTSGAN) offer a potential solution for data synthesis.
Purpose Of The Study
- To synthesize high-quality, privacy-preserving medical data using RTSGAN.
- To evaluate the fidelity and utility of synthesized time-series medical data.
- To assess the privacy risks associated with the synthetic data generation process.
Main Methods
- Utilized RTSGAN to synthesize 53,005 data points from 15,799 colorectal cancer patients.
- Performed quantitative evaluations using Hellinger distance, AUC under train-on-synthetic-test-on-real (TSTR) and train-on-real-test-on-synthetic (TRTS) settings, and propensity mean squared error (PMSE).
- Conducted qualitative analyses (t-SNE, histogram) and privacy assessments (distance to closest records, membership inference test).
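Two of the measures listed above can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the function names, the histogram binning (`bins=20`), and the use of Euclidean distance for the closest-record check are all assumptions made here for clarity.

```python
import numpy as np

def hellinger_distance(real, synthetic, bins=20):
    """Hellinger distance between histograms of two 1-D samples.

    Returns 0 for identical histograms and 1 for non-overlapping ones.
    The bin count is an illustrative assumption.
    """
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    # Use a shared range so both histograms have the same bin edges.
    lo = min(real.min(), synthetic.min())
    hi = max(real.max(), synthetic.max())
    p, _ = np.histogram(real, bins=bins, range=(lo, hi))
    q, _ = np.histogram(synthetic, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def distance_to_closest_record(synthetic, real):
    """For each synthetic row, Euclidean distance to its nearest real row.

    Very small distances suggest a synthetic record may be a near-copy
    of a real one. Euclidean distance is an assumption of this sketch.
    """
    synthetic = np.asarray(synthetic, dtype=float)
    real = np.asarray(real, dtype=float)
    diffs = synthetic[:, None, :] - real[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
```

In practice the Hellinger distance would be computed per variable, and the closest-record distances would be compared against the distances observed between real records themselves.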
Main Results
- Quantitative metrics indicated high data quality (Hellinger distance 0 to 0.25, TSTR/TRTS AUC 0.98 to 0.99, propensity mean squared error (PMSE) 0.223).
- Qualitative analyses confirmed similarity between synthetic and real data.
- Synthetic data achieved comparable performance to real data in predicting colorectal cancer survival.
- Privacy assessments indicated minimal risk of data exposure.
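For context on the PMSE figure, the sketch below shows how a propensity mean squared error is commonly computed once a classifier has been trained to tell real from synthetic records. The classifier itself is omitted, and the function name and inputs (`scores`, `is_synthetic`) are hypothetical. The summary does not say which classifier or class proportions the study used.

```python
import numpy as np

def propensity_mse(scores, is_synthetic):
    """Mean squared deviation of propensity scores from the synthetic share.

    scores: predicted probability that each record is synthetic.
    is_synthetic: boolean labels for the same pooled records.
    """
    scores = np.asarray(scores, dtype=float)
    is_synthetic = np.asarray(is_synthetic, dtype=bool)
    c = is_synthetic.mean()  # proportion of synthetic records in the pool
    return float(np.mean((scores - c) ** 2))
```

Under this definition the value is 0 when the classifier cannot separate the two sets (every score equals the synthetic share c) and reaches c(1 - c) when it separates them perfectly.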
Conclusions
- RTSGAN is effective for synthesizing realistic and privacy-preserving medical time-series data.
- Synthesized data accurately represents real-world patient characteristics.
- The approach enables secure utilization of medical data for AI model development and clinical insights.

