A method for generating synthetic longitudinal health data.

Lucy Mosquera1,2, Khaled El Emam3,4,5, Lei Ding6

  • 1Replica Analytics Ltd, Ottawa, ON, Canada.

BMC Medical Research Methodology | March 24, 2023
Summary
This summary is machine-generated.

Generating synthetic health data using deep learning offers a privacy-preserving alternative for research. This method successfully mimics real data patterns and analytical results while significantly reducing privacy risks.

Keywords: Administrative health data, Data privacy, Data sharing, Synthetic data

Area of Science:

  • Health Informatics
  • Data Science
  • Biostatistics

Background:

  • Accessing administrative health data for research is challenging due to strict privacy regulations.
  • Synthetic datasets, which mimic real data patterns without containing identifiable information, offer a potential solution.

Purpose of the Study:

  • To assess the feasibility of generating synthetic administrative health data using a recurrent deep learning model.
  • To evaluate the utility and privacy risks of the generated synthetic data.

Main Methods:

  • A recurrent deep learning model was used to generate synthetic data from a large administrative health database (120,000 individuals).
  • Utility was assessed using distribution comparisons (Hellinger distance) for various data attributes and by replicating a Cox regression analysis.
  • Privacy risks were evaluated by assessing attribution disclosure risk.
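The Hellinger distance used for the distribution comparisons can be illustrated with a minimal Python sketch. It assumes categorical attribute values and simple empirical frequencies; the function name `hellinger_distance` and the toy inputs are hypothetical and are not taken from the study, whose exact computation may differ.

```python
import math
from collections import Counter


def hellinger_distance(real_values, synthetic_values):
    """Hellinger distance between the empirical distributions of two samples.

    Returns a value in [0, 1]: 0 means identical distributions,
    1 means the two samples share no categories at all.
    """
    if not real_values or not synthetic_values:
        raise ValueError("both samples must be non-empty")

    real_counts = Counter(real_values)
    synth_counts = Counter(synthetic_values)
    n_real = len(real_values)
    n_synth = len(synthetic_values)

    # Compare relative frequencies over every category seen in either sample.
    total = 0.0
    for category in set(real_counts) | set(synth_counts):
        p = real_counts[category] / n_real
        q = synth_counts[category] / n_synth
        total += (math.sqrt(p) - math.sqrt(q)) ** 2

    return math.sqrt(total / 2.0)


# Hypothetical toy attribute, for illustration only.
real = ["inpatient", "inpatient", "outpatient", "outpatient"]
synthetic = ["inpatient", "outpatient", "outpatient", "outpatient"]
distance = hellinger_distance(real, synthetic)
```

Because the distance is bounded between 0 and 1, values for different attributes can be averaged or compared directly, which is presumably why it suits a summary across many attributes.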

Main Results:

  • Utility assessments showed small differences between real and synthetic data distributions (e.g., Hellinger distance for joint distributions: 0.352).
  • Cox regression analysis on the synthetic data yielded results comparable to those from the real data, with a mean confidence interval overlap of 68% for the hazard ratios.
  • Privacy assessment indicated that the attribution disclosure risk was substantially below the acceptable threshold of 0.09.
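A confidence interval overlap of the kind reported for the hazard ratios can be sketched as follows. This uses one common definition (the overlapping width averaged as a fraction of each interval's width, set to 0 when the intervals do not intersect); whether the study uses exactly this variant is an assumption. The function names and the interval values are hypothetical, not results from the paper.

```python
def ci_overlap(real_ci, synth_ci):
    """Overlap between two confidence intervals, as a value in [0, 1].

    Each interval is a (lower, upper) tuple with lower < upper.
    Assumption: non-intersecting intervals are scored as 0.
    """
    lo_r, hi_r = real_ci
    lo_s, hi_s = synth_ci
    if hi_r <= lo_r or hi_s <= lo_s:
        raise ValueError("each interval must satisfy lower < upper")

    # Bounds of the region covered by both intervals.
    lo_o = max(lo_r, lo_s)
    hi_o = min(hi_r, hi_s)
    if hi_o <= lo_o:
        return 0.0

    width = hi_o - lo_o
    # Average the overlap as a share of each interval's own width.
    return 0.5 * (width / (hi_r - lo_r) + width / (hi_s - lo_s))


def mean_ci_overlap(interval_pairs):
    """Mean overlap across (real_ci, synth_ci) pairs, one pair per coefficient."""
    if not interval_pairs:
        raise ValueError("at least one pair of intervals is required")
    return sum(ci_overlap(r, s) for r, s in interval_pairs) / len(interval_pairs)


# Hypothetical hazard-ratio intervals, for illustration only.
pairs = [
    ((1.0, 3.0), (2.0, 4.0)),  # partial overlap
    ((0.5, 1.5), (0.5, 1.5)),  # identical intervals
]
average_overlap = mean_ci_overlap(pairs)
```

Averaging over both interval widths keeps the measure symmetric, so a very wide synthetic interval that merely contains the real one does not score as a perfect match.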

Conclusions:

  • The generated synthetic administrative health data demonstrates high utility and is sufficiently similar to the real data for research purposes.
  • This synthetic data approach can help overcome barriers to accessing sensitive health information, facilitating broader research.
  • The method provides a viable, privacy-preserving alternative for sharing health data in specific research contexts.