Collaborative learning from distributed data with differentially private synthetic data.

Lukas Prediger¹, Joonas Jälkö²,³, Antti Honkela³

¹Aalto University, Espoo, 00076, Finland. lukas.m.prediger@aalto.fi.

BMC Medical Informatics and Decision Making

June 14, 2024

Summary

This summary is machine-generated.

Sharing privacy-preserving synthetic data enables collaborative learning for multiple parties. This approach improves statistical accuracy, especially for small or underrepresented datasets, overcoming privacy barriers in biomedical research.

Keywords:

Collaborative learning; Differential privacy; Health informatics; Synthetic data

Area of Science:

Health Informatics
Biomedical Research
Data Privacy

Background:

Collaborative learning is hindered by privacy concerns and the inability to pool sensitive data.
Decentralized computation without central coordination poses challenges for joint analysis.
This study explores using privacy-preserving synthetic data for collaborative learning on UK Biobank data.

Purpose of the Study:

To evaluate the feasibility of combining synthetic data for collaborative learning.
To assess the impact of data size, number of parties, and distribution shifts on learning outcomes.
To determine if synthetic data sharing can overcome privacy and data access limitations in research.

Main Methods:

Simulated multiple parties by splitting the UK Biobank cohort.
Generated differentially private synthetic data for each simulated party.
Applied Poisson regression analysis on combined synthetic data and compared with local data analysis.
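
The three steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: the summary does not say which synthetic data generator or privacy parameters were used, so the sketch swaps in a textbook Laplace-noised histogram (epsilon-differentially private when neighbouring datasets differ by adding or removing one record), and it uses simulated records rather than UK Biobank data. All names (`make_party`, `dp_synthetic`, `fit_poisson`, `EPSILON`, `Y_MAX`) and all numeric settings are hypothetical.

```python
import math
import random

def sample_poisson(rng, lam):
    # Knuth's multiplication method; adequate for the small rates used here.
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def make_party(rng, n, p_x1, beta, y_max):
    # Simulated local records (x, y): binary covariate x and a count outcome
    # y ~ Poisson(exp(b0 + b1 * x)), clipped to y_max so the domain is finite.
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < p_x1 else 0
        y = min(sample_poisson(rng, math.exp(beta[0] + beta[1] * x)), y_max)
        rows.append((x, y))
    return rows

def dp_synthetic(rng, rows, epsilon, y_max):
    # Assumed mechanism: add Laplace(1/epsilon) noise to every cell count of
    # the (x, y) histogram, then resample records from the noisy histogram.
    # Clamping and resampling are post-processing of the noisy counts.
    cells = [(x, y) for x in (0, 1) for y in range(y_max + 1)]
    counts = {c: 0 for c in cells}
    for r in rows:
        counts[r] += 1
    # The difference of two Exp(rate=epsilon) draws is Laplace with scale 1/epsilon.
    noisy = [max(counts[c] + rng.expovariate(epsilon) - rng.expovariate(epsilon), 0.0)
             for c in cells]
    if sum(noisy) <= 0:  # degenerate case: fall back to a uniform histogram
        noisy = [1.0] * len(cells)
    n_syn = max(int(round(sum(noisy))), 1)
    return rng.choices(cells, weights=noisy, k=n_syn)

def fit_poisson(rows, iters=25):
    # Newton's method for the model log E[y] = b0 + b1 * x.
    # Requires both x = 0 and x = 1 to be present and a positive mean of y.
    b0, b1 = math.log(sum(y for _, y in rows) / len(rows)), 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in rows:
            mu = math.exp(b0 + b1 * x)
            g0 += y - mu           # score with respect to b0
            g1 += (y - mu) * x     # score with respect to b1
            h00 += mu              # entries of the negative Hessian
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

rng = random.Random(0)
TRUE_BETA = (0.2, 0.5)   # hypothetical ground truth for the simulation
EPSILON, Y_MAX = 1.0, 6  # hypothetical privacy budget and outcome cap

# Step 1: simulate parties; the first one sees few records with x = 1.
parties = [make_party(rng, 500, p, TRUE_BETA, Y_MAX) for p in (0.05, 0.5, 0.5, 0.5)]
# Step 2: each party releases a differentially private synthetic dataset.
synthetic = [dp_synthetic(rng, rows, EPSILON, Y_MAX) for rows in parties]
# Step 3: compare a local-only fit with a fit on the combined synthetic data.
local_fit = fit_poisson(parties[0])
pooled_fit = fit_poisson([r for s in synthetic for r in s])

print("true coefficients     :", TRUE_BETA)
print("party 1, local data   :", local_fit)
print("combined synthetic fit:", pooled_fit)
```

The histogram mechanism only works for a small discrete domain like this toy one. The clamping of noisy counts at zero also introduces some bias, which is one reason practical generators are more elaborate.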

Main Results:

Collaborative learning with synthetic data yielded more accurate regression parameter estimates than using local data alone.
Improvements were observed even with small, heterogeneous datasets.
Adding more parties led to larger and more consistent improvements, up to a point.
Synthetic data sharing particularly benefited analysis for underrepresented groups.

Conclusions:

Sharing synthetic data is a viable strategy for privacy-preserving collaborative learning.
This method enables learning from sensitive data without compromising privacy, even with limited or non-representative local datasets.
Privacy-preserving collaborative learning methods can alleviate bottlenecks caused by inaccessible distributed sensitive data in biomedical research.