Updated: May 9, 2025


Cafe: Improved Federated Data Imputation by Leveraging Missing Data Heterogeneity.

Sitao Min¹, Hafiz Asif², Xinyue Wang¹

¹Rutgers University, Newark, NJ, USA.

IEEE Transactions on Knowledge and Data Engineering

May 5, 2025

Summary

This summary is machine-generated.

Cafe is a personalized federated learning (FL) approach to missing data imputation. It improves imputation quality, especially in heterogeneous data settings, where it outperforms existing methods.

Keywords:

Data Heterogeneity Data Quality Federated Learning Missing Data Imputation

Area of Science:

Machine Learning
Data Science
Decentralized Systems

Background:

Federated learning (FL) is a decentralized approach in which clients train models collaboratively without sharing raw data, improving model performance while preserving data autonomy and confidentiality.
Handling missing values in FL is an underexplored area, particularly with heterogeneous data distributions across clients.
Current state-of-the-art (SOTA) federated imputation methods exhibit significant performance degradation in heterogeneous settings.
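
To make the heterogeneity problem concrete, the toy sketch below simulates two hypothetical clients whose feature distributions and missing rates differ, then compares a single pooled mean imputer with a per-client mean imputer. Everything here (the client sizes, means, missing rates, the mean imputer, and the helper names `make_client` and `rmse_on_missing`) is an assumption chosen for illustration. It is not the paper's experimental setup or any of the SOTA baselines.

```python
import numpy as np

def make_client(rng, n, mean, missing_rate):
    """Draw one hypothetical client's feature column and hide entries at random."""
    x = rng.normal(loc=mean, scale=1.0, size=n)
    mask = rng.random(n) < missing_rate      # True marks a missing entry
    observed = np.where(mask, np.nan, x)     # what the client actually sees
    return x, observed, mask

def rmse_on_missing(truth, imputed, mask):
    """Root-mean-square error measured only on the hidden entries."""
    return float(np.sqrt(np.mean((truth[mask] - imputed[mask]) ** 2)))

rng = np.random.default_rng(0)
# Two clients with different distributions AND different missing rates (assumed values).
clients = [
    make_client(rng, 2000, mean=0.0, missing_rate=0.2),
    make_client(rng, 2000, mean=5.0, missing_rate=0.6),
]

# Shared imputer: one mean pooled over every client's observed values.
pooled = np.concatenate([obs[~mask] for _, obs, mask in clients])
global_mean = pooled.mean()

results = []  # one (shared_rmse, local_rmse) pair per client
for truth, obs, mask in clients:
    local_mean = np.nanmean(obs)
    shared_imputed = np.where(mask, global_mean, obs)
    local_imputed = np.where(mask, local_mean, obs)
    results.append((rmse_on_missing(truth, shared_imputed, mask),
                    rmse_on_missing(truth, local_imputed, mask)))

for i, (shared_rmse, local_rmse) in enumerate(results):
    print(f"client {i}: shared RMSE={shared_rmse:.2f}, local RMSE={local_rmse:.2f}")
```

In this toy setting the pooled mean is pulled toward the client that contributes more observed values, so the shared imputer fits neither client well. Real federated imputers are far richer than a mean, but the sketch shows why a single shared model can struggle when clients differ.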

Purpose of the Study:

To investigate federated imputation methods for missing data, focusing on complex scenarios with data heterogeneity.
To address the limitations of existing SOTA approaches in maintaining imputation quality under data heterogeneity.
To propose a novel personalized federated learning approach for improved missing data imputation.

Main Methods:

Introducing Cafe, a personalized federated learning approach for missing data imputation.
Leveraging observed and missing data distribution differences across clients to enhance imputation quality.
Developing personalized imputation models by computing automatically calibrated weights for varying levels of heterogeneity.
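
One way to picture personalized weighting is sketched below: each client summarizes its data, and a client's personalized parameters are a weighted average of all clients' parameters, with more weight given to clients whose summaries look similar. This is only a minimal sketch under stated assumptions, not Cafe's actual algorithm. The summary vectors, the squared-distance similarity, the softmax form, the fixed `temperature`, and the function names `personalized_weights` and `personalize` are all hypothetical. In particular, the summary above says Cafe's weights are calibrated automatically, whereas here the temperature is set by hand.

```python
import numpy as np

def personalized_weights(stats, temperature=1.0):
    """Return a row-stochastic matrix; w[i, j] is the weight client i gives client j.

    stats: one summary vector per client (hypothetical, e.g. observed mean and
    missing rate). Closer summaries receive larger weights.
    """
    stats = np.asarray(stats, dtype=float)
    # Pairwise squared Euclidean distances between client summaries.
    diff = stats[:, None, :] - stats[None, :, :]
    dist = np.sum(diff ** 2, axis=-1)
    logits = -dist / temperature
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def personalize(params, weights):
    """Personalized parameters: each row is a weighted average of all clients' parameters."""
    return weights @ np.asarray(params, dtype=float)

# Hypothetical per-client summaries: [observed mean, missing rate].
stats = [[0.0, 0.20], [0.1, 0.25], [5.0, 0.60]]
# Hypothetical per-client model parameters (here a single number each).
params = [[0.0], [0.1], [5.0]]

W = personalized_weights(stats, temperature=0.5)
personal = personalize(params, W)
print(np.round(W, 3))
print(np.round(personal, 3))
```

With a very large `temperature` the weights approach a uniform average over clients, which resembles plain federated averaging. With a small one, each client leans mostly on clients with similar data. A scheme that adapts to the level of heterogeneity has to move between these two regimes.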

Main Results:

Cafe matches SOTA baseline performance in homogeneous data settings.
Cafe significantly outperforms SOTA baselines in heterogeneous data settings.
Empirical evaluations confirm Cafe's effectiveness across diverse settings.

Conclusions:

Cafe offers an effective solution for federated missing data imputation, particularly in challenging heterogeneous environments.
Personalized weighting strategies in Cafe adapt to data heterogeneity, improving imputation accuracy.
The proposed approach advances the field of federated learning by addressing critical data quality issues.