



Cafe: Improved Federated Data Imputation by Leveraging Missing Data Heterogeneity.

Sitao Min1, Hafiz Asif2, Xinyue Wang1

  • 1Rutgers University, Newark, NJ, USA.

IEEE Transactions on Knowledge and Data Engineering | May 5, 2025
Summary
This summary is machine-generated.

Cafe is a personalized federated learning (FL) approach to missing data imputation. It improves imputation quality, especially in heterogeneous data settings, where it outperforms existing methods.

Keywords: Data Heterogeneity, Data Quality, Federated Learning, Missing Data Imputation


Area of Science:

  • Machine Learning
  • Data Science
  • Decentralized Systems

Background:

  • Federated learning (FL) is a decentralized machine learning approach in which clients collaboratively train models without sharing raw data, improving performance while preserving data autonomy and confidentiality.
  • Handling missing values in FL is an underexplored area, particularly with heterogeneous data distributions across clients.
  • Current state-of-the-art (SOTA) federated imputation methods exhibit significant performance degradation in heterogeneous settings.
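
A toy illustration may help make the heterogeneity problem concrete. The two clients, their values, and the use of simple mean imputation below are all assumptions chosen for illustration; none of them come from the paper. The sketch only shows why a single shared estimate can fit poorly when clients differ in both their observed values and their missing rates.

```python
import numpy as np

def mean_impute(x, fill):
    """Replace NaN entries of x with the given fill value."""
    x = np.asarray(x, dtype=float)
    return np.where(np.isnan(x), fill, x)

# Two hypothetical clients observing the same feature under different
# distributions and different missing rates (NaN marks a missing entry).
client_a = np.array([1.0, 2.0, np.nan, 3.0])       # low values, 25% missing
client_b = np.array([10.0, np.nan, np.nan, 14.0])  # high values, 50% missing

# Per-client estimates computed from observed entries only.
local_mean_a = np.nanmean(client_a)
local_mean_b = np.nanmean(client_b)

# One shared estimate pooled over every observed entry from both clients.
global_mean = np.nanmean(np.concatenate([client_a, client_b]))

# Imputing client A with the pooled value versus its own local value.
imputed_a_global = mean_impute(client_a, global_mean)
imputed_a_local = mean_impute(client_a, local_mean_a)

print(local_mean_a, local_mean_b, global_mean)
print(imputed_a_global, imputed_a_local)
```

In this toy setup the pooled estimate sits between the two clients and matches neither, which is the kind of mismatch a personalized approach is meant to reduce.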

Purpose of the Study:

  • To investigate federated imputation methods for missing data, focusing on complex scenarios with data heterogeneity.
  • To address the limitations of existing SOTA approaches in maintaining imputation quality under data heterogeneity.
  • To propose a novel personalized federated learning approach for improved missing data imputation.

Main Methods:

  • Introducing Cafe, a personalized federated learning approach for missing data imputation.
  • Leveraging differences across clients in the distributions of both observed and missing data to enhance imputation quality.
  • Developing personalized imputation models by computing automatically calibrated weights for varying levels of heterogeneity.
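
The general idea of personalized weighting can be sketched as follows. This is not Cafe's actual weighting rule, which this summary does not specify. It is a minimal sketch under stated assumptions: each client is described by a hypothetical summary vector (for example, a missing rate and an observed mean), weights come from a softmax over negative pairwise distances, and `temperature` is an illustrative parameter. The function names `personalized_weights` and `personalize` are likewise hypothetical.

```python
import numpy as np

def personalized_weights(summaries, temperature=1.0):
    """Row-stochastic weight matrix from distances between client summaries.

    Assumption: clients with similar summary vectors should share more.
    """
    s = np.asarray(summaries, dtype=float)
    # Pairwise Euclidean distances between client summary vectors.
    diff = s[:, None, :] - s[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Softmax over negative distances: closer clients get larger weights.
    logits = -dist / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def personalize(params, weights):
    """Per-client parameters as a weighted average of all clients' parameters."""
    return weights @ np.asarray(params, dtype=float)

# Hypothetical summaries: [missing rate, observed mean] for three clients.
summaries = [[0.1, 0.0], [0.1, 0.0], [0.9, 5.0]]
# Hypothetical one-dimensional model parameters per client.
params = [[1.0], [3.0], [10.0]]

W = personalized_weights(summaries)
print(W)
print(personalize(params, W))
```

In this sketch, identical client summaries yield uniform weights, so personalization reduces to plain averaging, while dissimilar clients contribute less to one another's models.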

Main Results:

  • Cafe matches SOTA baseline performance in homogeneous data settings.
  • Cafe significantly outperforms SOTA baselines in heterogeneous data settings.
  • Empirical evaluations confirm Cafe's effectiveness across diverse settings.

Conclusions:

  • Cafe offers an effective solution for federated missing data imputation, particularly in challenging heterogeneous environments.
  • Personalized weighting strategies in Cafe adapt to data heterogeneity, improving imputation accuracy.
  • The proposed approach advances the field of federated learning by addressing critical data quality issues.