Protecting privacy using k-anonymity.

Khaled El Emam¹, Fida Kamal Dankar

¹Children's Hospital of Eastern Ontario Research Institute, Ottawa, Ontario K1J 8L1, Canada. kelemam@uottawa.ca

Journal of the American Medical Informatics Association : JAMIA

June 27, 2008

Summary

This summary is machine-generated.

K-anonymity for health data anonymization can lead to over-anonymization and high information loss. A hypothesis testing approach offers better re-identification risk control and less data distortion.

Area of Science:

Health Informatics
Data Privacy
Computer Science

Background:

Increasing pressure to share health information necessitates robust anonymization techniques.
K-anonymity is a popular method, but the actual re-identification risk of k-anonymized data had not been evaluated.
Privacy concerns are paramount when disclosing personal health information.

Purpose of the Study:

To evaluate the re-identification risk of k-anonymity and its improvements.
To compare information loss across different anonymization methods.
To provide guidelines for optimal anonymization strategies.

Main Methods:

Simulation-based evaluation of re-identification probability.
Assessment of k-anonymity and three improved methods.
Measurement of information loss using the discernability metric.
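
As a rough illustration of the quantities named above, the following Python sketch groups records by their quasi-identifier values, checks whether the table is k-anonymous, and computes a discernability score. It uses the commonly cited form of the metric: each record in an equivalence class of size at least k is charged the class size, and each record in a smaller class is treated as suppressed and charged the table size. The summary does not say which variant the study used, so that choice is an assumption. The function names, column names, and sample records are hypothetical.

```python
from collections import Counter


def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True when every equivalence class contains at least k records."""
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return all(size >= k for size in sizes.values())


def discernability(records, quasi_identifiers, k):
    """Assumed form of the metric: size**2 for classes of size >= k,
    table_size * size for smaller classes (treated as suppressed)."""
    n = len(records)
    sizes = equivalence_class_sizes(records, quasi_identifiers)
    return sum(s * s if s >= k else n * s for s in sizes.values())


# Hypothetical toy table with two quasi-identifiers.
records = [
    {"age_band": "30-39", "postal_prefix": "K1J"},
    {"age_band": "30-39", "postal_prefix": "K1J"},
    {"age_band": "30-39", "postal_prefix": "K1J"},
    {"age_band": "40-49", "postal_prefix": "K1J"},
    {"age_band": "40-49", "postal_prefix": "K1J"},
    {"age_band": "50-59", "postal_prefix": "K2P"},
]
QI = ["age_band", "postal_prefix"]

print(is_k_anonymous(records, QI, k=2))
print(discernability(records, QI, k=2))
```

Under this form of the metric a lower score means less information loss, so heavier generalization (fewer, larger classes) raises the score.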

Main Results:

K-anonymity often over-anonymizes data, particularly with small sampling fractions, leading to significant information loss.
The hypothesis testing approach demonstrated superior control over re-identification risk.
The hypothesis testing approach resulted in less information loss compared to baseline k-anonymity.
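
One way to see why a small sampling fraction can lead to over-anonymization: k-anonymity is enforced on equivalence classes in the disclosed sample, yet a record that looks rare in the sample may match many people in the population it was drawn from. The sketch below compares the two views on a deterministic toy population. The 1/class-size risk measure, the function name, and the data are illustrative assumptions, not the study's simulation design.

```python
from collections import Counter


def max_risk(disclosed, reference):
    """Worst-case 1/class-size over disclosed records, with class sizes
    counted in the reference set (the sample itself or the population)."""
    sizes = Counter(reference)
    return max(1 / sizes[value] for value in disclosed)


# Hypothetical population: 1000 people, 10 quasi-identifier values, 100 each.
population = [f"band_{i // 100}" for i in range(1000)]

# Deterministic 2% sample: every 50th record, giving 2 records per value.
sample = population[::50]

sample_based = max_risk(sample, sample)          # classes counted in the sample
population_based = max_risk(sample, population)  # classes counted in the population

print(len(sample), sample_based, population_based)
```

In this toy case every sample class has only 2 records, so a k=5 rule applied to the sample would force further generalization even though each disclosed record matches 100 people in the population.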

Conclusions:

The hypothesis testing approach is recommended over baseline k-anonymity for certain scenarios.
Guidelines are established for choosing between hypothesis testing and k-anonymity.
Balancing privacy protection and data utility is crucial in health information sharing.