DIRI: Adversarial Patient Reidentification with Large Language Models for Evaluating Clinical Text Anonymization.

John X Morris¹, Thomas R Campion¹, Sri Laasya Nutheti¹

  • ¹Cornell Tech, New York, NY.

AMIA Joint Summits on Translational Science Proceedings | June 12, 2025
Summary
This summary is machine-generated.

Current deidentification methods fail to fully protect patient privacy in clinical notes. An adversarial large language model (LLM) approach successfully re-identified 9% of notes, revealing weaknesses in existing tools.

Area of Science:

  • Biomedical Informatics
  • Natural Language Processing
  • Data Privacy

Background:

  • Sharing protected health information (PHI) is vital for biomedical research.
  • Deidentification removes PHI from clinical text before the data are distributed.
  • Current deidentification methods are often evaluated on limited datasets, so reported accuracy may overstate real-world performance.

Purpose of the Study:

  • To develop and evaluate a novel adversarial method that uses a large language model (LLM) to re-identify patients from deidentified clinical notes.
  • To assess the effectiveness of state-of-the-art deidentification tools against a re-identification attack.
  • To highlight limitations in current deidentification technologies and provide a tool for iterative improvement.

Main Methods:

  • Developed an adversarial approach using a large language model (LLM) for re-identification.
  • Introduced a De-Identification/Re-Identification (DIRI) method to evaluate deidentification tool performance.
  • Tested the method on clinical data from Weill Cornell Medicine anonymized using Philter, BiLSTM-CRF, and ClinicalBERT.
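
The deidentify-then-re-identify loop described above can be illustrated with a small sketch. This is not the authors' implementation: the regex redactor below is a hypothetical stand-in for a deidentification tool such as Philter, the token-overlap matcher is a hypothetical stand-in for the LLM adversary, and all notes, names, and patient profiles are invented for illustration.

```python
import re

def deidentify(note: str, known_names: list[str]) -> str:
    """Toy redactor (assumption): masks MM/DD/YYYY dates and listed names."""
    redacted = re.sub(r"\b\d{1,2}/\d{1,2}/\d{4}\b", "[DATE]", note)
    for name in known_names:
        redacted = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", redacted)
    return redacted

def tokens(text: str) -> set[str]:
    """Lowercased alphabetic tokens of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def reidentify(redacted_note: str, profiles: dict[str, str]) -> str:
    """Toy adversary (assumption): picks the candidate patient whose
    profile shares the most tokens with the redacted note."""
    note_tokens = tokens(redacted_note)
    return max(profiles, key=lambda pid: len(note_tokens & tokens(profiles[pid])))

def reidentification_rate(notes, profiles, known_names) -> float:
    """Fraction of notes whose true patient is recovered after redaction."""
    hits = 0
    for true_id, note in notes:
        guess = reidentify(deidentify(note, known_names), profiles)
        hits += guess == true_id
    return hits / len(notes)

# Invented example data: direct identifiers (names, dates) are masked, but
# quasi-identifiers (occupation, condition) survive redaction.
profiles = {
    "p1": "retired firefighter with asthma living in Queens",
    "p2": "kindergarten teacher with migraine and celiac disease",
}
notes = [
    ("p1", "Jane Doe seen 03/04/2021. Retired firefighter, asthma flare."),
    ("p2", "John Roe seen 05/06/2021. Teacher reports migraine."),
]
known_names = ["Jane Doe", "John Roe"]

rate = reidentification_rate(notes, profiles, known_names)
```

In this toy setup the names and dates are masked, yet the remaining clinical and occupational details are enough for the matcher to link each note back to a profile. The re-identification rate therefore measures what the redactor leaves behind rather than what it removes.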

Main Results:

  • Even on notes processed by ClinicalBERT, the most effective of the three deidentification tools tested, the LLM-based re-identification tool re-identified 9% of clinical notes.
  • This demonstrates significant weaknesses in current deidentification technologies.
  • The DIRI method provides a robust evaluation framework for deidentification tools.
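
How precisely a rate such as 9% is known depends on how many notes were tested, which this summary does not state. The sketch below computes a Wilson score interval for a proportion. The count of 100 notes is a purely hypothetical assumption chosen for illustration and is not a figure from the study.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 ~ 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Hypothetical sample size (assumption): 9 re-identified notes out of 100.
low, high = wilson_interval(9, 100)
```

With a small hypothetical sample the interval is wide, so a rate reported from a deidentification benchmark is best read together with the number of notes evaluated.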

Conclusions:

  • Existing deidentification technologies exhibit significant vulnerabilities.
  • The LLM-based re-identification method developed in this study can expose these weaknesses.
  • Continuous improvement and novel approaches are necessary to ensure robust patient privacy in biomedical research.