


Related Experiment Video

Updated: Jun 16, 2026

Augmenting Large Language Models via Vector Embeddings to Improve Domain-Specific Responsiveness

Published on: December 6, 2024

Generalizable multilingual medical text anonymization using generative instruction tuning.

Chenghao Xiao (1), G Thomas Hudson (2, 3), Matthew Watson (1)

  • (1) Department of Computer Science, Durham University, Durham, UK.

Communications Medicine | June 13, 2026
Summary
This summary is machine-generated.

This study introduces an annotation-free framework for privacy-preserving medical text anonymization using generative large language models (LLMs). The approach effectively removes sensitive data while preserving clinical meaning across diverse medical domains and languages.


Area of Science:

  • Medical Informatics
  • Natural Language Processing
  • Data Privacy

Background:

  • High-quality medical data is crucial for research but contains sensitive patient information.
  • Current anonymization methods are domain-specific, depend on manually annotated data, and are difficult to scale.
  • A scalable, privacy-preserving solution is needed for utilizing unstructured clinical text.

Purpose of the Study:

  • To develop a reproducible, annotation-free framework for training and adapting LLM-based medical text anonymization models.
  • To enable privacy-preserving use of medical text across diverse settings and languages.
  • To reduce reliance on manual annotation and real patient data.

Main Methods:

  • Developed a generative medical anonymization model using synthetic data and instruction tuning of generative LLMs.
  • Created an annotation-free framework for training and adapting models.
  • Evaluated performance on synthetic datasets and real-world patient requests, assessing accuracy, recall, precision, and meaning preservation.
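
One way to picture annotation-free training data is sketched below in Python. It is only an illustration of the general idea, not the authors' pipeline: the templates, the fake values, the tag format (`[NAME]`, `[DATE]`, `[PHONE]`), the instruction wording, and the function names `make_example` and `build_dataset` are all assumptions. Each template is filled once with fake values to form the model input and once with category tags to form the target, so no manual labelling and no real patient text is needed.

```python
import random

# Hypothetical templates and fake values for illustration only;
# none of these come from the study itself.
TEMPLATES = [
    "Patient {NAME} (DOB {DATE}) reports chest pain since last week.",
    "Please call {NAME} on {PHONE} about the blood test results.",
]
FAKE_VALUES = {
    "NAME": ["Alice Moreau", "Rahul Singh"],
    "DATE": ["03/04/1961", "1987-11-20"],
    "PHONE": ["0191 555 0142", "555-0199"],
}
# Assumed instruction text, not the prompt used in the paper.
INSTRUCTION = "Replace every piece of sensitive information with its category tag."


def make_example(template, rng):
    """Fill one template with fake values (input) and with tags (target)."""
    fields = [k for k in FAKE_VALUES if "{" + k + "}" in template]
    values = {k: rng.choice(FAKE_VALUES[k]) for k in fields}
    tags = {k: f"[{k}]" for k in fields}
    return {
        "instruction": INSTRUCTION,
        "input": template.format(**values),
        "output": template.format(**tags),
        "entities": values,  # known by construction, so no annotation step
    }


def build_dataset(n, seed=0):
    """Return n synthetic instruction-tuning examples, reproducible via seed."""
    rng = random.Random(seed)
    return [make_example(rng.choice(TEMPLATES), rng) for _ in range(n)]


if __name__ == "__main__":
    for example in build_dataset(2):
        print(example["input"], "->", example["output"])
```

Because the sensitive strings are inserted by the generator, the labels come for free, which is the property that makes such a setup annotation-free. A real system would presumably use far more varied text than two fixed templates.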

Main Results:

  • Generative models trained with the synthetic framework outperformed baseline systems across multiple medical domains.
  • Models achieved high accuracy in anonymizing sensitive information and high fidelity in preserving non-sensitive text.
  • The framework demonstrated effectiveness with small datasets, generalization to unseen fields, and multilingual support without additional training.
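
The two qualities reported above, removing sensitive information and preserving non-sensitive text, can be scored in many ways. The Python sketch below shows one simple possibility and is not the evaluation protocol used in the study: the `[TAG]` format, the function names, and the choice of `difflib.SequenceMatcher` as a fidelity proxy are all assumptions.

```python
import re
from collections import Counter
from difflib import SequenceMatcher

TAG_PATTERN = re.compile(r"\[[A-Z]+\]")  # assumed tag format, e.g. [NAME]


def tag_precision_recall(reference, predicted):
    """Compare category tags in the predicted text against the reference text."""
    ref_tags = Counter(TAG_PATTERN.findall(reference))
    pred_tags = Counter(TAG_PATTERN.findall(predicted))
    overlap = sum((ref_tags & pred_tags).values())
    n_pred = sum(pred_tags.values())
    n_ref = sum(ref_tags.values())
    # Convention chosen here: an empty denominator scores 1.0.
    precision = overlap / n_pred if n_pred else 1.0
    recall = overlap / n_ref if n_ref else 1.0
    return precision, recall


def removal_rate(entities, predicted):
    """Fraction of known sensitive strings that no longer appear verbatim."""
    if not entities:
        return 1.0
    removed = sum(1 for entity in entities if entity not in predicted)
    return removed / len(entities)


def fidelity(reference, predicted):
    """Character-level similarity (0 to 1) to the reference anonymized text."""
    return SequenceMatcher(None, reference, predicted).ratio()


if __name__ == "__main__":
    reference = "Patient [NAME] (DOB [DATE]) reports chest pain."
    predicted = "Patient [NAME] (DOB 03/04/1961) reports [NAME] pain."
    print(tag_precision_recall(reference, predicted))
    print(removal_rate(["Alice Moreau", "03/04/1961"], predicted))
    print(round(fidelity(reference, predicted), 3))
```

In this toy case the prediction misses the date and wrongly tags a non-sensitive word, so both tag precision and tag recall drop to 0.5, and the removal rate is also 0.5 because the date string survives. Exact string matching is a deliberately crude check; it would miss reformatted or partially leaked values.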

Conclusions:

  • The study presents a reproducible, annotation-free approach for effective medical text anonymization.
  • This framework reduces the need for real patient data and lowers adaptation costs.
  • It facilitates broader use of unstructured clinical information for research and service improvement.