Related Experiment Video

Updated: Jun 16, 2026

Augmenting Large Language Models via Vector Embeddings to Improve Domain-Specific Responsiveness

Published on: December 6, 2024

Generalizable multilingual medical text anonymization using generative instruction tuning.

Chenghao Xiao¹, G Thomas Hudson²,³, Matthew Watson¹

¹Department of Computer Science, Durham University, Durham, UK.

Communications Medicine | June 13, 2026

Summary

This summary is machine-generated.

This study introduces an annotation-free framework for privacy-preserving medical text anonymization using generative large language models (LLMs). The approach effectively removes sensitive data while preserving clinical meaning across diverse medical domains and languages.

Area of Science:

Medical Informatics
Natural Language Processing
Data Privacy

Background:

High-quality medical data is crucial for research but contains sensitive patient information.
Current anonymization methods are domain-specific, depend on manually annotated data, and are difficult to scale.
A scalable, privacy-preserving solution is needed for utilizing unstructured clinical text.

Purpose of the Study:

To develop a reproducible, annotation-free framework for training and adapting LLM-based medical text anonymization models.
To enable privacy-preserving use of medical text across diverse settings and languages.
To reduce reliance on manual annotation and real patient data.

Main Methods:

Developed a generative medical anonymization model using synthetic data and instruction tuning of generative LLMs.
Created an annotation-free framework for training and adapting models.
Evaluated performance on synthetic datasets and real-world patient requests, assessing accuracy, recall, precision, and meaning preservation.
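
The summary does not describe how the synthetic training data were generated, so the following is only a minimal sketch of one way annotation-free instruction-tuning pairs could be built. The templates, fake values, placeholder tags, instruction wording, and the function names `make_pair` and `build_dataset` are all hypothetical and are not taken from the study.

```python
import random

# Hypothetical templates and fake values, invented for illustration only.
TEMPLATES = [
    "Patient {name}, born {dob}, reports chest pain since {date}.",
    "{name} (phone {phone}) requests a repeat prescription for metformin.",
]
FAKE_VALUES = {
    "name": ["Alice Moreau", "John Smith"],
    "dob": ["1984-03-12", "1990-11-02"],
    "date": ["2024-01-05", "2024-02-17"],
    "phone": ["555-0101", "555-0199"],
}
# Assumed instruction wording; the study's actual prompt is not given.
INSTRUCTION = (
    "Replace every piece of personal information with a [TAG] placeholder "
    "and keep all other text unchanged."
)


def make_pair(template, rng):
    """Fill one template twice: with fake values (input) and with tags (output)."""
    fields = [f for f in FAKE_VALUES if "{" + f + "}" in template]
    values = {f: rng.choice(FAKE_VALUES[f]) for f in fields}
    tags = {f: "[" + f.upper() + "]" for f in fields}
    return {
        "instruction": INSTRUCTION,
        "input": template.format(**values),
        "output": template.format(**tags),
    }


def build_dataset(n, seed=0):
    """Return n synthetic (instruction, input, output) records."""
    rng = random.Random(seed)
    return [make_pair(rng.choice(TEMPLATES), rng) for _ in range(n)]


if __name__ == "__main__":
    for record in build_dataset(2):
        print(record["input"], "->", record["output"])
```

Because the sensitive values are inserted programmatically, the target text is known by construction and no human labelling step is needed, which is the property the word "annotation-free" suggests.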

Main Results:

Generative models trained with the synthetic framework outperformed baseline systems across multiple medical domains.
Models achieved high accuracy in anonymizing sensitive information and high fidelity in preserving non-sensitive text.
The framework demonstrated effectiveness with small datasets, generalization to unseen fields, and multilingual support without additional training.
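
The summary reports accuracy on sensitive information and fidelity on non-sensitive text without defining the metrics. As a rough illustration only, the sketch below uses one plausible token-level definition; the tokenizer, the function name `anonymization_scores`, and the exact formulas are assumptions and may differ from those used in the study.

```python
import re


def tokenize(text):
    """Split into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def anonymization_scores(source, output, sensitive_spans):
    """Token-level precision, recall, and fidelity for one anonymized text.

    precision: share of removed source tokens that were truly sensitive
    recall:    share of sensitive source tokens that were removed
    fidelity:  share of non-sensitive source tokens still present in the output
    """
    src = tokenize(source)
    out = set(tokenize(output))
    sens = {tok for span in sensitive_spans for tok in tokenize(span)}

    removed = [t for t in src if t not in out]
    sensitive_tokens = [t for t in src if t in sens]
    non_sensitive = [t for t in src if t not in sens]

    true_pos = sum(1 for t in removed if t in sens)
    kept = sum(1 for t in non_sensitive if t in out)

    precision = true_pos / len(removed) if removed else 1.0
    recall = true_pos / len(sensitive_tokens) if sensitive_tokens else 1.0
    fidelity = kept / len(non_sensitive) if non_sensitive else 1.0
    return precision, recall, fidelity


if __name__ == "__main__":
    src = "Patient Alice Moreau reports chest pain"
    print(anonymization_scores(src, "Patient [NAME] reports chest pain", ["Alice Moreau"]))
```

Reporting fidelity alongside recall matters because a model that deletes everything would reach perfect recall while destroying the clinical content.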

Conclusions:

The study presents a reproducible, annotation-free approach for effective medical text anonymization.
This framework reduces the need for real patient data and lowers adaptation costs.
It facilitates broader use of unstructured clinical information for research and service improvement.