Leveraging large language models for structured information extraction from pathology reports.

Jeya Balaji Balasubramanian¹, Daniel Adams²,³, Ioannis Roxanis⁴

¹Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Dr, NCI Shady Grove, Room 7E554, Rockville, MD 20850, USA.

Journal of Pathology Informatics

December 4, 2025

Summary

This summary is machine-generated.

Large language models (LLMs) achieve human-level accuracy in extracting structured data from breast cancer histopathology reports. This automated approach enhances data accessibility for clinical research, offering a scalable alternative to manual extraction.

Keywords:

Artificial intelligence Information storage and retrieval Natural language processing Pathology Semantic web

Area of Science:

Computational pathology
Natural Language Processing
Medical Informatics

Background:

Structured information extraction from unstructured histopathology reports is crucial for clinical research data accessibility.
Manual extraction is time-consuming and limits scalability.
Large language models (LLMs) offer automated extraction via zero-shot prompting, eliminating the need for labeled data or training.
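
As a rough illustration of zero-shot extraction, the sketch below builds a prompt from field descriptions alone (no labeled examples) and parses a JSON reply. It is not the authors' implementation: the field names in `FEATURES`, the prompt wording, and the `fake_llm` stand-in are all assumptions made for this example, and a real deployment would replace `fake_llm` with a call to an actual model.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical field definitions for illustration only; the study's
# 51 pathology features are not listed in this summary.
FEATURES: Dict[str, str] = {
    "tumour_grade": "Histological grade as an integer 1-3, or null if not stated.",
    "er_status": "Oestrogen receptor status: 'positive', 'negative', or null.",
}


def build_prompt(report_text: str, features: Dict[str, str]) -> str:
    """Assemble a zero-shot prompt: instructions and field descriptions only."""
    lines = [
        "Extract the following fields from the pathology report.",
        "Reply with a single JSON object and nothing else.",
        "",
        "Fields:",
    ]
    for name, description in features.items():
        lines.append(f"- {name}: {description}")
    lines += ["", "Report:", report_text]
    return "\n".join(lines)


def extract(
    report_text: str,
    llm: Callable[[str], str],
    features: Dict[str, str] = FEATURES,
) -> Dict[str, Any]:
    """Send the prompt to `llm` and keep only the requested fields."""
    raw = llm(build_prompt(report_text, features))
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    # Missing fields become None; unrequested keys are dropped.
    return {name: parsed.get(name) for name in features}


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call (assumption); returns a canned reply."""
    return '{"tumour_grade": 2, "er_status": "positive", "extra": 1}'


if __name__ == "__main__":
    print(extract("Grade 2 invasive ductal carcinoma, ER positive.", fake_llm))
```

Because the fields are described in natural language, adding or changing a field in this sketch means editing a dictionary entry rather than retraining a model.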

Purpose of the Study:

To evaluate the accuracy of LLMs in extracting structured information from breast cancer histopathology reports.
To compare LLM performance against manual extraction by a trained human annotator.

Main Methods:

Developed the Medical Report Information Extractor web application utilizing LLMs.
Created a gold-standard dataset for evaluation.
Assessed five LLMs, including GPT-4o and Llama 3 models, on 111 breast cancer histopathology reports, extracting 51 pathology features.
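
One plausible way to score extractions against a gold standard is field-level accuracy: the share of (report, feature) pairs where the extracted value matches the reference. The sketch below assumes exact matching after light normalisation; the summary does not state the study's actual scoring rule, so the `normalise` and `field_accuracy` helpers and the sample records are illustrative only. If every feature were scored on every report, 111 reports and 51 features would give 111 × 51 = 5,661 comparisons per model.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def normalise(value: Any) -> Any:
    """Lower-case and trim strings; treat empty strings as missing."""
    if isinstance(value, str):
        cleaned = value.strip().lower()
        return cleaned or None
    return value


def field_accuracy(predicted: List[Record], gold: List[Record]) -> float:
    """Fraction of gold-standard fields reproduced exactly by the predictions."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same reports")
    correct = 0
    total = 0
    for pred, ref in zip(predicted, gold):
        # Score every field present in the gold record for this report.
        for field, expected in ref.items():
            total += 1
            if normalise(pred.get(field)) == normalise(expected):
                correct += 1
    if total == 0:
        raise ValueError("no fields to score")
    return correct / total


if __name__ == "__main__":
    # Toy records (assumption), not data from the study.
    gold = [
        {"grade": 2, "er_status": "Positive"},
        {"grade": 3, "er_status": "negative"},
    ]
    predicted = [
        {"grade": 2, "er_status": "positive "},
        {"grade": 1, "er_status": "negative"},
    ]
    print(field_accuracy(predicted, gold))  # 3 of 4 fields match: 0.75
```

The same function can score a human annotator's output, which puts model and human accuracy on the same footing.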

Main Results:

GPT-4o (96.1% accuracy) and Llama 3.1 405B (94.7%) achieved accuracy comparable to that of the human annotator (95.4%).
Llama 3.1 70B (91.6%) performed below human accuracy but offers a viable self-hosting option due to lower computational needs.

Conclusions:

An open-source tool for structured information extraction achieved expert human-level accuracy using state-of-the-art LLMs.
The tool is customizable via natural language and promotes data standardization, accessibility, and interoperability for analytics.