Leveraging large language models for structured information extraction from pathology reports.

Jeya Balaji Balasubramanian (1), Daniel Adams (2, 3), Ioannis Roxanis (4)

  • (1) Division of Cancer Epidemiology and Genetics, National Cancer Institute, 9609 Medical Center Dr, NCI Shady Grove, Room 7E554, Rockville, MD 20850, USA.

Journal of Pathology Informatics | December 4, 2025
Summary
This summary is machine-generated.

State-of-the-art large language models (LLMs) matched the accuracy of a trained human annotator when extracting structured data from breast cancer histopathology reports. This automated approach makes report data more accessible for clinical research and offers a scalable alternative to manual extraction.

Keywords:
Artificial intelligence; Information storage and retrieval; Natural language processing; Pathology; Semantic web

Area of Science:

  • Computational pathology
  • Natural language processing
  • Medical informatics

Background:

  • Structured information extraction from unstructured histopathology reports is crucial for clinical research data accessibility.
  • Manual extraction is time-consuming and limits scalability.
  • Large language models (LLMs) offer automated extraction via zero-shot prompting, eliminating the need for labeled data or training.
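
Zero-shot prompting, as described above, means the model receives only an instruction and the report text, with no labeled examples. A minimal sketch of how such a prompt might be assembled and its reply parsed follows. The feature names, their descriptions, and the helper functions `build_zero_shot_prompt` and `parse_response` are hypothetical illustrations, not taken from the study or its tool, and the call to an actual LLM is deliberately left out.

```python
import json

# Hypothetical feature definitions written in natural language.
# The study's actual 51 pathology features are not listed in this summary.
FEATURES = {
    "tumor_grade": "Histological grade of the tumor (1, 2, or 3), or null if not stated.",
    "er_status": "Estrogen receptor status: 'positive', 'negative', or null if not stated.",
}


def build_zero_shot_prompt(report_text: str, features: dict) -> str:
    """Assemble one instruction prompt with no worked examples (zero-shot)."""
    lines = [
        "Extract the following fields from the pathology report.",
        "Reply with one JSON object whose keys are exactly the field names.",
        "",
        "Fields:",
    ]
    lines += [f"- {name}: {desc}" for name, desc in features.items()]
    lines += ["", "Report:", report_text]
    return "\n".join(lines)


def parse_response(raw: str, features: dict) -> dict:
    """Parse the model's JSON reply, keeping only the requested fields."""
    data = json.loads(raw)
    return {name: data.get(name) for name in features}


prompt = build_zero_shot_prompt(
    "Invasive ductal carcinoma, grade 2. ER positive.", FEATURES
)
# A reply such as this one would come back from whichever LLM is used.
example_reply = '{"tumor_grade": 2, "er_status": "positive", "comment": "n/a"}'
extracted = parse_response(example_reply, FEATURES)
```

Because the fields are described in plain language, changing what gets extracted only requires editing the `FEATURES` dictionary; no labeled data or retraining is involved in this sketch.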

Purpose of the Study:

  • To evaluate the accuracy of LLMs in extracting structured information from breast cancer histopathology reports.
  • To compare LLM performance against manual extraction by a trained human annotator.

Main Methods:

  • Developed the Medical Report Information Extractor, a web application that uses LLMs.
  • Created a gold-standard dataset for evaluation.
  • Assessed five LLMs, including GPT-4o and Llama 3 models, on 111 breast cancer histopathology reports, extracting 51 pathology features.
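
Evaluation against a gold standard can be sketched as a cell-by-cell comparison. If every one of the 51 features were scored for each of the 111 reports, that would give 111 × 51 = 5,661 comparisons per model; the summary does not say whether the study scored every cell this way. The exact-match rule, the `normalize` helper, and the sample values below are assumptions for illustration only.

```python
def normalize(value):
    """Compare values case-insensitively, ignoring surrounding whitespace."""
    return None if value is None else str(value).strip().lower()


def accuracy(predicted: list, gold: list, features: list) -> float:
    """Fraction of (report, feature) cells where the prediction matches gold."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same reports")
    total = 0
    correct = 0
    for pred, ref in zip(predicted, gold):
        for feature in features:
            total += 1
            if normalize(pred.get(feature)) == normalize(ref.get(feature)):
                correct += 1
    return correct / total if total else 0.0


# Made-up toy data: two reports, two features.
gold = [{"grade": "2", "er": "Positive"}, {"grade": "3", "er": "negative"}]
pred = [{"grade": 2, "er": "positive"}, {"grade": "3", "er": "positive"}]
score = accuracy(pred, gold, ["grade", "er"])  # 3 of 4 cells match

max_cells = 111 * 51  # upper bound on comparisons per model
```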

Main Results:

  • GPT-4o (96.1% accuracy) and Llama 3.1 405B (94.7%) achieved accuracy comparable to that of the human annotator (95.4%).
  • Llama 3.1 70B (91.6%) scored below the human annotator but remains a viable option for self-hosting because of its lower computational requirements.
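
To make "comparable" concrete, the reported figures can be expressed as percentage-point gaps relative to the human annotator. The sketch below uses only the accuracies quoted above; the function name `gap_to_human` is illustrative, and the calculation says nothing about statistical significance, which this summary does not report.

```python
# Accuracies as reported in the summary (percent).
reported = {
    "GPT-4o": 96.1,
    "Llama 3.1 405B": 94.7,
    "Llama 3.1 70B": 91.6,
}
HUMAN = 95.4  # trained human annotator


def gap_to_human(acc: dict, human: float) -> dict:
    """Percentage-point difference between each model and the human annotator."""
    return {name: round(value - human, 1) for name, value in acc.items()}


gaps = gap_to_human(reported, HUMAN)
# GPT-4o: +0.7, Llama 3.1 405B: -0.7, Llama 3.1 70B: -3.8
```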

Conclusions:

  • An open-source tool for structured information extraction achieved expert human-level accuracy using state-of-the-art LLMs.
  • The tool is customizable via natural language and promotes data standardization, accessibility, and interoperability for analytics.