Phenotyping Prostate Cancer in a National Health System Using Large Language Models.

Michael P Dykstra1,2, Phoebe A Tsao3,4,5, Megan E V Caram3,4,5

  • 1Department of Radiation Oncology, Veterans Affairs Ann Arbor Healthcare System, Ann Arbor, MI.

JCO Clinical Cancer Informatics | June 12, 2026
Summary

This summary is machine-generated.

Large language models (LLMs) accurately extract prostate cancer data from clinical reports, outperforming traditional methods. Ambiguous language in radiology reports is the main challenge for optimal LLM performance.

Area of Science:

  • Medical Informatics
  • Natural Language Processing
  • Oncology

Background:

  • Extracting prognostic variables from unstructured clinical text is crucial for prostate cancer management.
  • Traditional rule-based natural language processing (NLP) methods have limitations in accurately capturing complex clinical information.
  • Large language models (LLMs) show promise in advancing automated information extraction from clinical narratives.

Purpose of the Study:

  • To evaluate the efficacy of LLMs in extracting key prostate cancer phenotypes from diverse unstructured clinical reports.
  • To compare LLM performance against traditional NLP methods for prognostic variable extraction.
  • To identify challenges and limitations in LLM-based clinical text analysis.

Main Methods:

  • Iterative prompt engineering with few-shot examples was used to develop LLM prompts for 30 phenotypes.

  • Data were sourced from pathology (biopsy, radical prostatectomy, TURP) and radiology (MRI, CT, bone scan, PSMA PET/CT) reports from over 130 VA facilities.
  • LLM inference was performed using Llama 3.3 70B or GPT-4o, with performance assessed using accuracy, sensitivity, PPV, NPV, and macro-F1 metrics on independent test sets.
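
The summary does not reproduce the study's prompts, so the following is only a minimal sketch of how a few-shot prompt for one phenotype might be assembled. The function name `build_few_shot_prompt`, the instruction wording, the example reports, and the JSON answer format are all hypothetical illustrations, not the authors' materials.

```python
def build_few_shot_prompt(instruction, examples, report_text):
    """Assemble a few-shot prompt: instruction, worked examples, then the new report.

    `examples` is a list of dicts with hypothetical keys "report" and "answer".
    """
    parts = [instruction.strip(), ""]
    for ex in examples:
        parts.append(f"Report: {ex['report']}")
        parts.append(f"Answer: {ex['answer']}")
        parts.append("")  # blank line between worked examples
    # The unanswered report goes last so the model completes the final answer.
    parts.append(f"Report: {report_text.strip()}")
    parts.append("Answer:")
    return "\n".join(parts)


# Invented instruction and example snippets, for illustration only.
instruction = (
    "Extract the total number of biopsy cores and the number of involved cores. "
    'Reply as JSON with keys "total_cores" and "involved_cores".'
)
examples = [
    {
        "report": "Twelve cores submitted; adenocarcinoma present in 3 of 12 cores.",
        "answer": '{"total_cores": 12, "involved_cores": 3}',
    },
    {
        "report": "Fourteen cores examined, all benign prostatic tissue.",
        "answer": '{"total_cores": 14, "involved_cores": 0}',
    },
]
prompt = build_few_shot_prompt(
    instruction, examples, "Ten cores received; carcinoma identified in 2 cores."
)
print(prompt)
```

In an iterative workflow of the kind described, the instruction and the example set would be revised after reviewing errors on a development set, and the resulting string would be passed to whichever model is used for inference.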

Main Results:

  • LLMs achieved near-perfect accuracy in pathology extraction tasks, including total and involved cores from biopsy reports (accuracy >95%).
  • Excellent performance was observed for extracting PI-RADS scores, lesion locations, and dimensions from pelvic MRI reports (accuracy >98%).
  • High positive predictive values (PPVs) were achieved for nodal and bone metastases extraction from PSMA PET/CT (PPV >97.9%), with comparable Tc-99m bone scan performance. Lower PPVs were noted for MRI and CT due to ambiguous language.
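
The metrics named in this summary (accuracy, sensitivity, PPV, NPV, macro-F1) follow standard definitions. The sketch below computes them in plain Python on small made-up label lists; the function names and the toy data are assumptions for illustration and are not drawn from the study.

```python
def safe_div(num, den):
    """Return num / den, or None when the denominator is zero (metric undefined)."""
    return num / den if den else None


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, PPV, and NPV for binary labels (1 = finding present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "sensitivity": safe_div(tp, tp + fn),  # true positives among actual positives
        "ppv": safe_div(tp, tp + fp),          # true positives among predicted positives
        "npv": safe_div(tn, tn + fn),          # true negatives among predicted negatives
    }


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every class seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, invented for illustration (e.g. bone metastasis present or absent).
truth = [1, 1, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(truth, preds)

# Toy multi-class labels (e.g. a categorical score), also invented.
score_truth = [3, 4, 5, 4, 3]
score_preds = [3, 4, 4, 4, 3]
f1 = macro_f1(score_truth, score_preds)
print(metrics, f1)
```

Because PPV depends only on the cases the model labels positive, reports with hedged wording that are extracted as positive but annotated as negative lower PPV without necessarily changing sensitivity.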

Conclusions:

  • LLMs demonstrate high reliability in extracting critical prostate cancer phenotypes across various pathology and radiology report types.
  • The study highlights the potential of LLMs to significantly improve the extraction of prognostic variables from unstructured clinical text.
  • Ambiguous or indeterminate language within radiology reports presents the primary obstacle to achieving optimal LLM performance in clinical data extraction.