Updated: May 29, 2026

Automated RECIST tumor response classification through prompt-guided large language models.

Markus Mergen1,2, Felix Busch3, Andreas P Sauter3

  • 1Department of Diagnostic and Interventional Radiology, Technical University of Munich, School of Medicine and Health, Klinikum rechts der Isar, TUM University Hospital, 81675, Munich, Germany. markus.mergen@tum.de.

Scientific Reports | May 27, 2026
Summary

This summary is machine-generated.

An offline large language model (LLM) classified oncology radiology reports by tumor response category under the Response Evaluation Criteria in Solid Tumors (RECIST), using prompting strategies alone and no fine-tuning. Chain-of-thought prompting achieved the best results, and offline deployment kept patient data private.

Area of Science:

  • Medical Imaging and Radiology
  • Artificial Intelligence in Healthcare
  • Oncology

Background:

  • Accurate tumor response assessment is crucial for cancer treatment evaluation.
  • Manual classification of radiology reports can be time-consuming and prone to variability.
  • Large language models (LLMs) show potential for automating clinical text analysis.

Purpose of the Study:

  • To evaluate an offline, general-purpose LLM's ability to classify radiology reports according to Response Evaluation Criteria in Solid Tumors (RECIST) guidelines.
  • To assess the impact of different prompting strategies (zero-shot, few-shot, chain-of-thought) on classification accuracy.
  • To ensure privacy-preserving tumor response assessment without model fine-tuning.
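
To make the three prompting strategies concrete, here is a minimal Python sketch of how such prompts might be assembled. The prompt wording, the function names, and the `INSTRUCTION` text are all illustrative assumptions; the summary does not reproduce the study's actual prompts. The category names are the RECIST categories listed under Main Methods.

```python
# Hypothetical prompt builders. The wording below is an assumption made for
# illustration; it is not the prompt text used in the study.
RECIST_CATEGORIES = [
    "Baseline",
    "Complete Response",
    "Partial Response",
    "Stable Disease",
    "Progressive Disease",
]

INSTRUCTION = (
    "Classify the following CT report into exactly one RECIST category: "
    + ", ".join(RECIST_CATEGORIES)
    + "."
)


def zero_shot_prompt(report: str) -> str:
    """Zero-shot: the instruction and the report, with no examples."""
    return f"{INSTRUCTION}\n\nReport:\n{report}\n\nCategory:"


def few_shot_prompt(report: str, examples: list[tuple[str, str]]) -> str:
    """Few-shot: labeled (report, category) examples precede the new report."""
    shots = "\n\n".join(
        f"Report:\n{text}\nCategory: {label}" for text, label in examples
    )
    return f"{INSTRUCTION}\n\n{shots}\n\nReport:\n{report}\nCategory:"


def chain_of_thought_prompt(report: str) -> str:
    """Chain-of-thought: ask for step by step reasoning before the answer."""
    return (
        f"{INSTRUCTION}\n"
        "Reason step by step about the findings in the report, then give the "
        "answer on a final line formatted as 'Category: <name>'.\n\n"
        f"Report:\n{report}\n\nReasoning:"
    )
```

Each function returns a plain string, so the same report can be sent to the model under all three strategies and the outputs compared against the expert label.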

Main Methods:

  • An in-house, offline LLaMA-3.3 (70B) model was used to process CT imaging reports from oncology patients.
  • Reports were classified into RECIST categories (Baseline, Complete Response, Partial Response, Stable Disease, Progressive Disease) using three prompting strategies.
  • Model performance was benchmarked against expert labels using accuracy, precision, recall, and F1 scores.
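
The benchmarking metrics named above can be sketched in a few lines of Python. This is a generic illustration of accuracy, precision, recall, and F1 (per class and micro-averaged), not the authors' evaluation code; all function names are assumptions. One point worth noting: when every report has exactly one expert label and exactly one predicted label, each error counts once as a false positive and once as a false negative, so micro-averaged precision, recall, and F1 all equal plain accuracy.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, and false negatives per label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong for this report
            fn[truth] += 1  # expert label was missed for this report
    return tp, fp, fn


def scores_from_counts(tp, fp, fn):
    """Return (precision, recall, F1), using 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def micro_scores(y_true, y_pred):
    """Micro-average: pool the counts over all labels, then compute scores."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    return scores_from_counts(
        sum(tp.values()), sum(fp.values()), sum(fn.values())
    )


def per_class_scores(y_true, y_pred, labels):
    """Precision, recall, and F1 computed separately for each label."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    return {
        label: scores_from_counts(tp[label], fp[label], fn[label])
        for label in labels
    }


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted label matches the expert label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

Here `y_true` would hold the expert labels and `y_pred` the categories parsed from the model output, in the same order.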

Main Results:

  • The LLM achieved strong classification performance across all prompting strategies.
  • Chain-of-thought prompting yielded the best results, with a micro F1 score of 0.81.
  • Model predictions showed good alignment with human expert assessments.
  • The offline system maintained strict data privacy compliance.

Conclusions:

  • Prompt-driven LLMs can accurately and reliably classify tumor response categories from real-world radiology reports.
  • Offline LLM deployment, coupled with optimized prompting, offers a scalable and privacy-preserving solution for oncology report interpretation.
  • This approach has the potential to enhance consistency and efficiency in clinical decision support.