



Beyond Diagnosis: Evaluating Multimodal LLMs for Pathology Localization in Chest Radiographs.

Advait Gosai1, Arun Kavishwar2, Stephanie L McNamara3

  • 1University of California, Berkeley.

Proceedings of Machine Learning Research | June 3, 2026
Summary
This summary is machine-generated.

Multimodal large language models (MLLMs) show potential in medical imaging but struggle to localize pathology precisely on chest radiographs. The evaluated models score below both a task-specific CNN baseline and a radiologist benchmark, so reliable clinical use would likely require pairing them with specialized tools.

Keywords:
Chest Radiographs, Disease Localization, Multimodal LLMs


Area of Science:

  • Artificial Intelligence in Medicine
  • Medical Imaging Analysis
  • Natural Language Processing

Background:

  • Large language models (LLMs) and multimodal LLMs (MLLMs) demonstrate promise in medical diagnosis.
  • Beyond reaching a diagnosis, medical image interpretation requires accurately localizing pathological findings.
  • Evaluating localization abilities offers insights into models' spatial understanding of anatomy and disease.

Purpose of the Study:

  • To systematically assess the pathology localization capabilities of general-purpose MLLMs (GPT-4, GPT-5) and a domain-specific model (MedGemma) on chest radiographs.
  • To compare MLLM performance against a task-specific Convolutional Neural Network (CNN) baseline and a human radiologist benchmark.
  • To analyze model errors and identify areas for improvement in spatial understanding and localization accuracy.

Main Methods:

  • A prompting pipeline was developed, overlaying a spatial grid on chest radiographs to elicit coordinate-based pathology predictions.
  • Two general-purpose MLLMs (GPT-4, GPT-5) and one domain-specific MLLM (MedGemma) were evaluated.
  • Performance was assessed on the CheXlocalize dataset across nine distinct pathologies, comparing results against CNN and radiologist benchmarks.
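
The grid-based prompting idea above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' pipeline: the names `Box`, `cell_center`, and `localization_accuracy` are hypothetical, the model is assumed to answer with a 0-indexed (column, row) grid cell, the ground truth is simplified to a bounding box, and a prediction counts as a hit when the center of the predicted cell falls inside that box. This summary does not state the grid size, the annotation format, or the exact accuracy metric used in the study.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned ground-truth region in pixels (hypothetical format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def cell_center(col: int, row: int, width: int, height: int,
                n_cols: int, n_rows: int) -> Tuple[float, float]:
    """Map a 0-indexed grid cell to the pixel coordinates of its center."""
    if not (0 <= col < n_cols and 0 <= row < n_rows):
        raise ValueError(f"cell ({col}, {row}) is outside a {n_cols}x{n_rows} grid")
    cell_w = width / n_cols
    cell_h = height / n_rows
    return ((col + 0.5) * cell_w, (row + 0.5) * cell_h)


def localization_accuracy(predictions: Sequence[Tuple[int, int]],
                          truths: Sequence[Box],
                          width: int, height: int,
                          n_cols: int, n_rows: int) -> float:
    """Fraction of predicted cells whose center lies inside the true box.

    Assumed hit criterion; the study's actual metric may differ.
    """
    if len(predictions) != len(truths) or not predictions:
        raise ValueError("predictions and truths must be non-empty and equal length")
    hits = 0
    for (col, row), box in zip(predictions, truths):
        x, y = cell_center(col, row, width, height, n_cols, n_rows)
        hits += box.contains(x, y)
    return hits / len(predictions)


# Toy data: a 1000x1000 image with a 10x10 grid (made-up values).
preds = [(2, 3), (7, 7)]
boxes = [Box(200, 300, 400, 500), Box(0, 0, 100, 100)]
score = localization_accuracy(preds, boxes, 1000, 1000, 10, 10)
```

In this toy example the first prediction lands inside its box and the second does not, so `score` is 0.5. A point-in-region test is only one possible scoring rule; an overlap-based measure would be an equally reasonable choice.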

Main Results:

  • GPT-5 achieved 49.7% localization accuracy, GPT-4 achieved 39.1%, and MedGemma achieved 17.7%, all below the CNN baseline (59.9%) and radiologist benchmark (80.1%).
  • GPT-5's errors were often anatomically plausible but imprecise; GPT-4 struggled with variable pathologies and produced more implausible predictions.
  • MedGemma showed the lowest performance but improved with few-shot prompting, indicating potential for domain-specific fine-tuning.
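
To make the size of these gaps explicit, the reported accuracies can be turned into percentage-point differences. The short sketch below uses only the figures quoted above; the helper name `gaps_in_points` is a hypothetical choice for illustration.

```python
from typing import Dict

# Localization accuracies (%) as reported in the results above.
model_accuracy: Dict[str, float] = {"GPT-5": 49.7, "GPT-4": 39.1, "MedGemma": 17.7}
references: Dict[str, float] = {"CNN baseline": 59.9, "Radiologist benchmark": 80.1}


def gaps_in_points(models: Dict[str, float],
                   refs: Dict[str, float]) -> Dict[str, Dict[str, float]]:
    """Percentage-point shortfall of each model relative to each reference."""
    return {
        model: {name: round(ref - acc, 1) for name, ref in refs.items()}
        for model, acc in models.items()
    }


gaps = gaps_in_points(model_accuracy, references)
for model, by_ref in gaps.items():
    for name, gap in by_ref.items():
        print(f"{model} trails the {name} by {gap} points")
```

On these figures GPT-5 trails the CNN baseline by 10.2 points and the radiologist benchmark by 30.4 points; for GPT-4 the gaps are 20.8 and 41.0 points, and for MedGemma 42.2 and 62.4 points.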

Conclusions:

  • Current general-purpose MLLMs remain limited in precise pathology localization on chest radiographs, even when their predictions are anatomically plausible.
  • Performance gaps highlight the need for task-specific tools and further development to integrate MLLMs reliably into clinical workflows.
  • Future research should focus on enhancing spatial reasoning and localization accuracy in MLLMs for medical imaging applications.