Capability of multimodal large language models to interpret pediatric radiological images
Summary
This summary is machine-generated. Multimodal large language models (LLMs) show limited accuracy in interpreting pediatric radiology images. These artificial intelligence tools are not yet ready for clinical use in this specialized field.
Area Of Science
- Artificial Intelligence in Medical Imaging
- Pediatric Radiology Research
- Machine Learning in Healthcare
Background
- Limited artificial intelligence (AI) development exists for pediatric radiology.
- New multimodal large language models (LLMs) can process text, image, and video inputs.
- LLMs theoretically could interpret radiological images, including pediatric cases.
Purpose Of The Study
- To evaluate the diagnostic capabilities of multimodal LLMs on pediatric radiological images.
- Assessing the accuracy of advanced AI models in a specialized medical field.
Main Methods
- Thirty clinically significant pediatric radiology cases were used, totaling 90 images.
- Three leading multimodal LLMs (GPT-4, Gemini 1.5 Pro, Claude 3 Opus) were tested.
- AI interpretations were independently verified by a resident and an attending physician.
Main Results
- The AI models achieved correct diagnoses for only 27.8% of images.
- Partial correctness was observed in 13.3% of cases.
- The majority of interpretations (58.9%) were incorrect.
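The three reported percentages can be cross-checked against the 90-image total. The per-image counts below (25 correct, 12 partially correct, 53 incorrect) are inferred from the stated percentages, not taken from the paper, so they are an assumption; a minimal sketch of the tally:

```python
# Hypothetical reconstruction of the reported accuracy breakdown.
# Counts are inferred from the percentages over 90 images (assumption);
# the study's raw counts may differ.
counts = {"correct": 25, "partially correct": 12, "incorrect": 53}
total = sum(counts.values())  # 90 images

for label, n in counts.items():
    pct = 100 * n / total
    print(f"{label}: {n}/{total} = {pct:.1f}%")
```

With these counts, the percentages round to 27.8%, 13.3%, and 58.9%, matching the reported results and summing to 100% of the 90 images.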
Conclusions
- Current multimodal LLMs are not sufficiently accurate for interpreting pediatric radiological images.
- Further AI development is needed before clinical application in pediatric radiology.

