Accuracy of Large Language Model-based Automatic Calculation of Ovarian-Adnexal Reporting and Data System MRI Scores from Pelvic MRI Reports
- Rajesh Bhayana 1, Ankush Jajodia 1, Tanya Chawla 1, Yangqing Deng 2, Genevieve Bouchard-Fortier 3,4, Masoom Haider 2, Satheesh Krishna 2
- 1 University Medical Imaging Toronto, Joint Department of Medical Imaging, University Health Network, Mount Sinai Hospital and Women's College Hospital, Department of Medical Imaging, University of Toronto, Toronto General Hospital, 200 Elizabeth St, Peter Munk Building, 1st Fl, Toronto, ON, Canada M5G 2C4.
- 2 Department of Biostatistics, University Health Network, Toronto, Canada.
- 3 Department of Obstetrics and Gynecology, University of Toronto, Toronto, Canada.
- 4 Division of Gynecologic Oncology, Princess Margaret Cancer Centre, University Health Network and Sinai Health System, Toronto, Canada.
Summary
This summary is machine-generated. A hybrid large language model (LLM) strategy accurately assigned Ovarian-Adnexal Reporting and Data System (O-RADS) MRI scores from pelvic MRI reports, outperforming both an LLM-only strategy and the original radiologist-assigned scores.
Area Of Science
- Radiology
- Artificial Intelligence
- Oncology
Background
- The Ovarian-Adnexal Reporting and Data System (O-RADS) for MRI aids malignancy risk assessment, but radiologists have adopted it inconsistently.
- Automating O-RADS score assignment from reports can enhance adoption and diagnostic accuracy.
Purpose Of The Study
- To assess the accuracy of optimized large language models (LLMs) in automatically calculating O-RADS scores from MRI reports.
Main Methods
- A retrospective study evaluated two LLM strategies: GPT-4 with O-RADS rules (LLM-only) and a hybrid approach combining GPT-4 feature classification with a deterministic formula.
- Accuracy was compared against radiologist-assigned scores using the McNemar test on 284 pelvic MRI reports.
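The hybrid strategy described above separates two jobs: the LLM only classifies report features, and ordinary code turns those features into a score. The following is a minimal sketch of that division of labor, not the study's implementation. The feature names, the JSON shape, and the `parse_features` and `toy_score` helpers are all hypothetical, and the scoring table is a deliberately simplified stand-in rather than the published O-RADS MRI rules, so it must not be used clinically.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumed, simplified mapping from enhancement curve type to a score.
# This is an illustration only, not the full O-RADS MRI lexicon.
CURVE_TO_SCORE = {"low": 3, "intermediate": 4, "high": 5}


@dataclass(frozen=True)
class LesionFeatures:
    """Hypothetical structured output of the LLM feature-classification step."""
    has_enhancing_solid_tissue: bool
    time_intensity_curve: Optional[str]  # "low", "intermediate", "high", or None
    peritoneal_nodularity: bool
    multilocular: bool


def parse_features(raw_json: str) -> LesionFeatures:
    """Validate the LLM's JSON answer before any scoring logic sees it."""
    # Missing or unexpected keys raise TypeError from the dataclass constructor.
    features = LesionFeatures(**json.loads(raw_json))
    curve = features.time_intensity_curve
    if curve is not None and curve not in CURVE_TO_SCORE:
        raise ValueError(f"unexpected curve label: {curve!r}")
    return features


def toy_score(features: LesionFeatures) -> int:
    """Deterministic score assignment using the simplified rules above."""
    if features.peritoneal_nodularity:
        return 5
    if features.has_enhancing_solid_tissue:
        if features.time_intensity_curve is None:
            raise ValueError("curve type is required when solid tissue is present")
        return CURVE_TO_SCORE[features.time_intensity_curve]
    # No solid tissue: the toy rule separates only multilocular from unilocular.
    return 3 if features.multilocular else 2


if __name__ == "__main__":
    # Stand-in for an LLM response; no model is called in this sketch.
    raw = (
        '{"has_enhancing_solid_tissue": true, '
        '"time_intensity_curve": "intermediate", '
        '"peritoneal_nodularity": false, "multilocular": false}'
    )
    print(toy_score(parse_features(raw)))
```

Keeping the scoring step in plain code means the same features always yield the same score, and an unexpected label from the model fails loudly instead of silently changing the result.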
Main Results
- The hybrid LLM strategy achieved 97% accuracy in assigning O-RADS MRI scores, surpassing the LLM-only model (90%) and original radiologist scores (88%).
- The hybrid model demonstrated superior performance even on reports from before O-RADS implementation.
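Because every strategy scored the same set of reports, the accuracy figures are paired, which is why the methods name the McNemar test. The sketch below shows one common form of that test, the exact two-sided binomial version, using only the standard library. The summary does not say which variant the authors used, so treat the choice as an assumption; the function name `mcnemar_exact` and the input lists are hypothetical and the example data are made up.

```python
from math import comb
from typing import Sequence


def mcnemar_exact(a_correct: Sequence[bool], b_correct: Sequence[bool]) -> float:
    """Two-sided exact McNemar p-value for paired correct/incorrect outcomes."""
    if len(a_correct) != len(b_correct):
        raise ValueError("both methods must be scored on the same reports")
    # Only discordant pairs (one method right, the other wrong) carry information.
    a_only = sum(1 for a, b in zip(a_correct, b_correct) if a and not b)
    b_only = sum(1 for a, b in zip(a_correct, b_correct) if b and not a)
    n = a_only + b_only
    if n == 0:
        return 1.0  # the methods never disagree
    k = min(a_only, b_only)
    # Under the null hypothesis each discordant pair is a fair coin flip.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Made-up toy data: method A is correct on all 10 reports, method B on 5.
    method_a = [True] * 10
    method_b = [False] * 5 + [True] * 5
    print(mcnemar_exact(method_a, method_b))
```

The test ignores reports where both methods agree, so two strategies with similar overall accuracy can still differ significantly if their errors fall on different reports.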
Conclusions
- A hybrid LLM application integrating LLM feature extraction with deterministic logic offers a highly accurate method for automated O-RADS MRI score assignment.
- This approach enhances diagnostic consistency and accuracy compared to existing methods.

