The Diagnostic Ability of GPT-3.5 and GPT-4.0 in Surgery: Comparative Analysis
Summary
This summary is machine-generated. GPT-4.0 demonstrates superior diagnostic accuracy for colon cancer compared to GPT-3.5, showing significant potential as an auxiliary tool for surgeons. Further research is needed to address GPT-4.0's limitations.
Area Of Science
- Artificial Intelligence in Medicine
- Clinical Decision Support Systems
- Oncology Diagnostics
Background
- ChatGPT shows promise as a clinical diagnostic aid.
- Evaluating AI diagnostic capabilities is crucial for clinical integration.
- This study compares GPT-3.5 and GPT-4.0 performance in diagnosis.
Purpose Of The Study
- To assess GPT-3.5 and GPT-4.0 diagnostic accuracy for colon cancer.
- To compare the diagnostic performance of GPT-3.5 versus GPT-4.0.
- To analyze misdiagnosis causes and evaluate AI as a surgical auxiliary tool.
Main Methods
- 316 intestinal cancer case reports were collected, of which 286 were valid and analyzed.
- Case data was translated and input into GPT-3.5 and GPT-4.0.
- Senior surgeons evaluated primary and secondary diagnoses for accuracy.
Main Results
- GPT-4.0 significantly outperformed GPT-3.5 in accuracy for both primary (0.972 vs 0.855) and secondary (0.908 vs 0.617) diagnoses.
- GPT-3.5 struggled to interpret patient history, symptoms, laboratory results, and imaging data.
- GPT-4.0 showed limitations in interpreting symptoms and laboratory data.
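The size of the gap between the two models can be worked out directly from the four figures reported above. The following is a minimal sketch, assuming those figures are accuracy scores on a 0 to 1 scale; the names `reported` and `accuracy_gap` are illustrative and do not come from the study.

```python
# Scores reported in the results above, as (GPT-4.0, GPT-3.5) pairs.
# Assumption: each value is an accuracy score on a 0-1 scale.
reported = {
    "primary": (0.972, 0.855),
    "secondary": (0.908, 0.617),
}


def accuracy_gap(gpt4: float, gpt35: float) -> tuple[float, float]:
    """Return (absolute gap, gain relative to GPT-3.5), rounded to 3 places."""
    absolute = gpt4 - gpt35
    relative = absolute / gpt35
    return round(absolute, 3), round(relative, 3)


# Hypothetical helper output: one (absolute, relative) pair per diagnosis type.
gaps = {kind: accuracy_gap(*pair) for kind, pair in reported.items()}

for kind, (absolute, relative) in gaps.items():
    print(f"{kind} diagnosis: +{absolute:.3f} absolute, +{relative:.1%} relative")
```

The absolute gap is 0.117 for primary diagnoses and 0.291 for secondary diagnoses, so the difference between the models is considerably larger for secondary diagnoses. This arithmetic says nothing about statistical significance, which depends on the underlying case-level data.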
Conclusions
- ChatGPT, especially GPT-4.0, has significant diagnostic potential for colon cancer.
- GPT-4.0 accuracy surpasses GPT-3.5, but limitations remain.
- Real-world validation is needed to confirm and improve AI diagnostic capabilities.

