Evaluating Large Language Models in Addressing Patient Questions on Endodontic Pain: A Comparative Analysis of Accessible Chatbots
Summary
This summary is machine-generated. ChatGPT-3.5 provided higher-quality, more reliable answers about endodontic pain than Gemini, but its answers were harder to read. Gemini was more readable but less comprehensive, highlighting the need for professional oversight of AI in patient education.
Area Of Science
- Artificial Intelligence in Healthcare
- Medical Informatics
- Patient Education Technology
Background
- Growing patient reliance on large language models (LLMs) for health information necessitates reliability assessments.
- The use of AI in patient education for conditions like endodontic pain is controversial.
- Continuous evaluation of LLM performance in healthcare is crucial for responsible integration.
Purpose Of The Study
- To evaluate and compare the performance of ChatGPT-3.5 and Gemini in responding to patient inquiries about endodontic pain.
- To assess the quality, reliability, and readability of AI-generated responses on endodontic pain.
- To inform the development of AI-driven tools for patient education in dentistry.
Main Methods
- 62 frequently asked questions on endodontic pain were curated and categorized.
- Responses from ChatGPT-3.5 and Gemini were analyzed using standardized tools: Global Quality Score (GQS), reliability metrics, and readability indices (Flesch-Kincaid, Simple Measure of Gobbledygook).
- Statistical analysis was performed to compare the performance of the two LLMs.
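The two readability indices named above can be sketched in Python to show what they measure. This is an illustrative sketch, not the study's tooling: it assumes the Flesch-Kincaid Grade Level variant (the summary does not say which Flesch-Kincaid measure was used), applies the standard published formulas, and relies on a rough vowel-group syllable counter. The function names are hypothetical.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int, int]:
    """Return (sentences, words, syllables, words with 3 or more syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = [count_syllables(w) for w in words]
    polysyllables = sum(1 for s in syllables if s >= 3)
    return len(sentences), len(words), sum(syllables), polysyllables


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level: longer sentences and words raise the grade."""
    n_sent, n_words, n_syll, _ = text_stats(text)
    return 0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59


def smog_index(text: str) -> float:
    """SMOG grade: driven by polysyllabic words, normalized to 30 sentences."""
    n_sent, _, _, n_poly = text_stats(text)
    return 1.0430 * math.sqrt(n_poly * 30 / n_sent) + 3.1291


if __name__ == "__main__":
    simple = "The tooth hurts. See a dentist."
    dense = "Irreversible pulpitis necessitates endodontic intervention."
    for label, sample in (("simple", simple), ("dense", dense)):
        print(label, round(flesch_kincaid_grade(sample), 2), round(smog_index(sample), 2))
```

Both indices return an approximate U.S. school grade, so lower scores mean easier text. SMOG was designed for samples of about 30 sentences, so scores for short chatbot answers are rough estimates, and a dictionary-based syllable counter would give more accurate results than the heuristic used here.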
Main Results
- ChatGPT-3.5 demonstrated significantly higher overall quality (GQS) and reliability compared to Gemini (P < .001).
- Gemini's responses were significantly more readable (6th-7th grade level) than those of ChatGPT-3.5, which required a higher reading level (P < .001).
- Despite better readability, Gemini's answers lacked the depth and completeness of ChatGPT-3.5's responses.
Conclusions
- ChatGPT-3.5 provided higher-quality, more reliable information on endodontic pain than Gemini, though the complexity of its responses may limit patient accessibility.
- Gemini offers enhanced readability but compromises the comprehensiveness and depth of information.
- Professional oversight is essential for integrating AI tools in healthcare to ensure accurate, accessible, and empathetic patient education.

