Automating the Observer OPTION-5 measure of shared decision making: Assessing validity by comparing large language models to human ratings
Summary
This summary is machine-generated. Large language models (LLMs) show promise in evaluating shared decision-making in clinical consultations. These AI tools can assess clinician-patient dialogue, potentially improving feedback for healthcare professionals.
Area Of Science
- Artificial Intelligence in Healthcare
- Clinical Communication Analysis
- Medical Decision-Making Research
Background
- Observer-based measures of shared decision-making are resource-intensive due to reliance on human raters, limiting routine assessment.
- Generative artificial intelligence (AI) offers potential to enhance the speed and accuracy of evaluations while reducing human burden.
Purpose Of The Study
- To evaluate the performance of large language models (LLMs) from the Gemini, GPT, and LLaMA families.
- To assess LLMs' ability to evaluate shared decision-making in early-stage breast cancer surgery consultations.
Main Methods
- Compared LLM-generated scores with scores from trained human raters using the Observer OPTION-5 measure.
- Analyzed 287 anonymized breast cancer consultation transcripts.
- Tested various prompts and evaluated LLMs' ability to distinguish high versus low scoring encounters.
Main Results
- GPT-4o and Gemini-1.5-Pro-002 scores correlated with human ratings (Pearson r ≈ 0.6, p < 0.01).
- LLM-human correlation reached roughly 75-80% of the agreement between human raters (inter-rater r = 0.77).
- LLMs successfully distinguished high- from low-scoring encounters (t > 10, p < 0.01).
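The agreement figures above can be illustrated with a minimal Python sketch. It assumes that the "75-80%" figure is simply the LLM-human Pearson correlation divided by the human inter-rater correlation; the summary does not state how the authors computed it. The function names and the example scores are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fraction_of_human_agreement(r_llm_human, r_human_human):
    """Assumed interpretation: LLM-human r as a share of human-human r."""
    return r_llm_human / r_human_human


# Made-up scores for five transcripts, for illustration only.
human_scores = [10.0, 25.0, 40.0, 55.0, 70.0]
llm_scores = [15.0, 20.0, 45.0, 50.0, 75.0]
example_r = pearson_r(human_scores, llm_scores)

# Using the rounded values reported in the summary (r of about 0.6 vs. r = 0.77).
ratio = fraction_of_human_agreement(0.6, 0.77)
print(f"share of human inter-rater agreement: {ratio:.2f}")
```

With the rounded values from the summary, the ratio comes out at about 0.78, which falls inside the reported 75-80% range.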
Conclusions
- LLMs can evaluate clinician-patient dialogue using existing measures for shared decision-making.
- Prompt development and fine-tuning are crucial for optimizing LLM performance.
- Future research should focus on generalizability and larger datasets to improve model performance.

