Abstract
Advances in artificial intelligence, particularly in natural language processing, offer promising tools for addressing mental health challenges in online contexts, potentially identifying at-risk individuals and informing timely interventions. This study investigates the potential of Large Language Models (LLMs) for automatically triaging social media posts expressing psychological distress. Using a dataset of 425 Italian-language Reddit posts, we compared the triage performance of three state-of-the-art LLMs (GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro) with assessments by trained clinicians using an adapted version of the Mental Health Triage Scale (MHTS), a validated instrument used in psychiatric screening services. We used a zero-shot prompting approach, with and without role assignment (simulating a clinician's perspective), to evaluate the models' ability to assess intervention urgency. Results revealed that the LLMs consistently overestimated urgency relative to human raters, although correlations with human judgments were moderate to strong, with GPT-4o and Claude 3.5 Sonnet showing the highest agreement. GPT-4o achieved the best classification performance, highlighting its potential for this task. Claude 3.5 Sonnet showed high sensitivity but lower precision, indicating a tendency toward false positives, whereas Gemini 1.5 Pro exhibited more balanced but generally lower performance. These findings suggest that although LLMs show promise for mental health triage on social media, their tendency to overestimate urgency and model-specific variations in performance underscore the need for careful interpretation and human oversight when applying LLMs in mental health contexts.
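To make the prompting setup concrete, the sketch below illustrates how a zero-shot urgency query with an optional clinician role could be issued. It is a minimal sketch, assuming the OpenAI Python SDK, an illustrative 5-point urgency scale, and hypothetical prompt wording; none of these reflect the study's exact protocol or the adapted MHTS levels.

```python
# Minimal sketch of the zero-shot prompting setup described above.
# Assumptions (not the study's protocol): OpenAI Python SDK, an illustrative
# 1-5 urgency scale, and hypothetical prompt/role wording.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Role assignment condition: simulate a clinician's perspective.
CLINICIAN_ROLE = (
    "You are an experienced mental health clinician performing triage "
    "of social media posts expressing psychological distress."
)

def rate_urgency(post: str, with_role: bool = True) -> str:
    """Ask the model for a zero-shot urgency rating of a single post."""
    messages = []
    if with_role:
        messages.append({"role": "system", "content": CLINICIAN_ROLE})
    messages.append({
        "role": "user",
        "content": (
            "Rate the urgency of intervention for the author of the "
            "following post on a scale from 1 (no intervention needed) "
            "to 5 (immediate intervention required). "
            "Reply with the number only.\n\n" + post
        ),
    })
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    return response.choices[0].message.content.strip()
```

In a setup like this, each post would be rated once per model and per condition (with and without the role), and the resulting ratings compared against the clinicians' MHTS-based scores.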