Assessing Large Language Models in Building a Structured Dataset From AskDocs Subreddit Data: Methodological Study.

Quinn Snell¹, Chase Westhoff¹, John Westhoff²

  • ¹Brigham Young University, 3361 TMCB, Provo, UT 84602, United States, 1 8014225098.

Journal of Medical Internet Research | October 22, 2025
Summary
This summary is machine-generated.

Large language models (LLMs) can extract demographic and health information from social media posts with accuracy comparable to that of human annotators. The study supports their use for analyzing digital health communications and online user behavior.

Keywords:
Reddit, artificial intelligence, data extraction, large language models, unstructured text analysis

Area of Science:

  • Digital Health
  • Natural Language Processing
  • Computational Social Science

Background:

  • The subreddit r/AskDocs is a key platform for digital health consultations.
  • Analyzing unstructured user-generated content from forums like r/AskDocs is challenging.
  • Large language models (LLMs) offer advanced tools for extracting health information from social media.

Purpose of the Study:

  • To evaluate the efficacy of LLMs in transforming unstructured r/AskDocs data into a structured format.
  • To compare LLM data extraction performance against human annotators.
  • To assess the alignment of LLM-based data extraction with human cognitive processes.

Main Methods:

  • Data extraction from 2800 r/AskDocs posts using human annotators (medical students) and LLMs.
  • Human annotation included demographics, inquiry type, proxy relationship, chronic conditions, and consultation status.
  • LLM data extraction used engineered prompts (JSON output, few-shot examples) with models including Llama 3, Gemma, and GPT.
  • Cohen κ was used to assess inter-annotator reliability.
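
The prompt-based extraction described above can be illustrated with a minimal sketch. It only shows how a few-shot prompt requesting JSON output might be assembled and how a model reply might be parsed defensively; it is not the study's actual prompt or pipeline. The field names in `FIELDS`, the example post, and the helper names `build_prompt` and `parse_response` are all assumptions made for illustration, and no model is called.

```python
import json

# Hypothetical field names, loosely based on the annotation categories
# listed above; the study's actual schema is not given here.
FIELDS = ["age", "sex", "inquiry_type", "proxy_relationship",
          "chronic_condition", "prior_consultation"]


def build_prompt(post, examples):
    """Assemble a few-shot prompt that asks for a single JSON object."""
    parts = [
        "Extract the following fields from the post and answer with one "
        "JSON object using exactly these keys: " + ", ".join(FIELDS) + ". "
        "Use null when the post does not state a value."
    ]
    # Each few-shot example is a (post text, label dict) pair.
    for example_post, example_labels in examples:
        parts.append("Post: " + example_post)
        parts.append("JSON: " + json.dumps(example_labels, sort_keys=True))
    parts.append("Post: " + post)
    parts.append("JSON:")
    return "\n\n".join(parts)


def parse_response(text):
    """Parse the span from the first '{' to the last '}' as JSON.

    Returns a dict restricted to FIELDS, or None if parsing fails.
    """
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        data = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Keep only the expected keys; keys the model omitted become None.
    return {field: data.get(field) for field in FIELDS}


# Invented example post and labels, for illustration only.
examples = [(
    "My 6 year old son has had a cough for two weeks.",
    {"age": 6, "sex": "M", "inquiry_type": "symptom",
     "proxy_relationship": "parent", "chronic_condition": None,
     "prior_consultation": False},
)]
prompt = build_prompt("I am a 34F with asthma, saw my GP last week.", examples)
record = parse_response('Sure, here it is: {"age": 34, "sex": "F"}')
```

Restricting the parsed result to a fixed key set means a reply with extra or missing keys still yields a row with the same columns, which makes later comparison against human labels simpler.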

Main Results:

  • Llama 3 70B (7 few-shot examples) and GPT-4 (2 few-shot examples) achieved the highest accuracy (87.4%) against the human-annotated gold standard.
  • Llama 3 70B demonstrated superior performance in coding health-related content.
  • GPT-4 excelled in extracting demographic information from unstructured posts.
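
As a rough illustration of the two agreement measures named on this page, the sketch below computes simple accuracy against a gold standard and Cohen κ between two raters. The function names and the toy labels are assumptions for illustration; they are not the study's data or code.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of positions where the prediction equals the gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("label lists must be non-empty and equal in length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    if n != len(rater_b) or n == 0:
        raise ValueError("label lists must be non-empty and equal in length")
    # Observed agreement: share of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Toy labels, invented for illustration only.
annotator_1 = ["yes", "yes", "no", "no"]
annotator_2 = ["yes", "no", "no", "no"]
acc = accuracy(annotator_1, annotator_2)       # 0.75
kappa = cohen_kappa(annotator_1, annotator_2)  # 0.5
```

In the toy example the raters agree on 3 of 4 items, but chance agreement is 0.5, so κ is lower than raw accuracy. This is why κ is commonly reported alongside percent agreement.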

Conclusions:

  • LLMs demonstrate comparable performance to human annotators in extracting demographic and health information from social media health forums.
  • This study validates LLMs as reliable tools for analyzing digital health communications.
  • LLMs show potential for advancing methodologies in digital research by understanding online behaviors and interactions.