

Evaluating Generative AI Large Language Models for Urticaria Management: A Comparative Analysis of DeepSeek-R1 and ChatGPT-4o

Mengyao Yang1, Jingchen Liang1, Luyue Zhang1

  • 1Department of Dermatology, the Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China.

Clinical and Translational Allergy
November 27, 2025

Related Experiment Videos

  • Augmenting Large Language Models via Vector Embeddings to Improve Domain-Specific Responsiveness (published December 6, 2024)
  • Evidence-based Knowledge Synthesis and Hypothesis Validation: Navigating Biomedical Knowledge Bases via Explainable AI and Agentic Systems (published June 13, 2025)
  • Objectification of Tongue Diagnosis in Traditional Medicine, Data Analysis, and Study Application (published April 14, 2023)


Summary
This summary is machine-generated.

DeepSeek-R1 demonstrates superior performance over ChatGPT-4o in answering urticaria-related questions, offering greater accuracy and clinical feasibility for both medical professionals and patients seeking reliable health information.

Area of Science:

  • Artificial Intelligence in Medicine
  • Dermatology
  • Natural Language Processing

Background:

  • Urticaria is a common global health condition requiring accurate, up-to-date information for patients and dermatologists.
  • Existing search engines and AI models often provide suboptimal or unverified medical information.
  • The reliability of AI-generated medical content, particularly for conditions like urticaria, needs thorough investigation.

Purpose of the Study:

  • To evaluate and compare the performance of two leading AI models, ChatGPT-4o and DeepSeek-R1, in responding to urticaria-related queries.
  • To assess AI-generated responses based on key criteria including simplicity, accuracy, professionalism, clinical feasibility, comprehensibility, and completeness.

Main Methods:

  • An e-Delphi procedure was used to create and refine a comprehensive set of urticaria questions and an evaluation framework.

  • ChatGPT-4o and DeepSeek-R1 were prompted with these questions, and their responses were collected.
  • A single-blind comparative assessment was performed by 67 participants (29 dermatologists, 38 non-dermatologists) evaluating response quality.
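The single-blind setup described above can be sketched in code. This is an illustrative reconstruction, not the authors' actual tooling: each rater sees the two models' answers to a question in random order under neutral labels, so the model identity is hidden. The criterion names come from the evaluation framework listed earlier; the `blind_pair` helper and the sample question are hypothetical.

```python
import random

# The six criteria from the e-Delphi evaluation framework described above.
CRITERIA = ["simplicity", "accuracy", "professionalism",
            "clinical feasibility", "comprehensibility", "completeness"]

def blind_pair(question, answers, rng):
    """Present two model answers in random order with neutral labels.

    `answers` maps model name -> response text. The returned `key` maps
    the neutral labels back to models and is withheld from raters, which
    is what makes the comparison single-blind.
    """
    models = list(answers)
    rng.shuffle(models)
    labeled = {f"Response {tag}": answers[m] for tag, m in zip("AB", models)}
    key = {f"Response {tag}": m for tag, m in zip("AB", models)}
    return labeled, key

rng = random.Random(0)
labeled, key = blind_pair(
    "What is the first-line therapy for chronic urticaria?",  # sample question
    {"DeepSeek-R1": "answer text 1", "ChatGPT-4o": "answer text 2"},
    rng,
)
# Raters score each labeled response on every criterion in CRITERIA,
# never seeing `key`.
```

Shuffling per question (rather than fixing one global A/B assignment) prevents raters from learning a stylistic signature for "Response A" over the course of the study.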
Main Results:

  • DeepSeek-R1 significantly outperformed ChatGPT-4o across multiple metrics, including simplicity, accuracy, completeness, professionalism, and clinical feasibility, as rated by dermatologists.
  • Non-dermatologists found DeepSeek-R1's responses more concise and comprehensible.
  • DeepSeek-R1 provided error-free answers aligned with guidelines, while ChatGPT-4o contained errors in three clinical questions.

Conclusions:

  • Rigorous evaluation of AI-generated medical content is crucial for ensuring reliability and safe application in healthcare.
  • DeepSeek-R1 exhibits higher potential than ChatGPT-4o for both clinical and patient use in addressing urticaria-related inquiries.
  • The findings underscore the importance of selecting and validating AI tools for medical information dissemination.
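A between-model rating comparison of the kind reported in the results can be sketched with a simple two-sided permutation test on mean ratings. This is illustrative only: the paper's actual statistical procedure is not specified here, and the sample ratings below are invented for demonstration.

```python
import random
from statistics import mean

def permutation_p(a, b, n_iter=10_000, seed=0):
    """Two-sided permutation test on the difference in mean ratings.

    Repeatedly reshuffles the pooled ratings between two groups of the
    original sizes and counts how often the shuffled mean difference is
    at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return hits / n_iter

# Hypothetical 5-point ratings for one criterion from a few blinded raters.
deepseek_ratings = [5, 4, 5, 5, 4, 5]
chatgpt_ratings = [3, 4, 3, 4, 3, 3]
p = permutation_p(deepseek_ratings, chatgpt_ratings)
```

A permutation test makes no normality assumption, which suits small samples of ordinal Likert-style ratings; with real data one would run one test per criterion and per rater group (dermatologists vs. non-dermatologists).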
Keywords: AI, ChatGPT-4o, DeepSeek-R1, large language model, urticaria