


Large Language Models' Responses to Spinal Cord Injury: A Comparative Study of Performance.

Jinze Li (1,2), Chao Chang (1,2), Yanqiu Li (3)

  • (1) Department of Neurosurgery, Xuanwu Hospital, Capital Medical University, No. 45 Changchun Street, Xicheng District, Beijing, 100053, China.

Journal of Medical Systems | March 25, 2025


Summary
This summary is machine-generated.

This study compared how four large language models (LLMs) answer questions about spinal cord injury (SCI). ChatGPT-4o gave the most accurate responses, while Gemini-1.5 Pro scored highest on information quality.

Keywords:
Accuracy assessment; Large language model; Quality assessment; Readability assessment; Spinal cord injury



Area of Science:

  • Medical Informatics
  • Artificial Intelligence in Healthcare
  • Neurology

Background:

  • Spinal cord injury (SCI) presents complex challenges in pathogenesis, treatment, and rehabilitation.
  • Patients increasingly seek online resources for medical information regarding SCI.
  • Large language models (LLMs) show growing potential in patient education and clinical decision support.

Purpose of the Study:

  • To systematically compare the performance of four leading LLMs (ChatGPT-4o, Claude-3.5 Sonnet, Gemini-1.5 Pro, Llama-3.1) in answering spinal cord injury-related questions.
  • To evaluate the quality, readability, accuracy, and comprehensiveness of LLM-generated responses for SCI.
  • To assess the self-correction capabilities of LLMs when prompted for revisions.

Main Methods:

  • Four LLMs were queried with 37 questions covering SCI pathogenesis, risk factors, clinical features, diagnostics, treatments, and prognosis.
  • Response quality was assessed using the Ensuring Quality Information for Patients (EQIP) tool.
  • Readability was measured using Flesch-Kincaid metrics.
  • Accuracy was evaluated by three senior spine surgeons using consensus scoring.
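
The readability step can be illustrated with the two standard Flesch-Kincaid formulas (Reading Ease and Grade Level). The sketch below is not the authors' tooling: the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic assumed here for illustration, so its scores will differ somewhat from those of dedicated readability software.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count vowel groups and drop one for a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return Flesch Reading Ease and Flesch-Kincaid Grade Level for text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    return {
        # Higher Reading Ease means easier text.
        "reading_ease": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        # Grade Level approximates the US school grade needed to follow the text.
        "grade_level": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }


if __name__ == "__main__":
    sample = "Spinal cord injury can change how the body moves and feels."
    print(readability(sample))
```

Longer sentences and more syllables per word both raise the grade level, which is why clinically dense answers tend to score at a college reading level.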

Main Results:

  • Gemini-1.5 Pro achieved the highest EQIP scores, indicating superior information quality.
  • ChatGPT-4o demonstrated the highest accuracy, with 83.8% of responses rated as "Good".
  • All LLMs generally exhibited low readability (college-level comprehension) but effectively simplified complex information.
  • LLMs showed significant self-correction capabilities, with accuracy improving substantially after revision prompts.

Conclusions:

  • While Gemini-1.5 Pro excels in information quality for SCI, ChatGPT-4o provides the most accurate and comprehensive responses.
  • LLMs demonstrate potential for patient education and clinical decision support in SCI, but accuracy remains a critical factor.
  • Further research is needed to optimize LLM performance and ensure reliable medical information delivery for complex conditions like SCI.
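
As a worked check on the accuracy figure, the short sketch below shows how a percentage such as 83.8% relates to a count of responses. It assumes, which the summary does not state explicitly, that each model gave one rated response per question, so the denominator is the 37 questions; the helper name `percent_rated` is hypothetical.

```python
def percent_rated(n_rated: int, n_total: int) -> float:
    """Share of responses given a particular rating, as a percentage (1 decimal)."""
    if n_total <= 0 or not 0 <= n_rated <= n_total:
        raise ValueError("need 0 <= n_rated <= n_total and n_total > 0")
    return round(100 * n_rated / n_total, 1)


N_QUESTIONS = 37  # number of questions stated in the methods

# Assumption: one rated response per question, so the denominator is 37.
# Find which counts of "Good" ratings would round to the reported 83.8%.
matching_counts = [
    k for k in range(N_QUESTIONS + 1) if percent_rated(k, N_QUESTIONS) == 83.8
]

if __name__ == "__main__":
    print(matching_counts)
```

Under that assumption the reported figure is consistent with 31 of 37 responses rated "Good". With only 37 items, each response shifts the percentage by about 2.7 points, so small differences between models should be read cautiously.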