Quantifying Hallucinations in Language Models on Medical Textbooks.

Brandon C Colelough¹,², Davis Bartels¹, Dina Demner-Fushman¹

¹National Institutes of Health, National Library of Medicine, Bethesda, MD, US.

April 29, 2026

Summary

This summary is machine-generated.

Large language models (LLMs) frequently hallucinate in medical question answering (QA), producing factually incorrect claims. Lower hallucination rates correlated with higher clinician preference, indicating a need for improved factual accuracy in LLMs.

Area of Science:

Natural Language Processing
Artificial Intelligence in Medicine
Medical Question Answering

Background:

Hallucinations, or factually incorrect claims by large language models (LLMs), pose a significant challenge in natural language processing.
Current medical question answering (QA) benchmarks seldom assess LLM hallucination against a fixed evidence source.

Purpose of the Study:

To quantify hallucination prevalence in textbook-grounded medical QA.
To compare hallucination rates and clinician preferences across different LLMs.

Main Methods:

Experiment 1: Assessed hallucination frequency of LLaMA-70B-Instruct on novel medical QA prompts with provided passages.
Experiment 2: Evaluated hallucination rates and clinician preference for responses from multiple LLMs.
Clinician agreement was measured using quadratic weighted kappa and Kendall's tau-b.
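
The two agreement statistics named above can be sketched in plain Python. This is a minimal illustration, not the study's implementation, which the summary does not describe. The function names, the assumption that ratings are integers on a 0..k-1 ordinal scale, and the example ratings are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def quadratic_weighted_kappa(a, b, k):
    """Cohen's kappa with quadratic weights for integer ratings in 0..k-1.

    Assumes both raters scored the same items and that the ratings are not
    all in one category (otherwise the denominator is zero).
    """
    n = len(a)
    # Observed confusion matrix between the two raters.
    observed = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        observed[x][y] += 1
    row = [sum(r) for r in observed]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            # Penalty grows with the squared distance between categories.
            w = (i - j) ** 2 / (k - 1) ** 2
            num += w * observed[i][j]
            # Expected count under independent raters with these marginals.
            den += w * row[i] * col[j] / n
    return 1.0 - num / den


def kendall_tau_b(a, b):
    """Kendall's tau-b, which adjusts the denominator for tied pairs."""
    conc = disc = tie_a = tie_b = 0
    for (x1, y1), (x2, y2) in combinations(zip(a, b), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both raters: excluded from every term
        elif dx == 0:
            tie_a += 1
        elif dy == 0:
            tie_b += 1
        elif dx * dy > 0:
            conc += 1
        else:
            disc += 1
    return (conc - disc) / sqrt((conc + disc + tie_a) * (conc + disc + tie_b))


# Hypothetical 5-point usefulness ratings from two clinicians (not study data).
rater_1 = [0, 1, 2, 3, 4, 2, 3, 1]
rater_2 = [0, 1, 2, 4, 4, 2, 2, 1]
print(quadratic_weighted_kappa(rater_1, rater_2, k=5))
print(kendall_tau_b(rater_1, rater_2))
```

Quadratic weighting penalizes a two-point disagreement four times as heavily as a one-point disagreement, which suits ordinal rating scales; tau-b complements it by measuring how consistently two raters rank the same responses when ties are common.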

Main Results:

LLaMA-70B-Instruct exhibited a 19.7% hallucination rate in Experiment 1, despite 98.8% of responses being deemed highly plausible.
Experiment 2 showed a negative correlation between hallucination rates and clinician usefulness scores (ρ = -0.71, p = 0.058), just short of the conventional 0.05 significance threshold.
High inter-rater reliability was observed among clinicians.
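
The ρ reported above is a rank correlation between per-model hallucination rates and usefulness scores. A minimal sketch of Spearman's ρ follows, computed as the Pearson correlation of average ranks. The function names and the four data points are hypothetical and are not taken from the study, and the sketch does not compute the p-value.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    var_x = sum((p - mx) ** 2 for p in x)
    var_y = sum((q - my) ** 2 for q in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-model values (not study data).
hallucination_rate = [0.05, 0.10, 0.20, 0.30]
usefulness_score = [4.5, 4.0, 4.2, 2.0]
print(spearman_rho(hallucination_rate, usefulness_score))
```

A negative value means that models ranked higher on hallucination rate tend to be ranked lower on usefulness. With only a handful of models, even a strong ρ can miss conventional significance, which is consistent with the p-value reported above.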

Conclusions:

LLMs demonstrate a notable tendency to hallucinate in medical QA tasks, even when responses appear plausible.
Reducing hallucination rates in LLMs is crucial for improving their utility and trustworthiness in clinical settings.
Further research is needed to develop effective mitigation strategies for LLM hallucinations.