
Evaluating anti-LGBTQIA+ medical bias in large language models.

Crystal T Chang (1), Neha Srivathsa (2), Charbel Bou-Khalil (3)

  • (1) Department of Dermatology, Stanford University, Stanford, California, United States of America.

PLOS Digital Health | September 8, 2025

Summary
This summary is machine-generated.

This study found that all four large language models (LLMs) tested frequently produced inappropriate responses, including anti-LGBTQIA+ bias and misinformation. Further development is needed to improve accuracy and reduce bias in outputs for LGBTQIA+ patients.

Area of Science:

  • Artificial Intelligence in Healthcare
  • Medical Informatics
  • Health Equity

Background:

  • Large Language Models (LLMs) are increasingly used in clinical settings for patient communication and decision support.
  • Existing research highlights race-based and gender biases in LLMs, yet anti-LGBTQIA+ bias remains understudied.
  • Healthcare disparities disproportionately affect LGBTQIA+ individuals, underscoring the need to evaluate LLM bias in this population.

Purpose of the Study:

  • To evaluate the potential of four leading Large Language Models (LLMs) to propagate anti-LGBTQIA+ medical bias and misinformation.
  • To assess the appropriateness and clinical utility of LLM responses to prompts involving LGBTQIA+ identities.
  • To establish a benchmark dataset for evaluating future LLM performance in sensitive clinical contexts.

Main Methods:

  • Four LLMs (Gemini 1.5 Flash, Claude 3 Haiku, GPT-4o, Stanford Medicine Secure GPT) were prompted with 38 pairs of questions and synthetic clinical notes.
  • Prompts were designed to explore clinical situations with and without explicit LGBTQIA+ identity terms, across relevant and irrelevant clinical contexts.
  • Medically trained reviewers and LGBTQIA+ health experts evaluated responses for safety, privacy, accuracy, bias, and clinical utility.
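
The paired-prompt design described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `PromptPair`, its fields, and the example scenario text are hypothetical and are not taken from the study's dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptPair:
    """One matched pair: the same clinical scenario with and without an identity term.

    All names here are hypothetical; the study's actual prompt format is not
    described in this summary.
    """
    template: str            # scenario text containing the placeholder "{patient}"
    identity_term: str       # descriptor inserted in the identity-explicit variant
    identity_relevant: bool  # whether identity is clinically relevant to the scenario

    def with_identity(self) -> str:
        # Variant that names the identity explicitly.
        return self.template.format(patient=f"{self.identity_term} patient")

    def without_identity(self) -> str:
        # Control variant with the identity term omitted.
        return self.template.format(patient="patient")


# Hypothetical example scenario (not from the study's 38 prompt pairs).
pair = PromptPair(
    template="A 34-year-old {patient} asks which routine vaccinations are due.",
    identity_term="gay",
    identity_relevant=False,
)
print(pair.with_identity())
print(pair.without_identity())
```

Keeping both variants in one object makes it straightforward to send each to a model and compare the two responses side by side.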

Main Results:

  • All four LLMs generated inappropriate responses for prompts both with and without LGBTQIA+ identity terms.
  • The proportion of inappropriate responses ranged from 43% to 62% for LGBTQIA+-specific prompts and from 47% to 65% for general prompts.
  • Hallucination/accuracy was the most common reason for classifying a response as inappropriate, followed by bias and safety concerns; LGBTQIA+ prompts elicited more severe bias.
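
Proportions like those reported above could be tallied from reviewer labels roughly as follows. This is a minimal sketch, assuming each reviewed response is recorded as a (model, prompt type, inappropriate?) tuple; the function name, the label layout, and the toy data are hypothetical and do not reproduce the study's results.

```python
from collections import defaultdict


def inappropriate_rate(labels):
    """Return {(model, prompt_type): fraction of responses judged inappropriate}.

    `labels` is an iterable of (model, prompt_type, is_inappropriate) tuples.
    This layout is an assumption made for illustration only.
    """
    counts = defaultdict(lambda: [0, 0])  # [inappropriate count, total count]
    for model, prompt_type, is_inappropriate in labels:
        counts[(model, prompt_type)][0] += int(is_inappropriate)
        counts[(model, prompt_type)][1] += 1
    return {key: bad / total for key, (bad, total) in counts.items()}


# Toy labels, invented purely to exercise the function.
toy_labels = [
    ("model_a", "identity", True),
    ("model_a", "identity", False),
    ("model_a", "general", True),
    ("model_a", "general", True),
    ("model_a", "general", False),
    ("model_a", "general", False),
]
rates = inappropriate_rate(toy_labels)
print(rates)
```

Grouping by both model and prompt type mirrors how the results are reported: one rate per model for identity-specific prompts and one for general prompts.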

Conclusions:

  • LLMs demonstrate significant potential to propagate anti-LGBTQIA+ medical bias and misinformation.
  • Clinical utility scores were lower for inappropriate responses than for appropriate ones.
  • Future research must focus on improving LLM accuracy, reducing bias, and tailoring outputs for LGBTQIA+ patient care.