


Gene Set Summarization Using Large Language Models.

Marcin P Joachimiak1, J Harry Caufield1, Nomi L Harris1

  • 1Biosystems Data Science Department, Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA.

arXiv | June 9, 2023
PubMed
Summary
This summary is machine-generated.

Large Language Models (LLMs) can summarize gene functions but cannot replace statistical enrichment analysis of gene lists. Newer LLMs show promise for future biological data interpretation.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Molecular biologists analyze gene lists from high-throughput experiments using statistical enrichment analysis.
  • Gene Ontology (GO) is a key knowledge base for annotating gene functions.
  • Interpreting gene lists can be approached as a textual summarization task.
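As a rough illustration of the statistical enrichment analysis mentioned above, the sketch below computes a one-sided hypergeometric p-value for a single term. This is a common textbook formulation, not necessarily the exact procedure used in the study; the function names and the toy gene identifiers are hypothetical, and multiple-testing correction is omitted.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the study list, k: study genes annotated with the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def term_enrichment(study: set[str], annotated: set[str], background: set[str]) -> float:
    """Over-representation p-value for one term, restricted to the background."""
    study = study & background
    annotated = annotated & background
    k = len(study & annotated)
    return hypergeom_enrichment_p(k, len(study), len(annotated), len(background))


# Toy data (hypothetical gene IDs): 10 background genes, 5 annotated with the term.
background = {f"g{i}" for i in range(10)}
annotated = {f"g{i}" for i in range(5)}
study = {f"g{i}" for i in range(5)}
p_value = term_enrichment(study, annotated, background)
```

In practice a test like this is run once per candidate term and the resulting p-values are corrected for multiple testing, which is the kind of score the study reports LLMs could not reliably provide.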

Purpose of the Study:

  • To evaluate Large Language Models (LLMs) for gene set function summarization as a complement to standard enrichment analysis.
  • To assess LLM performance using curated annotations, narrative summaries, or direct model retrieval.
  • To compare LLM-based summarization with traditional statistical methods for gene list interpretation.

Main Methods:

  • TALISMAN (Terminological ArtificiaL Intelligence SuMmarization of Annotation and Narratives), a method that uses generative AI to summarize gene set function, was developed.
  • LLMs were tested with different data sources: ontological annotations, narrative summaries, and direct retrieval.
  • Performance was evaluated based on the plausibility, biological validity, precision, and recall of generated GO term lists.
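Precision and recall of a generated term list can be sketched as a set comparison against a reference list, for example the significant terms from standard enrichment. This is a minimal sketch that assumes exact term matching; the study's actual scoring may differ, and `precision_recall` and the term names are hypothetical placeholders.

```python
def precision_recall(predicted: set[str], reference: set[str]) -> tuple[float, float]:
    """Compare predicted terms with reference terms by exact match.

    precision = shared terms / predicted terms
    recall    = shared terms / reference terms
    Either value is 0.0 when its denominator is empty.
    """
    shared = len(predicted & reference)
    precision = shared / len(predicted) if predicted else 0.0
    recall = shared / len(reference) if reference else 0.0
    return precision, recall


# Placeholder term labels, not real GO identifiers.
llm_terms = {"TERM_A", "TERM_B", "TERM_C", "TERM_D"}
enrichment_terms = {"TERM_A", "TERM_B", "TERM_E"}
precision, recall = precision_recall(llm_terms, enrichment_terms)
```

Exact matching gives no credit for a predicted term that is a parent or child of a reference term in the ontology, so a hierarchy-aware comparison would score such near misses differently.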

Main Results:

  • LLM-based methods generated plausible and biologically valid GO term lists.
  • LLMs could not provide reliable statistical scores (p-values) and often returned non-significant terms.
  • LLMs rarely recapitulated the most precise terms identified by standard enrichment analysis.
  • Newer LLMs showed statistically significant improvements over older models.
  • Results were highly non-deterministic: minor variations in the prompt produced radically different term lists.
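One simple way to quantify how much term lists differ across prompt variants is the mean pairwise Jaccard similarity. The sketch below is an assumption for illustration, not a metric taken from the study; the function names and term labels are hypothetical.

```python
from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection over size of the union; 1.0 if both sets are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(runs: list[set[str]]) -> float:
    """Average Jaccard similarity over all unordered pairs of runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Placeholder term lists from three hypothetical prompt variants.
runs = [{"TERM_A", "TERM_B"}, {"TERM_B", "TERM_C"}, {"TERM_A", "TERM_B"}]
stability = mean_pairwise_jaccard(runs)
```

A value near 1 means the variants return nearly the same terms, while a value near 0 means the lists barely overlap.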

Conclusions:

  • LLM-based methods are currently unsuitable as a replacement for standard gene enrichment analysis.
  • LLMs may offer summarization benefits for integrating implicit knowledge and processing large, complex gene sets.
  • Future advancements in LLM technology may enhance their utility in biological data interpretation.