Gene Set Summarization Using Large Language Models.

Marcin P Joachimiak¹, J Harry Caufield¹, Nomi L Harris¹

¹Biosystems Data Science Department, Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, USA.

June 9, 2023

Summary

This summary is machine-generated.

Large language models (LLMs) can summarize gene set functions but cannot yet replace statistical enrichment analysis of gene lists. Newer LLMs show promise for future biological data interpretation.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Molecular biologists analyze gene lists from high-throughput experiments using statistical enrichment analysis.
Gene Ontology (GO) is a key knowledge base for annotating gene functions.
Interpreting gene lists can be approached as a textual summarization task.
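
For readers unfamiliar with statistical enrichment analysis, the following is a minimal sketch of one common form of it, a one-sided hypergeometric over-representation test. It is an illustration under stated assumptions, not the procedure used in the study: the function names `enrichment_p_value` and `enrich`, the gene identifiers, and the term labels are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the study list, k: study genes annotated to the term.
    """
    total = comb(N, n)
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


def enrich(gene_list, term_to_genes, background):
    """Return (term, overlap, p_value) tuples sorted by ascending p-value."""
    background = set(background)
    study = set(gene_list) & background
    results = []
    for term, genes in term_to_genes.items():
        annotated = set(genes) & background
        overlap = len(study & annotated)
        p = enrichment_p_value(overlap, len(study), len(annotated), len(background))
        results.append((term, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Toy data: gene and term names are placeholders, not real annotations.
background = [f"g{i}" for i in range(10)]
term_to_genes = {
    "termA": {"g0", "g1", "g2"},
    "termB": {"g0", "g5", "g6", "g7", "g8"},
}
results = enrich(["g0", "g1", "g2"], term_to_genes, background)
for term, overlap, p in results:
    print(f"{term}\toverlap={overlap}\tp={p:.4f}")
```

In practice the p-values would be corrected for the number of terms tested, and the background set would be chosen to match the experiment.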

Purpose of the Study:

To evaluate Large Language Models (LLMs) for gene set function summarization as a complement to standard enrichment analysis.
To assess LLM performance using curated annotations, narrative summaries, or direct model retrieval.
To compare LLM-based summarization with traditional statistical methods for gene list interpretation.

Main Methods:

TALISMAN (Terminological ArtificiaL Intelligence SuMmarization of Annotation and Narratives), a method that uses generative AI to summarize gene set function, was developed.
LLMs were tested with different data sources: ontological annotations, narrative summaries, and direct retrieval.
Performance was evaluated based on the plausibility, biological validity, precision, and recall of generated GO term lists.
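
As a rough illustration of the precision and recall criteria, the sketch below scores a generated term list against a reference term set. It assumes the reference is the set of terms returned by standard enrichment analysis and that terms are compared by exact identifier match; the study's actual evaluation procedure is not described on this page, and the function name and term labels are hypothetical.

```python
def precision_recall(predicted_terms, reference_terms):
    """Compare two term lists as sets using exact identifier matching."""
    predicted = set(predicted_terms)
    reference = set(reference_terms)
    true_positives = len(predicted & reference)
    # Guard against empty inputs to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Placeholder term labels, not real GO identifiers.
llm_terms = ["termA", "termB", "termC", "termD"]
enrichment_terms = ["termA", "termB", "termE"]
precision, recall = precision_recall(llm_terms, enrichment_terms)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact matching is a strict choice: a generated term that is a parent or child of a reference term in the ontology counts as a miss here.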

Main Results:

LLM-based methods generated plausible and biologically valid GO term lists.
LLMs could not provide reliable statistical scores (p-values) and often returned non-significant terms.
LLMs rarely recapitulated the most precise terms identified by standard enrichment analysis.
Newer LLMs showed statistically significant improvements over older models.
Results were non-deterministic: variations in the prompt produced radically different term lists.
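
One simple way to quantify how much term lists differ between runs is the mean pairwise Jaccard similarity. The sketch below is an assumption-laden illustration, not a metric reported in the study; the function names and term labels are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two term lists, treated as sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_jaccard(runs):
    """Average Jaccard similarity over all pairs of runs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 1.0  # a single run is trivially consistent with itself
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Placeholder term lists from three hypothetical prompt variants.
runs = [
    ["term1", "term2", "term3"],
    ["term2", "term3", "term4"],
    ["term5"],
]
stability = mean_pairwise_jaccard(runs)
print(f"mean pairwise Jaccard = {stability:.3f}")
```

Values near 1 indicate that runs return nearly the same terms; values near 0 indicate little overlap.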

Conclusions:

LLM-based methods are currently unsuitable as a replacement for standard gene enrichment analysis.
LLMs may offer summarization benefits for integrating implicit knowledge and processing large, complex gene sets.
Future advancements in LLM technology may enhance their utility in biological data interpretation.