


Benchmarking large language models for genomic knowledge with GeneTuring.

Xinyi Shang¹, Xu Liao¹, Zhicheng Ji²

  • ¹Department of Biostatistics, Mailman School of Public Health, Columbia University, 722 West 168th Street, New York, NY 10032, United States.

Briefings in Bioinformatics | September 23, 2025
Summary
This summary is machine-generated.

We created GeneTuring, a genomics benchmark, to evaluate large language models (LLMs). SeqSnap, a custom GPT-4o tool, performed best, showing LLMs

Keywords:
benchmark, genomics, knowledge base, large language model


Area of Science:

  • Genomics
  • Bioinformatics
  • Artificial Intelligence

Background:

  • Large language models (LLMs) show potential in biomedical research.
  • The effectiveness of LLMs for genomic inquiry is not well established.
  • A need exists for standardized evaluation of LLMs in genomics.

Purpose of the Study:

  • To develop a comprehensive benchmark for evaluating LLMs in genomics.
  • To assess the performance of various LLM configurations on genomics tasks.
  • To identify optimal strategies for integrating LLMs into genomic research.

Main Methods:

  • Creation of GeneTuring, a benchmark with 16 genomics tasks and 1,600 curated questions.
  • Manual evaluation of 48,000 answers from 10 LLM configurations.
  • Development of SeqSnap, a custom GPT-4o configuration using NCBI APIs.

Main Results:

  • SeqSnap achieved the best overall performance among evaluated LLMs.
  • GPT-4o with web access and GeneGPT showed complementary strengths.
  • LLMs demonstrate both promise and limitations in current genomic applications.

Conclusions:

  • GeneTuring provides a valuable resource for benchmarking and advancing LLMs in genomics.
  • Integrating LLMs with domain-specific tools like SeqSnap enhances genomic intelligence.
  • Further research is needed to overcome current LLM limitations in complex genomic tasks.
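
The summary states only that SeqSnap uses NCBI APIs; it does not say which endpoints are called or how. To give a sense of what a request to one public NCBI API looks like, the sketch below builds a search URL for the E-utilities `esearch` endpoint against the Gene database and reads the ID list from a JSON response. The helper names and the sample payload are hypothetical, the payload is hand-written to mirror the response shape rather than fetched, and none of this should be read as SeqSnap's implementation.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(symbol: str, organism: str, retmax: int = 5) -> str:
    """Build an esearch URL that looks up a gene symbol in the NCBI Gene database."""
    term = f"{symbol}[Gene Name] AND {organism}[Organism]"
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def extract_gene_ids(payload: dict) -> list[str]:
    """Pull the list of matching IDs out of a parsed esearch JSON response."""
    return list(payload.get("esearchresult", {}).get("idlist", []))


url = build_gene_search_url("TP53", "Homo sapiens")

# Hand-written sample in the shape of an esearch JSON response (not fetched).
sample_payload = {"esearchresult": {"count": "1", "idlist": ["7157"]}}
gene_ids = extract_gene_ids(sample_payload)
```

An LLM tool wrapper would typically fetch `url`, parse the JSON body, and pass the extracted IDs to a follow-up call or back to the model; the network step is left out here so the sketch runs offline.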