



Large language models identify causal genes in complex trait GWAS.

Suyash S Shringarpure¹, Wei Wang², Sotiris Karagounis²

  • ¹ 23andMe Inc., Palo Alto, CA, USA, suyashss@gmail.com.

Pacific Symposium on Biocomputing | February 27, 2026
Summary
This summary is machine-generated.

Large language models (LLMs) accurately identify causal genes at genome-wide association study (GWAS) loci. These models offer a scalable and generalizable approach to accelerate genetic discovery for complex traits.



Area of Science:

  • Genetics
  • Bioinformatics
  • Computational Biology

Background:

  • Identifying causal genes at genome-wide association study (GWAS) loci is crucial for understanding complex traits but remains a significant challenge.
  • Current literature-mining methods often lack the accuracy and scalability needed for comprehensive genetic analysis.

Purpose of the Study:

  • To evaluate the effectiveness of large language models (LLMs) in prioritizing likely causal genes at GWAS loci.
  • To compare LLM performance against existing state-of-the-art methods and assess their generalizability to novel loci.

Main Methods:

  • Systematic evaluation of general-purpose LLMs using benchmark datasets of high-confidence causal genes.
  • Evaluation on a dataset drawn from 23 unpublished GWAS, whose loci are absent from the published literature, to test performance on novel loci.
  • Assessment of LLM performance when integrated with existing genetic analysis methods.
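
The benchmark evaluation described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the summary does not say which metrics were used, so precision, recall, and F1 over loci are assumptions here, and the names Locus and evaluate_predictions, along with the placeholder gene labels, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Locus:
    """One benchmark locus (hypothetical structure, for illustration only)."""
    locus_id: str
    candidate_genes: tuple[str, ...]  # genes near the GWAS signal
    causal_gene: str                  # high-confidence label from the benchmark


def evaluate_predictions(
    loci: list[Locus],
    predictions: dict[str, Optional[str]],
) -> dict[str, float]:
    """Score one predicted gene per locus against the benchmark labels.

    A missing or None prediction counts as an abstention: it lowers
    recall but does not affect precision.
    """
    made = 0     # loci where the method named a gene
    correct = 0  # loci where the named gene matches the label
    for locus in loci:
        predicted = predictions.get(locus.locus_id)
        if predicted is None:
            continue
        made += 1
        if predicted == locus.causal_gene:
            correct += 1

    precision = correct / made if made else 0.0
    recall = correct / len(loci) if loci else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Placeholder loci and genes; not real data.
    benchmark = [
        Locus("locus1", ("GENE_A", "GENE_B"), "GENE_A"),
        Locus("locus2", ("GENE_C", "GENE_D"), "GENE_D"),
    ]
    llm_output = {"locus1": "GENE_A", "locus2": "GENE_C"}
    print(evaluate_predictions(benchmark, llm_output))
```

Scoring every method with the same function, whether the prediction comes from an LLM or from an existing tool, keeps the comparison on equal footing.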

Main Results:

  • LLMs demonstrated high accuracy in prioritizing causal genes at GWAS loci, outperforming or matching current state-of-the-art methods.
  • LLMs showed robust performance on novel loci, indicating strong generalizability.
  • Integrating LLMs with existing methods significantly enhanced overall causal gene identification performance.
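
One way to picture the integration result is a simple score combination. The summary does not describe how the LLM output was combined with existing methods, so the scheme below is purely an assumption for illustration: an existing method's per-gene scores are min-max normalized within a locus, and the gene picked by the LLM receives a fixed bonus. The function name combine_with_llm_pick and the llm_weight parameter are hypothetical.

```python
from typing import Optional


def combine_with_llm_pick(
    baseline_scores: dict[str, float],
    llm_pick: Optional[str],
    llm_weight: float = 0.5,
) -> str:
    """Return the top gene at one locus after blending two signals.

    baseline_scores: per-gene scores from an existing prioritization method.
    llm_pick: the gene named by the LLM, or None if it abstained.
    llm_weight: share of the combined score given to the LLM pick
                (an assumed knob, not a value from the paper).
    """
    if not baseline_scores:
        raise ValueError("baseline_scores must contain at least one gene")
    if not 0.0 <= llm_weight <= 1.0:
        raise ValueError("llm_weight must be between 0 and 1")

    low = min(baseline_scores.values())
    span = max(baseline_scores.values()) - low

    combined: dict[str, float] = {}
    for gene, score in baseline_scores.items():
        # Min-max normalize so the baseline contributes a value in [0, 1].
        normalized = (score - low) / span if span else 0.0
        bonus = llm_weight if gene == llm_pick else 0.0
        combined[gene] = (1.0 - llm_weight) * normalized + bonus

    # Iterate in sorted order so ties resolve alphabetically.
    return max(sorted(combined), key=lambda gene: combined[gene])


if __name__ == "__main__":
    # Placeholder genes and scores; not real data.
    scores = {"GENE_A": 0.9, "GENE_B": 0.6, "GENE_C": 0.1}
    print(combine_with_llm_pick(scores, llm_pick="GENE_B"))
```

With a small llm_weight the existing method dominates, and with a large one the LLM pick overrides it. In practice the weight would be tuned on held-out benchmark loci.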

Conclusions:

  • LLMs provide an accurate, scalable, and generalizable approach for causal gene identification in GWAS.
  • This work establishes LLMs as a powerful tool to accelerate the discovery of genes underlying complex traits.
  • LLMs represent a significant advancement in leveraging artificial intelligence for genetic research.