Variant Classification Using Proteomics-Informed Large Language Models Increases Power of Rare Variant Association

Christopher E Gillies1, Joelle Mbatchou1, Lukas Habegger1

  • 1Regeneron Genetics Center, Tarrytown, New York, USA.

Genetic Epidemiology | November 3, 2025
Summary
This summary is machine-generated.

Proteomics data improve large language models (LLMs) that predict damaging genetic variants. The refined predictors enhance the discovery of gene-trait associations in human genetic studies.

Area of Science:

  • Genetics
  • Bioinformatics
  • Proteomics

Background:

  • Rare variant association analysis is crucial for understanding human biology.
  • Predicting the impact of genetic variants is essential for this analysis.
  • Deep learning and large language models (LLMs) show promise in variant impact prediction.

Purpose of the Study:

  • To evaluate and refine LLM predictors of damaging genetic variants using proteomics data.
  • To assess the performance of a proteomics-guided LLM in identifying gene-trait associations.

Main Methods:

  • Used proteomics data from 46,665 individuals covering 2,898 proteins.
  • Developed and refined LLM-based variant predictors.
  • Evaluated model performance on 241 positive control gene-trait pairs and 10 UK Biobank traits.
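
To make the role of a variant predictor in rare variant association concrete, here is a minimal sketch of a gene-level burden test. It is an illustration under stated assumptions, not the study's pipeline: the summary does not say which association test, score scale, or threshold was used. The function names (`damaging_mask`, `carrier_status`, `welch_t`), the 0.8 cutoff, the variant IDs, and all data values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def damaging_mask(scores, threshold):
    """Return the IDs of variants whose predicted-damage score meets the threshold."""
    return {vid for vid, score in scores.items() if score >= threshold}


def carrier_status(genotypes, mask):
    """Collapse per-variant allele counts into one 0/1 burden indicator per person."""
    return [
        int(any(count > 0 for vid, count in person.items() if vid in mask))
        for person in genotypes
    ]


def welch_t(trait, carriers):
    """Welch t-statistic comparing trait values in carriers vs. non-carriers."""
    a = [y for y, c in zip(trait, carriers) if c]
    b = [y for y, c in zip(trait, carriers) if not c]
    if len(a) < 2 or len(b) < 2:
        raise ValueError("need at least two individuals in each group")
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Toy data (hypothetical): predictor scores for three missense variants in one gene.
scores = {"v1": 0.92, "v2": 0.15, "v3": 0.81}

# Rare-allele counts per individual, keyed by variant ID; {} means no rare alleles.
genotypes = [
    {"v1": 1}, {"v2": 1}, {"v3": 1}, {},
    {"v1": 1, "v2": 1}, {"v2": 2}, {}, {"v3": 1},
]

# A quantitative trait (for example, a measured protein level), one value per person.
trait = [2.1, 0.3, 1.8, 0.1, 2.4, -0.2, 0.0, 1.9]

mask = damaging_mask(scores, threshold=0.8)   # variants counted as damaging
carriers = carrier_status(genotypes, mask)    # who carries a masked variant
t_stat = welch_t(trait, carriers)             # carrier vs. non-carrier contrast
print(sorted(mask), carriers, round(t_stat, 2))
```

The point of the sketch is that the predictor only decides which variants enter the mask. A predictor that separates damaging from benign variants more accurately puts fewer neutral variants into the carrier group, which is one plausible way better classification could translate into more power for the same test.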

Main Results:

  • The proteomics-guided LLM outperformed conventional and machine learning methods in identifying damaging missense variants.
  • The model recapitulated 36.5% of known gene-trait associations, a higher share than the alternative methods.
  • 177 novel gene-trait associations were identified using the model on UK Biobank data.

Conclusions:

  • Proteomics data can effectively refine LLM variant classification.
  • This approach significantly improves the discovery potential of human genetic studies.