GraphyloVar: Predicting the impact of non-coding variants using a multi-species sequence model.

Dongjoon Lim (1), Mathieu Blanchette (1)

  • (1) School of Computer Science, McGill University, 3480 Rue University, Montreal, QC H3A 2A7, Canada.

Bioinformatics (Oxford, England) | June 22, 2026
Summary
This summary is machine-generated.

GraphyloVar, a new deep learning model, predicts genetic variant effects by analyzing DNA sequences and evolutionary history across species. It outperforms existing methods, aiding precision medicine by identifying crucial non-coding variants.

Keywords:
DNA Foundation Model; Phylogeny-aware Machine Learning; Variant effect prediction

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Predicting the functional impact of genetic variants is crucial for precision medicine.
  • Current tools like CADD, PhyloP, and PhastCons often analyze genomic positions in isolation, potentially missing evolutionary insights.
  • There is a need for models that leverage the evolutionary history connecting multiple species for more accurate variant effect prediction.

Purpose of the Study:

  • To introduce GraphyloVar, a deep learning model designed to predict genetic variant effects.
  • To utilize the phylogenetic tree relating different species as direct input for variant effect prediction.
  • To improve upon existing methods by integrating evolutionary patterns with DNA sequence analysis.

Main Methods:

  • Developed GraphyloVar, a deep learning architecture combining Graph Convolutional Networks (GCNs) for phylogenetic tree processing and Transformer encoders for DNA sequence feature extraction.
  • Pre-trained the model on the TOPMed whole-genome sequencing cohort to predict population-level allele frequencies.
  • Evaluated GraphyloVar's zero-shot performance on held-out variants and its performance after fine-tuning on multiple MPRA benchmark datasets.
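
To make the graph-convolution idea concrete, the following is a minimal sketch of a single GCN propagation step over a toy phylogenetic tree, written in plain Python. It is not the authors' implementation: the five-node tree, the one-hot base features, the identity weight matrix, and all function names are illustrative assumptions, and the Transformer encoder for sequence features is omitted entirely.

```python
import math


def normalized_adjacency(n_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected tree as a dense matrix."""
    adj = [[0.0] * n_nodes for _ in range(n_nodes)]
    for i in range(n_nodes):
        adj[i][i] = 1.0  # self-loop so each node keeps its own features
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0
    degree = [sum(row) for row in adj]
    return [
        [adj[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n_nodes)]
        for i in range(n_nodes)
    ]


def matmul(x, y):
    """Dense matrix product of two lists of lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    hidden = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, value) for value in row] for row in hidden]


# Hypothetical toy phylogeny: nodes 0-2 are extant species (leaves),
# node 3 is the ancestor of species 0 and 1, node 4 is the root.
EDGES = [(0, 3), (1, 3), (3, 4), (2, 4)]

# One-hot base (A, C, G, T) observed at a single alignment column.
# Ancestral nodes start as all zeros because their base is unobserved.
FEATURES = [
    [1.0, 0.0, 0.0, 0.0],  # species 0: A
    [1.0, 0.0, 0.0, 0.0],  # species 1: A
    [0.0, 0.0, 1.0, 0.0],  # species 2: G
    [0.0, 0.0, 0.0, 0.0],  # ancestor of species 0 and 1
    [0.0, 0.0, 0.0, 0.0],  # root
]

# Identity weights (an assumption) so the output shows pure neighbor mixing.
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

A_HAT = normalized_adjacency(5, EDGES)
HIDDEN = gcn_layer(A_HAT, FEATURES, IDENTITY)

for node, row in enumerate(HIDDEN):
    print(node, [round(value, 3) for value in row])
```

After one step, the ancestor of species 0 and 1 carries an "A" signal pooled from its two descendants, while the root picks up the "G" from species 2. In a trained model the weight matrix would be learned and several such layers would be stacked, so information can travel across the whole tree.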

Main Results:

  • GraphyloVar achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.6246 zero-shot on approximately 149 million variants.
  • An ensemble of GraphyloVar with CADD improved the AUROC to 0.6442.
  • After fine-tuning, GraphyloVar achieved the highest AUROC on all 13 MPRA benchmark datasets.
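
For readers unfamiliar with the metric, here is a minimal sketch of how an AUROC is computed and how two predictors might be ensembled, in plain Python. The summary does not say how GraphyloVar and CADD scores were combined, so the rank-averaging used here is an assumption, and the labels and scores are made-up toy values, not data from the study.

```python
def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) formulation; ties get average ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def rank_transform(scores):
    """Replace each score by its 1-based rank; ties are broken by input order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def rank_average_ensemble(scores_a, scores_b):
    """Average the ranks of two predictors so their score scales do not matter."""
    ranks_a = rank_transform(scores_a)
    ranks_b = rank_transform(scores_b)
    return [(a + b) / 2 for a, b in zip(ranks_a, ranks_b)]


# Toy data (hypothetical): 1 = functional variant, 0 = neutral variant.
LABELS = [1, 1, 1, 0, 0, 0]
MODEL_A = [0.9, 0.8, 0.2, 0.7, 0.3, 0.1]  # stand-in for one predictor
MODEL_B = [5.0, 1.0, 6.0, 2.0, 4.0, 3.0]  # stand-in for a second predictor

ENSEMBLE = rank_average_ensemble(MODEL_A, MODEL_B)

print("model A :", round(auroc(LABELS, MODEL_A), 4))
print("model B :", round(auroc(LABELS, MODEL_B), 4))
print("ensemble:", round(auroc(LABELS, ENSEMBLE), 4))
```

On this toy data the two predictors make different mistakes, so the ensemble scores higher than either one alone. That is the general sense in which a combined GraphyloVar and CADD score can exceed each individual score.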

Conclusions:

  • GraphyloVar provides a powerful and complementary approach to variant effect prediction by integrating deep learning with explicit phylogenetic information.
  • The model effectively utilizes the full evolutionary history from multiple species to better identify and prioritize important non-coding variants.
  • The findings suggest GraphyloVar can significantly advance precision medicine efforts by enhancing the understanding of genetic variant impacts.