Interpreting Supervised Machine Learning Inferences in Population Genomics Using Haplotype Matrix Permutations.

Linh N Tran1,2, David Castellano2, Ryan N Gutenkunst2

  • 1Genetics Graduate Interdisciplinary Program, University of Arizona, Tucson, AZ 85721, USA.

Molecular Biology and Evolution | October 6, 2025
Summary
This summary is machine-generated.

We developed a permutation method for interpreting machine learning models in population genomics. Applying it shows that some models rely on specific population genetic features such as haplotype structure and linkage disequilibrium, while others rely on simpler information such as allele frequencies or allele counts.

Area of Science:

  • Population genomics
  • Machine learning
  • Bioinformatics

Background:

  • Supervised machine learning methods, particularly convolutional neural networks (CNNs), are increasingly used for population genomics inference.
  • A major limitation of these methods is their lack of interpretability, hindering biological insights and method development.

Purpose of the Study:

  • To develop a systematic and interpretable framework for understanding the population genetics features driving machine learning model predictions.
  • To assess feature importance in existing CNNs used in population genomics.

Main Methods:

  • Introduced a permutation-based approach to progressively disrupt population genetics features (linkage disequilibrium, haplotype structure, allele frequencies) in haplotype matrices.
  • Measured performance degradation of CNNs after feature disruption to quantify feature importance.
  • Applied the method to three published CNNs for positive selection and demographic history inference.
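
The permutation idea above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the two permutation schemes, the function names, and the `toy_predict` stand-in for a trained CNN are all assumptions made for this example. A haplotype matrix is represented as a list of rows (haplotypes), each holding 0/1 alleles at every site (column).

```python
import random


def shuffle_within_sites(hap, rng):
    """Permute alleles independently within each site (column).

    Each site keeps its allele count, but which haplotype (row) carries
    which allele is scrambled, disrupting haplotype structure and LD.
    """
    columns = [list(col) for col in zip(*hap)]
    for col in columns:
        rng.shuffle(col)
    return [list(row) for row in zip(*columns)]


def shuffle_all_entries(hap, rng):
    """Permute every entry of the matrix.

    Only the total number of derived alleles is kept; per-site allele
    frequencies are disrupted as well.
    """
    n_sites = len(hap[0])
    flat = [allele for row in hap for allele in row]
    rng.shuffle(flat)
    return [flat[i:i + n_sites] for i in range(0, len(flat), n_sites)]


def accuracy(predict, matrices, labels):
    """Fraction of matrices for which predict() returns the true label."""
    hits = sum(predict(m) == y for m, y in zip(matrices, labels))
    return hits / len(labels)


def accuracy_drop(predict, matrices, labels, permute, rng):
    """Accuracy on the original inputs minus accuracy on permuted inputs."""
    permuted = [permute(m, rng) for m in matrices]
    return accuracy(predict, matrices, labels) - accuracy(predict, permuted, labels)


def toy_predict(hap):
    """Hypothetical stand-in for a trained model: uses only the total allele count."""
    total = sum(sum(row) for row in hap)
    return int(total > len(hap) * len(hap[0]) / 2)


rng = random.Random(0)
# Four random 6 x 10 haplotype matrices with different derived-allele probabilities.
matrices = [
    [[int(rng.random() < p) for _ in range(10)] for _ in range(6)]
    for p in (0.2, 0.8, 0.3, 0.7)
]
labels = [toy_predict(m) for m in matrices]
drop = accuracy_drop(toy_predict, matrices, labels, shuffle_all_entries, rng)
```

Because `toy_predict` looks only at the total allele count, which both permutations preserve, its accuracy does not change under either scheme. A model that depends on haplotype structure or linkage disequilibrium would be expected to lose accuracy under `shuffle_within_sites`, and the size of that loss is the feature-importance signal.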

Main Results:

  • The ImaGene CNN for positive selection heavily relies on haplotype structure and linkage disequilibrium.
  • A demographic inference CNN primarily uses allele frequency information.
  • The disc-pg-gan CNN achieved high accuracy with only allele counts, suggesting potential limitations in its learned features.
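
For readers less familiar with the features named in these results, the sketch below computes per-site allele frequencies and the standard pairwise linkage disequilibrium statistic r² from a small haplotype matrix. The function names and the example matrix are assumptions made for illustration and are not taken from the paper.

```python
def allele_frequencies(hap):
    """Derived-allele frequency at each site (column) of a 0/1 haplotype matrix."""
    n = len(hap)
    return [sum(col) / n for col in zip(*hap)]


def r_squared(hap, i, j):
    """Pairwise LD between sites i and j: r^2 = D^2 / (p_i (1 - p_i) p_j (1 - p_j))."""
    n = len(hap)
    p_i = sum(row[i] for row in hap) / n
    p_j = sum(row[j] for row in hap) / n
    p_ij = sum(row[i] * row[j] for row in hap) / n  # frequency of the 1-1 haplotype
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return float("nan")  # undefined when either site is monomorphic
    d = p_ij - p_i * p_j
    return d * d / denom


# Rows are haplotypes, columns are sites.
hap = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [0, 0, 1],
]
freqs = allele_frequencies(hap)   # every site has frequency 0.5
ld_01 = r_squared(hap, 0, 1)      # sites 0 and 1 always co-occur: r^2 = 1.0
ld_02 = r_squared(hap, 0, 2)      # sites 0 and 2 are independent: r^2 = 0.0
```

All three sites share the same allele frequency, yet the pairs differ completely in r². This is why a model that uses only allele frequencies or counts cannot distinguish patterns that a model sensitive to linkage disequilibrium can.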

Conclusions:

  • The permutation approach is a model-agnostic, biologically motivated framework for interpreting methods based on haplotype matrices.
  • It provides insight into feature importance that can guide future method development and application in population genomics.
  • It highlights that different CNNs use population genetic information in different ways.