Interpreting Supervised Machine Learning Inferences in Population Genomics Using Haplotype Matrix Permutations.

Linh N Tran¹,², David Castellano², Ryan N Gutenkunst²

¹Genetics Graduate Interdisciplinary Program, University of Arizona, Tucson, AZ 85721, USA.

Molecular Biology and Evolution

October 6, 2025

Summary

This summary is machine-generated.

We developed a permutation method to interpret population genomics machine learning models. This approach reveals that some models rely on specific genetic features like haplotype structure, while others use simpler data.

Area of Science:

Population genomics
Machine learning
Bioinformatics

Background:

Supervised machine learning, particularly with convolutional neural networks (CNNs), is increasingly used for population genomics inference.
A major limitation of these methods is their lack of interpretability, hindering biological insights and method development.

Purpose of the Study:

To develop a systematic and interpretable framework for understanding the population genetics features driving machine learning model predictions.
To assess the feature importance for existing CNNs used in population genomics.

Main Methods:

Introduced a permutation-based approach to progressively disrupt population genetics features (linkage disequilibrium, haplotype structure, allele frequencies) in haplotype matrices.
Measured performance degradation of CNNs after feature disruption to quantify feature importance.
Applied the method to three published CNNs for positive selection and demographic history inference.
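
The permutation idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the summary does not specify the exact permutation schemes, so the two disruptions below (shuffling alleles within each site, and shuffling the order of sites) are illustrative choices, and the function names `shuffle_within_sites`, `shuffle_site_order`, and `accuracy_drop` are hypothetical. A haplotype matrix is assumed to be a 0/1 array with haplotypes as rows and segregating sites as columns, and `predict` stands in for any trained classifier.

```python
import numpy as np


def shuffle_within_sites(hap, rng):
    """Permute alleles independently within each site (column).

    Per-site allele counts are unchanged, but the assignment of alleles
    to haplotypes is scrambled, which breaks linkage disequilibrium and
    haplotype structure. (Illustrative scheme, assumed for this sketch.)
    """
    return rng.permuted(hap, axis=0)


def shuffle_site_order(hap, rng):
    """Permute the order of sites (columns) while keeping each haplotype intact.

    Each haplotype keeps its alleles and every site keeps its allele count,
    but the physical ordering of sites along the sequence is lost.
    (Illustrative scheme, assumed for this sketch.)
    """
    order = rng.permutation(hap.shape[1])
    return hap[:, order]


def accuracy_drop(predict, matrices, labels, disrupt, rng):
    """Return baseline accuracy minus accuracy on disrupted inputs.

    `predict` is a hypothetical callable mapping one haplotype matrix to a
    class label; a large drop suggests the model relies on whatever the
    disruption destroyed.
    """
    base = np.mean([predict(m) == y for m, y in zip(matrices, labels)])
    pert = np.mean([predict(disrupt(m, rng)) == y for m, y in zip(matrices, labels)])
    return float(base - pert)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy data: random 0/1 matrices, 8 haplotypes by 12 sites.
    matrices = [rng.integers(0, 2, size=(8, 12)) for _ in range(20)]

    # Toy "model" that only looks at the total derived-allele count.
    def predict(m):
        return int(m.sum() > 48)

    labels = [predict(m) for m in matrices]
    for disrupt in (shuffle_within_sites, shuffle_site_order):
        print(disrupt.__name__, accuracy_drop(predict, matrices, labels, disrupt, rng))
```

The toy predictor uses only the total allele count, which neither disruption changes, so its measured accuracy drop is zero under both schemes. A model that depended on haplotype structure or linkage disequilibrium would be expected to degrade under the within-site shuffle.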

Main Results:

The ImaGene CNN for positive selection heavily relies on haplotype structure and linkage disequilibrium.
A demographic inference CNN primarily uses allele frequency information.
The disc-pg-gan CNN achieved high accuracy with only allele counts, suggesting potential limitations in its learned features.

Conclusions:

The permutation approach is a model-agnostic, biologically motivated framework for interpreting haplotype matrix-based methods.
It provides insight into feature importance, which can guide future method development and application in population genomics.
It highlights variability in how different CNNs use population genetic information.