



Inferring feature importance with uncertainties with application to large genotype data.

Pål Vegard Johnsen1,2, Inga Strümke3,4, Mette Langaas2

  • 1SINTEF DIGITAL, Oslo, Norway.

PLOS Computational Biology | March 14, 2023
Summary
This summary is machine-generated.

We introduce Sub-SAGE, a Shapley-value-based method for estimating feature importance and its uncertainty. This approach efficiently identifies key predictors in data-generating processes, particularly for tree-based models.


Area of Science:

  • Machine Learning
  • Statistical Modeling
  • Bioinformatics

Background:

  • Estimating feature importance is crucial for understanding data-driven models and the underlying data generation process.
  • Existing methods like Shapley additive global importance (SAGE) can be computationally intensive.
  • There's a need for efficient and uncertainty-aware feature importance estimation.
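To illustrate why Shapley-value-based importance can be computationally intensive, the sketch below computes exact Shapley values by enumerating every feature subset, which scales as 2^M for M features. It is a minimal illustration, not the SAGE or Sub-SAGE estimator: the function `shapley_values` and the toy value function `toy_value` (standing in for "loss reduction from knowing a feature subset") are hypothetical names and numbers chosen for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_features, value):
    """Exact Shapley values for a set function `value` over feature indices.

    Enumerates all subsets of the other features for each feature,
    so the cost grows exponentially with n_features.
    """
    phi = [0.0] * n_features
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        for size in range(len(others) + 1):
            # Shapley weight shared by all coalitions of this size
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature j to coalition s
                phi[j] += weight * (value(s | {j}) - value(s))
    return phi


def toy_value(subset):
    """Hypothetical value function: made-up gains plus one interaction."""
    gain = {0: 3.0, 1: 1.0, 2: 0.0}
    total = sum(gain[k] for k in subset)
    if 0 in subset and 1 in subset:
        total += 2.0  # interaction shared between features 0 and 1
    return total


phi = shapley_values(3, toy_value)
total_value = toy_value(frozenset({0, 1, 2})) - toy_value(frozenset())
```

The values in `phi` sum to `total_value` (the efficiency property of Shapley values), and the interaction gain is split evenly between features 0 and 1. With three features the enumeration is trivial; with thousands of genotype features it is infeasible, which is the motivation for cheaper estimators.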

Purpose of the Study:

  • To present a Shapley-value-based framework, Sub-SAGE, for inferring individual feature importance with uncertainty.
  • To develop a computationally efficient method for tree-based models, avoiding resampling.
  • To demonstrate the framework's applicability on synthetic and large-scale genotype data.

Main Methods:

  • Developed Sub-SAGE, a feature importance estimator that builds on SAGE.
  • Estimated the uncertainty of the Sub-SAGE estimator by bootstrapping, demonstrated with tree ensemble methods.
  • Applied the framework to synthetic data and large genotype datasets.
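The bootstrapping step can be sketched generically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the importance measure here is a simple stand-in (the rise in squared-error loss when a feature is replaced by its sample mean), not Sub-SAGE, and the names `mse_increase`, `bootstrap_interval`, the synthetic data, and the fixed toy model are all hypothetical.

```python
import random


def mean_sq_loss(rows, model):
    """Mean squared error of `model` over (features, target) rows."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


def mse_increase(rows, model, j):
    """Stand-in importance: loss increase when feature j is set to its mean."""
    mean_j = sum(x[j] for x, _ in rows) / len(rows)
    masked = [(x[:j] + [mean_j] + x[j + 1:], y) for x, y in rows]
    return mean_sq_loss(masked, model) - mean_sq_loss(rows, model)


def bootstrap_interval(rows, model, j, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the importance of feature j."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        # Resample rows with replacement and recompute the importance
        sample = [rows[rng.randrange(len(rows))] for _ in rows]
        stats.append(mse_increase(sample, model, j))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic data (assumption): the target depends on feature 0 only.
data_rng = random.Random(1)
rows = []
for _ in range(200):
    x = [data_rng.gauss(0, 1), data_rng.gauss(0, 1)]
    rows.append((x, 2.0 * x[0] + data_rng.gauss(0, 0.5)))


def model(x):
    """Fixed toy model that ignores feature 1."""
    return 2.0 * x[0]


lo0, hi0 = bootstrap_interval(rows, model, 0)
lo1, hi1 = bootstrap_interval(rows, model, 1)
```

In this toy setup the interval for feature 0 lies well above zero, while the interval for feature 1 collapses to zero because the model never uses it. An interval that excludes zero is the kind of evidence that lets an importance estimate be reported with its uncertainty rather than as a single number.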

Main Results:

  • Sub-SAGE can be estimated for tree-based models without resampling, which keeps the computation efficient.
  • Bootstrapping provides uncertainty estimates for the Sub-SAGE estimator.
  • Applied to large genotype data, the framework estimated feature importance with respect to obesity.

Conclusions:

  • Sub-SAGE offers a robust and computationally efficient method for feature importance and uncertainty estimation.
  • The framework is valuable for interpreting complex models and understanding biological data.
  • This approach enhances the explainability of machine learning models in scientific research.