SNP interaction detection with Random Forests in high-dimensional genetic data.

Stacey J Winham¹, Colin L Colby, Robert R Freimuth

¹Department of Health Sciences Research, Mayo Clinic, 200 First Street Southwest, Rochester, MN 55905, USA. winham.stacey@mayo.edu

BMC Bioinformatics

July 17, 2012

Summary

This summary is machine-generated.

Random Forests (RF) can identify gene interactions in low-dimensional data. However, in high-dimensional genome-wide association studies, RF variable importance measures often fail to detect these interactions, limiting their use as a filtering technique.

Area of Science:

Genetics
Bioinformatics
Computational Biology

Background:

Genome-wide association studies (GWAS) aim to identify genetic variants linked to complex human traits.
Univariate analysis in GWAS often overlooks gene-gene interactions, a key factor in complex trait etiology.
Random Forests (RF) offer a data-mining approach capable of modeling interactions in high-dimensional datasets.
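
To illustrate why a univariate test can miss an interaction, here is a minimal sketch on a constructed, hypothetical dataset (not the study's data). Two SNPs are coded as binary carrier status, and case status is a pure XOR of the two, so neither SNP has a marginal effect. The Pearson chi-square test is used as a simple stand-in for univariate logistic regression, and all function names are illustrative assumptions.

```python
import math
from itertools import product

def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_tot[i] * col_tot[j] / n
            stat += (obs - exp) ** 2 / exp
    return stat

# Hypothetical balanced cohort: 100 individuals per (snp_a, snp_b) combination.
# Case status is the XOR of the two carrier indicators (a pure interaction).
samples = [(a, b, a ^ b) for a, b in product((0, 1), repeat=2) for _ in range(100)]

def marginal_table(snp_index):
    """2x2 table of carrier status (rows) by case status (columns) for one SNP."""
    table = [[0, 0], [0, 0]]
    for s in samples:
        table[s[snp_index]][s[2]] += 1
    return table

def joint_table():
    """4x2 table of genotype combination (rows) by case status (columns)."""
    table = [[0, 0] for _ in range(4)]
    for a, b, case in samples:
        table[2 * a + b][case] += 1
    return table

chi_a = chi_square(marginal_table(0))      # one SNP at a time
chi_b = chi_square(marginal_table(1))
chi_joint = chi_square(joint_table())      # both SNPs considered together
p_a = math.erfc(math.sqrt(chi_a / 2))      # upper-tail p-value for 1 degree of freedom

print(chi_a, chi_b, chi_joint, p_a)
```

In this idealized design each single-SNP table is perfectly balanced, so the univariate statistic carries no signal, while the joint table separates cases from controls completely.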

Purpose of the Study:

To investigate the effectiveness of RF variable importance measures in detecting gene-gene interactions within high-dimensional genetic data.
To compare the performance of RF-based filtering against traditional p-values from univariate logistic regression, especially under increasing data dimensionality.

Main Methods:

Utilized Random Forests (RF) analysis to assess variable importance for single nucleotide polymorphisms (SNPs).
Evaluated the power of RF variable importance rankings to detect gene-gene interaction effects.
Compared RF performance with p-values derived from univariate logistic regression in simulated high-dimensional datasets.
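
The ranking step described above can be sketched with a toy forest. This is not the authors' software or simulation design: the summary does not say which importance measure was used, so the sketch assumes an impurity-based (Gini) importance, binary carrier-coded SNPs, a hypothetical `simulate` helper in which SNPs 0 and 1 interact with 10% label noise, and arbitrary settings for `n_trees`, `mtry`, and `max_depth`.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def grow(rows, y, idx, mtry, depth, importance, rng):
    """Split the node holding sample indices `idx`, crediting the chosen SNP
    with the sample-weighted impurity decrease, then recurse on the children."""
    parent = gini([y[i] for i in idx])
    if depth == 0 or parent == 0.0:
        return
    best = None
    for f in rng.sample(range(len(rows[0])), mtry):  # random candidate SNPs
        left = [i for i in idx if rows[i][f] == 0]
        right = [i for i in idx if rows[i][f] == 1]
        if not left or not right:
            continue
        child = (len(left) * gini([y[i] for i in left])
                 + len(right) * gini([y[i] for i in right])) / len(idx)
        gain = parent - child
        if best is None or gain > best[0]:
            best = (gain, f, left, right)
    if best is None:
        return
    gain, f, left, right = best
    importance[f] += gain * len(idx)
    grow(rows, y, left, mtry, depth - 1, importance, rng)
    grow(rows, y, right, mtry, depth - 1, importance, rng)

def rf_importance(rows, y, n_trees=100, mtry=2, max_depth=4, seed=0):
    """Impurity-based importance per SNP, normalized to sum to 1."""
    rng = random.Random(seed)
    n = len(rows)
    importance = [0.0] * len(rows[0])
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        grow(rows, y, boot, mtry, max_depth, importance, rng)
    total = sum(importance)
    return [v / total for v in importance]

def simulate(n, p, seed=1):
    """Hypothetical data: p binary SNPs; case status is the XOR of SNPs 0 and 1
    with 10% of labels flipped. The remaining SNPs are noise."""
    rng = random.Random(seed)
    rows = [[rng.randint(0, 1) for _ in range(p)] for _ in range(n)]
    y = [(r[0] ^ r[1]) ^ (rng.random() < 0.1) for r in rows]
    return rows, y

rows, y = simulate(n=400, p=6)
importance = rf_importance(rows, y)
ranking = sorted(range(len(importance)), key=lambda f: -importance[f])
print(ranking)  # SNP indices ordered from most to least important
```

Raising `p` in `simulate` while keeping `n` fixed gives a rough way to explore how the rank of the interacting SNPs changes with dimensionality, although a pure-Python forest like this is far too slow for genome-wide scale.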

Main Results:

RF successfully identified interactions in low-dimensional datasets.
As data dimensionality increased, the detection probability for interacting SNPs decreased more rapidly than for non-interacting SNPs.
RF variable importance measures in high-dimensional data primarily captured marginal effects, not interaction effects.
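
One way to see why a greedy split criterion favors marginal effects is to compute the impurity decrease directly. The sketch below assumes Gini impurity as the split criterion and uses hypothetical counts (100 individuals per genotype combination, case status equal to the XOR of two carrier-coded SNPs). Neither the counts nor the function names come from the study.

```python
def gini(cases, controls):
    """Gini impurity of a node with the given case and control counts."""
    n = cases + controls
    if n == 0:
        return 0.0
    p = cases / n
    return 2 * p * (1 - p)

def gini_decrease(left, right):
    """Impurity decrease of a split; `left` and `right` are (cases, controls)."""
    n_left, n_right = sum(left), sum(right)
    parent = gini(left[0] + right[0], left[1] + right[1])
    child = (n_left * gini(*left) + n_right * gini(*right)) / (n_left + n_right)
    return parent - child

# Root node, split on SNP A alone: both children hold 100 cases and 100 controls.
root_gain = gini_decrease((100, 100), (100, 100))

# Node already restricted to SNP A non-carriers, split on SNP B:
# B non-carriers are all controls, B carriers are all cases.
conditional_gain = gini_decrease((0, 100), (100, 0))

print(root_gain, conditional_gain)  # 0.0 0.5
```

At the root the interacting SNP offers no gain and so competes on equal terms with noise SNPs. Its contribution only becomes visible once a tree has already split on its partner.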

Conclusions:

RF is a valuable technique for analyzing multiple variables simultaneously, extending beyond univariate methods.
In high-dimensional genomic data, RF variable importance measures do not effectively detect gene-gene interactions unless the interacting SNPs also have a strong marginal effect.
The utility of RF as a filtering approach for identifying interactions in genome-wide data is limited.