
Screening large-scale association study data: exploiting interactions using random forests.

Kathryn L Lunetta (1), L Brooke Hayward, Jonathan Segal

  • (1) Oscient Pharmaceuticals, Inc. (formerly Genome Therapeutics Corporation), Waltham, Massachusetts, USA. klunetta@bu.edu

BMC Genetics | December 14, 2004
Summary
This summary is machine-generated.

Random forest analysis can screen large numbers of single nucleotide polymorphisms (SNPs) in genetic association studies. When risk SNPs interact, it outperforms univariate tests such as the Fisher Exact test, so fewer SNPs need to be carried forward to find the risk markers.

Area of Science:

  • Genetics
  • Bioinformatics
  • Computational Biology

Background:

  • Genome-wide association studies (GWAS) generate vast amounts of single nucleotide polymorphism (SNP) data.
  • Univariate screening methods may miss SNPs with interaction effects but small marginal effects.
  • Pre-specifying interactions in models is impractical for large SNP datasets.
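
To make the second point concrete, here is a minimal sketch of an extreme case in which two SNPs interact but neither shows any marginal effect at all. The binary carrier coding, the carrier frequency of 0.5, and the penetrance values are assumptions chosen for illustration; they are not taken from the study.

```python
# Hypothetical two-locus model: each SNP is coded 0/1 (non-carrier/carrier),
# the carrier frequency is 0.5 at both loci, and the loci are independent.
CARRIER_FREQ = 0.5

# Assumed penetrances: risk is elevated only when exactly one locus
# carries the variant (a pure interaction, chosen for illustration).
PENETRANCE = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.1}


def genotype_prob(g):
    """Probability of genotype g (0 or 1) at a single locus."""
    return CARRIER_FREQ if g == 1 else 1.0 - CARRIER_FREQ


def marginal_risk(locus, genotype):
    """Disease risk given the genotype at one locus, averaged over the other."""
    total = 0.0
    for other in (0, 1):
        pair = (genotype, other) if locus == 0 else (other, genotype)
        total += genotype_prob(other) * PENETRANCE[pair]
    return total


for locus in (0, 1):
    for g in (0, 1):
        print(f"locus {locus}, genotype {g}: "
              f"marginal risk = {marginal_risk(locus, g):.2f}")
```

All four lines print a marginal risk of 0.30, so a test that looks at one SNP at a time has nothing to detect here, even though the joint genotype changes risk fivefold.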

Purpose of the Study:

  • To evaluate random forest analysis as a screening procedure for identifying risk-associated SNPs.
  • To compare the performance of random forests against univariate tests in complex disease models.

Main Methods:

  • Utilized random forest analysis, a non-parametric machine learning method.
  • Assessed SNP importance measures considering multi-locus interactions.
  • Simulated complex disease models with up to 32 loci, including genetic heterogeneity.
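
The general idea of ranking SNPs by a random forest importance measure can be sketched as follows. This is a toy, standard-library illustration and not the authors' implementation: the summary does not say which importance measure or software was used, so the out-of-bag permutation importance below is an assumed (though common) choice. The function names, the two-SNP "both carriers" risk model, and all parameter values are hypothetical and far smaller than the simulations with up to 32 loci.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def build_tree(rows, labels, depth, mtry, rng):
    """Grow a small classification tree on 0/1 features (nested tuples)."""
    majority = int(2 * sum(labels) >= len(labels))
    if depth == 0 or len(set(labels)) < 2:
        return ("leaf", majority)
    best = None
    # Random forests try only a random subset of features at each node.
    for j in rng.sample(range(len(rows[0])), mtry):
        left = [y for x, y in zip(rows, labels) if x[j] == 0]
        right = [y for x, y in zip(rows, labels) if x[j] == 1]
        if not left or not right:
            continue
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, j)
    if best is None:
        return ("leaf", majority)
    j = best[1]
    parts = {0: ([], []), 1: ([], [])}
    for x, y in zip(rows, labels):
        parts[x[j]][0].append(x)
        parts[x[j]][1].append(y)
    return ("node", j,
            build_tree(parts[0][0], parts[0][1], depth - 1, mtry, rng),
            build_tree(parts[1][0], parts[1][1], depth - 1, mtry, rng))


def predict(tree, x):
    """Follow splits down to a leaf and return its 0/1 prediction."""
    while tree[0] == "node":
        tree = tree[2] if x[tree[1]] == 0 else tree[3]
    return tree[1]


def forest_importance(rows, labels, n_trees=100, depth=3, mtry=3, seed=0):
    """Mean drop in out-of-bag accuracy when each feature is permuted."""
    rng = random.Random(seed)
    n, n_feat = len(rows), len(rows[0])
    importance = [0.0] * n_feat
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        in_bag = set(boot)
        oob = [i for i in range(n) if i not in in_bag]
        tree = build_tree([rows[i] for i in boot],
                          [labels[i] for i in boot], depth, mtry, rng)
        base = sum(predict(tree, rows[i]) == labels[i] for i in oob)
        for j in range(n_feat):
            shuffled = [rows[i][j] for i in oob]
            rng.shuffle(shuffled)
            hits = 0
            for i, v in zip(oob, shuffled):
                x = list(rows[i])
                x[j] = v  # replace feature j with a permuted value
                hits += predict(tree, x) == labels[i]
            importance[j] += (base - hits) / max(len(oob), 1) / n_trees
    return importance


def simulate(n=600, n_snps=8, seed=1):
    """Toy data: risk is high only when SNP 0 and SNP 1 are both carried."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in range(n_snps)]
        risk = 0.9 if x[0] == 1 and x[1] == 1 else 0.1
        rows.append(x)
        labels.append(int(rng.random() < risk))
    return rows, labels


rows, labels = simulate()
scores = forest_importance(rows, labels)
ranking = sorted(range(len(scores)), key=lambda j: -scores[j])
print("SNPs ranked by importance:", ranking)
```

Because each tree splits on one SNP and then on another within the resulting subgroups, a SNP's importance reflects its contribution in combination with other SNPs, without any interaction term being specified in advance. In practice an established random forest library would replace this hand-rolled version.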

Main Results:

  • The random forest importance measure significantly outperformed the Fisher Exact test when risk SNPs interacted.
  • The performance advantage grew as the number of interacting SNPs increased.
  • Random forests performed similarly to the Fisher Exact test when SNPs did not interact.
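
For reference, the univariate baseline in this comparison tests one SNP at a time. A minimal sketch of a two-sided Fisher exact test on a 2x2 carrier-by-disease table is shown below. The 0/1 carrier coding, the function names, and the eight-person toy dataset are assumptions for illustration; the summary does not describe how the study coded genotypes or built its tables.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (the small factor guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def screen_snps(genotypes, status):
    """One p-value per SNP column; genotypes are 0/1 carrier codes."""
    pvals = []
    for j in range(len(genotypes[0])):
        counts = {(g, s): 0 for g in (0, 1) for s in (0, 1)}
        for row, s in zip(genotypes, status):
            counts[(row[j], s)] += 1
        # Table rows: carriers, non-carriers. Columns: cases, controls.
        pvals.append(fisher_exact_two_sided(counts[(1, 1)], counts[(1, 0)],
                                            counts[(0, 1)], counts[(0, 0)]))
    return pvals


# Hypothetical toy data: 8 individuals, 2 SNPs, status 1 = case, 0 = control.
genotypes = [[1, 1], [1, 0], [1, 0], [1, 1], [0, 1], [0, 0], [0, 0], [0, 1]]
status = [1, 1, 1, 0, 1, 0, 0, 0]
pvals = screen_snps(genotypes, status)
print(pvals)
```

Each SNP is scored only on its own table, which is why this kind of screen can match a random forest when SNPs act independently yet lose ground when risk depends on combinations of SNPs.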

Conclusions:

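  • Random forest analysis is an effective screening tool for large-scale genetic studies in which SNP interactions are unknown: it outperforms univariate tests when risk SNPs interact and performs comparably when they do not.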
  • It significantly reduces the number of SNPs requiring further investigation compared to univariate methods.
  • This approach is valuable for identifying risk-associated SNPs in complex diseases.