Methods to impute missing genotypes for population data.

Zhaoxia Yu¹, Daniel J Schaid

  • ¹Department of Statistics, University of California, Irvine, CA 92697, USA. yu.zhaoxia@ics.uci.edu

Human Genetics | September 14, 2007
Summary
This summary is machine-generated.

Missing genotypes in large-scale studies can reduce statistical power and bias association analyses. Of the eight imputation methods compared, fastPHASE and LM.lars performed best at inferring missing genotypes, with fastPHASE having the lowest error rates.

Area of Science:

  • Genetics
  • Bioinformatics
  • Statistical Genomics

Background:

  • Large-scale genotyping studies frequently encounter missing genetic markers.
  • Missing data complicates association analyses, leading to reduced statistical power and biased results.

Purpose of the Study:

  • To evaluate and compare the accuracy of eight different methods for inferring missing genotypes.
  • To identify the most effective imputation methods for large-scale genetic studies.

Main Methods:

  • Haplotype reconstruction: Expectation-Maximization (EM), fastPHASE.
  • K-nearest neighbor: k-nearest neighbor (KNN), weighted k-nearest neighbor (wtKNN).
  • Linear regression: LM.back, LM.lars, LM.svd.
  • Regression tree: Rtree.
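
To make the nearest-neighbor family above concrete, here is a minimal sketch of KNN genotype imputation. It is an illustration only, not the paper's implementation: the 0/1/2 minor-allele coding, the mean-absolute-difference distance, the majority vote, and the name `knn_impute` are all assumptions made for this example.

```python
import numpy as np

def knn_impute(genotypes, k=3):
    """Fill missing genotypes (np.nan) by majority vote among the k most
    similar individuals. Rows are individuals, columns are markers, and
    genotypes are assumed to be coded 0, 1, or 2 (hypothetical coding)."""
    g = np.asarray(genotypes, dtype=float)
    imputed = g.copy()
    n = g.shape[0]
    for i in range(n):
        missing = np.where(np.isnan(g[i]))[0]
        if missing.size == 0:
            continue
        # Distance to every other individual: mean absolute difference
        # over the markers that both individuals have observed.
        dists = np.full(n, np.inf)
        for j in range(n):
            if j == i:
                continue
            shared = ~np.isnan(g[i]) & ~np.isnan(g[j])
            if shared.any():
                dists[j] = np.abs(g[i, shared] - g[j, shared]).mean()
        order = np.argsort(dists, kind="stable")
        for m in missing:
            # Closest individuals that have an observed genotype at marker m.
            donors = [j for j in order
                      if np.isfinite(dists[j]) and not np.isnan(g[j, m])][:k]
            if donors:
                votes = np.bincount(g[donors, m].astype(int), minlength=3)
                imputed[i, m] = votes.argmax()  # ties go to the lower code
    return imputed

panel = [
    [0, 0, 1, 2],
    [0, 0, 1, np.nan],  # last marker is missing for this individual
    [2, 2, 0, 0],
    [0, 0, 1, 2],
    [2, 2, 0, 0],
]
filled = knn_impute(panel, k=3)
```

A weighted variant could, for example, scale each donor's vote by inverse distance. Whether that matches the paper's wtKNN is not stated in this summary.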

Main Results:

  • fastPHASE demonstrated the lowest error rates across various panels and marker densities.
  • LM.lars provided highly accurate genotype imputation, performing better than other regression and KNN methods.
  • The performance of methods varied depending on the specific dataset and parameters used.
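
The error rates above are not defined in this summary. One common way to measure such a rate is to hide genotypes whose true values are known, impute them, and count the mismatches. The sketch below assumes that definition. The helper names and the per-marker mode imputer are hypothetical stand-ins, and the mode imputer is a simple baseline, not one of the eight methods compared.

```python
import numpy as np

def mask_genotypes(genotypes, fraction, seed=0):
    """Hide roughly `fraction` of the genotypes at random; return the
    masked copy and the boolean mask of hidden cells."""
    rng = np.random.default_rng(seed)
    g = np.asarray(genotypes, dtype=float)
    mask = rng.random(g.shape) < fraction
    masked = g.copy()
    masked[mask] = np.nan
    return masked, mask

def mode_impute(genotypes):
    """Baseline imputer (assumption for illustration): fill each missing
    genotype with the most common observed genotype at that marker."""
    g = np.asarray(genotypes, dtype=float).copy()
    for m in range(g.shape[1]):
        col = g[:, m]  # view into g, so assignments below update g
        observed = col[~np.isnan(col)].astype(int)
        if observed.size:
            col[np.isnan(col)] = np.bincount(observed, minlength=3).argmax()
    return g

def error_rate(truth, imputed, mask):
    """Proportion of hidden genotypes that were imputed incorrectly."""
    truth = np.asarray(truth, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        raise ValueError("no masked genotypes to score")
    return float(np.mean(truth[mask] != imputed[mask]))

truth = np.array([[0, 1], [0, 1], [0, 2], [1, 1]], dtype=float)
mask = np.array([[True, False], [False, False], [False, True], [True, False]])
hidden = truth.copy()
hidden[mask] = np.nan
rate = error_rate(truth, mode_impute(hidden), mask)
```

Swapping `mode_impute` for any other imputer with the same input and output shape allows methods to be compared on the same masked cells.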

Conclusions:

  • fastPHASE is a highly accurate method for imputing missing genotypes in large-scale studies.
  • LM.lars offers a competitive alternative with strong performance.
  • Method selection should consider study-specific factors for optimal genotype imputation.