


IGD: a simple, efficient genotype data format.

Drew DeHaas¹, Xinzhu Wei¹

  • ¹ Department of Computational Biology, Cornell University, Ithaca, NY 14850, United States.

Bioinformatics Advances | September 22, 2025
Summary
This summary is machine-generated.

Indexable Genotype Data (IGD) is a new, efficient file format for genotype data. It offers significant speed and size improvements over existing formats like VCF.gz for large-scale population genetics research.

Area of Science:

  • Genomics
  • Bioinformatics
  • Computational Biology

Background:

  • Existing file formats for genotype data are often complex and inefficient.
  • Limited programming language support hinders research scalability.
  • There is a need for a simple, fast, and small file format for population genetics.

Purpose of the Study:

  • To introduce the Indexable Genotype Data (IGD) file format.
  • To provide a simple, efficient, and scalable solution for storing genotype data.
  • To facilitate research in population genetics and statistical genomics.

Main Methods:

  • Developed a new uncompressed binary file format: Indexable Genotype Data (IGD).
  • Implemented IGD reading/writing in Python with under 350 lines of code.
  • Created a C++ library and tools for converting VCF.gz to IGD.
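
To make the idea of an uncompressed, indexable binary layout concrete, here is a minimal Python sketch of a toy format. It is not the IGD specification: the magic bytes, header fields, per-variant offset table, and sparse carrier-list records are all assumptions chosen for illustration, and `write_toy` and `read_variant` are hypothetical names. It only shows why such a layout allows a reader to jump straight to variant `i` without decompressing or scanning the preceding records.

```python
import io
import struct

# Hypothetical toy layout (NOT the real IGD layout):
#   header : magic, number of variants, number of samples
#   index  : one 8-byte file offset per variant
#   records: position, carrier count, then the carrier sample indices
MAGIC = b"TOYG"
HEADER = struct.Struct("<4sQQ")
OFFSET = struct.Struct("<Q")
RECORD_HEAD = struct.Struct("<QI")


def write_toy(stream, num_samples, variants):
    """Write (position, carrier_sample_indices) pairs to a seekable stream."""
    stream.write(HEADER.pack(MAGIC, len(variants), num_samples))
    index_start = stream.tell()
    # Reserve room for the offset table; it is filled in after the records.
    stream.write(b"\0" * (OFFSET.size * len(variants)))
    offsets = []
    for position, carriers in variants:
        offsets.append(stream.tell())
        stream.write(RECORD_HEAD.pack(position, len(carriers)))
        stream.write(struct.pack(f"<{len(carriers)}I", *carriers))
    stream.seek(index_start)
    for offset in offsets:
        stream.write(OFFSET.pack(offset))


def read_variant(stream, i):
    """Random access: read only the header, one index entry, and one record."""
    stream.seek(0)
    magic, num_variants, _num_samples = HEADER.unpack(stream.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("not a toy genotype file")
    if not 0 <= i < num_variants:
        raise IndexError(i)
    stream.seek(HEADER.size + i * OFFSET.size)
    (offset,) = OFFSET.unpack(stream.read(OFFSET.size))
    stream.seek(offset)
    position, count = RECORD_HEAD.unpack(stream.read(RECORD_HEAD.size))
    carriers = list(struct.unpack(f"<{count}I", stream.read(4 * count)))
    return position, carriers


buf = io.BytesIO()
write_toy(buf, 6, [(100, [0, 3]), (250, []), (900, [1, 2, 5])])
print(read_variant(buf, 2))  # (900, [1, 2, 5])
```

In this sketch, fetching one variant costs three seeks and three small reads regardless of how many variants precede it, whereas a gzip-compressed text file generally has to be decompressed and parsed from the start (or from a block boundary) to reach the same record.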

Main Results:

  • IGD is over 100x faster than VCF.gz for biobank-scale whole-genome sequence data.
  • IGD files are 3.5x smaller than VCF.gz.
  • The Python implementation demonstrates the format's simplicity and ease of use.

Conclusions:

  • IGD offers a significant improvement in speed and size for genotype data storage.
  • The format's simplicity facilitates its adoption and integration into research workflows.
  • IGD supports highly scalable statistical and population genetics methods.