



Syllable-PBWT for space-efficient haplotype long-match query.

Victor Wang1, Ardalan Naseri1, Shaojie Zhang2

  • 1School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Bioinformatics (Oxford, England) | November 28, 2022
PubMed
Summary
This summary is machine-generated.

We introduce Syllable-PBWT, a space-efficient variant of the positional Burrows-Wheeler transform (PBWT) that divides haplotypes into syllables to shrink the data structures needed for biobank-scale genetic data. Its long-match query algorithm, Syllable-Query, uses over 100x less memory than prior solutions and runs faster, which speeds up genetic genealogical search.


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genetics

Background:

  • The positional Burrows-Wheeler transform (PBWT) has advanced haplotype matching in large genetic datasets.
  • Existing PBWT methods require substantial memory, limiting scalability for millions of haplotypes in biobanks.
  • Fast query searches for genetic genealogy are hindered by memory constraints of current algorithms.
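
For orientation, the core idea of the standard PBWT can be sketched in a few lines. This is a minimal illustration of the positional prefix array for binary haplotypes, not the Syllable-PBWT itself; the function name `pbwt_prefix_arrays` and the toy panel are assumptions made for this example.

```python
def pbwt_prefix_arrays(haplotypes):
    """Return the positional prefix arrays a_0 .. a_N for a binary panel.

    a_k lists haplotype indices sorted by their reversed prefix ending just
    before site k, so haplotypes that agree on the sites immediately before k
    sit next to each other.
    """
    m = len(haplotypes)
    n = len(haplotypes[0]) if m else 0
    order = list(range(m))          # a_0: original order
    arrays = [order[:]]
    for k in range(n):
        # Stable partition by the allele at site k: zeros first, then ones.
        zeros = [i for i in order if haplotypes[i][k] == 0]
        ones = [i for i in order if haplotypes[i][k] == 1]
        order = zeros + ones
        arrays.append(order[:])
    return arrays


# Toy panel (hypothetical data): 4 haplotypes, 3 biallelic sites.
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
final_order = pbwt_prefix_arrays(panel)[-1]
```

Each site costs one pass over all haplotypes, and the arrays hold one entry per haplotype per site, which is the kind of per-site storage that becomes expensive at biobank scale.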

Purpose of the Study:

  • To develop a space-efficient variation of the PBWT for large-scale haplotype analysis.
  • To present a novel algorithm, Syllable-Query, for efficient long match queries.
  • To overcome memory limitations of existing methods for biobank-scale genetic data.

Main Methods:

  • Proposed Syllable-PBWT, a space-efficient PBWT variant.
  • Divided haplotypes into syllables for compressed data structures.
  • Utilized polynomial rolling hash for positional substring comparison.
  • Developed the Syllable-Query algorithm for long match queries.
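
The first three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the syllable width, `BASE`, `MOD`, and all function names are hypothetical choices made for this example, and the Syllable-Query algorithm itself is not reproduced here.

```python
MOD = (1 << 61) - 1   # illustrative modulus (a Mersenne prime), an assumption
BASE = 1_000_003      # illustrative hash base, an assumption


def to_syllables(hap, width):
    """Pack consecutive `width`-site chunks of a 0/1 haplotype into integers."""
    syllables = []
    for start in range(0, len(hap), width):
        value = 0
        for allele in hap[start:start + width]:
            value = (value << 1) | allele
        syllables.append(value)
    return syllables


def prefix_hashes(syllables):
    """h[j] is the polynomial hash of syllables[:j]."""
    h = [0]
    for s in syllables:
        h.append((h[-1] * BASE + s + 1) % MOD)
    return h


def powers(n):
    """BASE**0 .. BASE**n modulo MOD."""
    p = [1]
    for _ in range(n):
        p.append(p[-1] * BASE % MOD)
    return p


def range_hash(h, p, i, j):
    """Hash of syllables[i:j], computed in O(1) from the prefix hashes."""
    return (h[j] - h[i] * p[j - i]) % MOD


def same_range(ha, hb, p, i, j):
    """True if two haplotypes (very probably) agree on syllables i..j-1."""
    return range_hash(ha, p, i, j) == range_hash(hb, p, i, j)


# Toy haplotypes (hypothetical data): 12 sites, syllable width 4.
a = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
b = [1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1]
sa, sb = to_syllables(a, 4), to_syllables(b, 4)
ha, hb = prefix_hashes(sa), prefix_hashes(sb)
p = powers(len(sa))
```

Because both haplotypes are cut at the same site boundaries, comparing the hash of the same syllable range in each is a positional comparison: equal hashes indicate a match on those sites with high probability, while unequal hashes rule it out. Working on syllables rather than single sites shortens the sequences that the data structures have to index.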

Main Results:

  • Syllable-Query reduced memory usage by over 100x compared to prior solutions.
  • Achieved faster runtimes, attributed to more efficient iteration and better CPU cache usage with the smaller data structures.
  • Demonstrated effectiveness on UK Biobank and 1000 Genomes Project data.

Conclusions:

  • Syllable-PBWT offers a significant improvement in memory efficiency for large-scale haplotype matching.
  • The Syllable-Query algorithm provides faster and more scalable genetic genealogical search.
  • This method enables analysis of modern biobank-scale genetic datasets previously limited by memory.