Syllable-PBWT for space-efficient haplotype long-match query.

Victor Wang¹, Ardalan Naseri¹, Shaojie Zhang²

¹School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

Bioinformatics (Oxford, England) | November 28, 2022

Summary

This summary is machine-generated.

We introduce Syllable-PBWT, a space-efficient variant of the positional Burrows-Wheeler transform (PBWT) that substantially reduces the memory needed for haplotype matching on biobank-scale genetic data. Its companion algorithm, Syllable-Query, answers long match queries and enables faster genetic genealogical searches.

Area of Science:

Bioinformatics
Computational Biology
Genetics

Background:

The positional Burrows-Wheeler transform (PBWT) has advanced haplotype matching in large genetic datasets.
Existing PBWT methods require substantial memory, limiting scalability for millions of haplotypes in biobanks.
Fast query searches for genetic genealogy are hindered by memory constraints of current algorithms.
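For readers unfamiliar with the PBWT, the following is a minimal sketch of the standard construction of its two core arrays: the positional prefix array (haplotype indices sorted by their reversed prefixes) and the divergence array (where each haplotype's match with its predecessor in that order begins). It follows the commonly described textbook formulation rather than any code from this paper; the function name `pbwt_arrays` and the list-of-lists input format are assumptions made for illustration. Note that both arrays hold one entry per haplotype at every site, which hints at why memory becomes a bottleneck for millions of haplotypes.

```python
def pbwt_arrays(haplotypes):
    """Return the final positional prefix array and divergence array.

    `haplotypes` is a list of equal-length lists of 0/1 alleles
    (a hypothetical input format chosen for this sketch).
    """
    m = len(haplotypes)
    n = len(haplotypes[0]) if m else 0
    ppa = list(range(m))   # haplotype indices in reversed-prefix order
    div = [0] * m          # start site of the match with the predecessor
    for k in range(n):
        zeros, ones = [], []          # stable partition by allele at site k
        div_zeros, div_ones = [], []
        p = q = k + 1                 # running match starts for each group
        for idx, start in zip(ppa, div):
            p = max(p, start)
            q = max(q, start)
            if haplotypes[idx][k] == 0:
                zeros.append(idx)
                div_zeros.append(p)
                p = 0
            else:
                ones.append(idx)
                div_ones.append(q)
                q = 0
        ppa = zeros + ones
        div = div_zeros + div_ones
    return ppa, div


panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
]
ppa, div = pbwt_arrays(panel)
```

In the resulting order, haplotypes that share long matches ending at the last site sit next to each other, which is what makes match queries efficient.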

Purpose of the Study:

To develop a space-efficient variation of the PBWT for large-scale haplotype analysis.
To present a novel algorithm, Syllable-Query, for efficient long match queries.
To overcome memory limitations of existing methods for biobank-scale genetic data.

Main Methods:

Proposed Syllable-PBWT, a space-efficient PBWT variant.
Divided haplotypes into syllables for compressed data structures.
Utilized polynomial rolling hash for positional substring comparison.
Developed the Syllable-Query algorithm for long match queries.
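The first three steps above can be illustrated with a small sketch. It packs each haplotype into fixed-width syllables, builds prefix hashes with a polynomial rolling hash, and compares the same syllable range of two haplotypes in constant time. This is a simplified illustration under stated assumptions, not the authors' implementation: the syllable width, `BASE`, `MOD`, and all function names are hypothetical choices, and equal hashes indicate equality only up to a small collision probability.

```python
MOD = (1 << 61) - 1    # large prime modulus (illustrative assumption)
BASE = 1_000_003       # hash base (illustrative assumption)


def to_syllables(hap, width):
    """Pack consecutive `width`-site blocks of a 0/1 haplotype into integers."""
    syllables = []
    for start in range(0, len(hap), width):
        value = 0
        for bit in hap[start:start + width]:
            value = (value << 1) | bit
        syllables.append(value)
    return syllables


def prefix_hashes(syllables):
    """h[i] is the polynomial hash of the first i syllables."""
    h = [0]
    for s in syllables:
        h.append((h[-1] * BASE + s + 1) % MOD)
    return h


def base_powers(n):
    """BASE**0 .. BASE**n modulo MOD."""
    pw = [1]
    for _ in range(n):
        pw.append(pw[-1] * BASE % MOD)
    return pw


def range_hash(h, pw, lo, hi):
    """Hash of syllables lo..hi-1, derived from prefix hashes in O(1)."""
    return (h[hi] - h[lo] * pw[hi - lo]) % MOD


def common_suffix_len(h1, h2, pw, end):
    """Number of matching syllables ending at `end`, by binary search on hashes."""
    lo, hi = 0, end
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if range_hash(h1, pw, end - mid, end) == range_hash(h2, pw, end - mid, end):
            lo = mid
        else:
            hi = mid - 1
    return lo


hap_a = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
hap_b = [1, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0]
syl_a = to_syllables(hap_a, 4)
syl_b = to_syllables(hap_b, 4)
h_a, h_b = prefix_hashes(syl_a), prefix_hashes(syl_b)
pw = base_powers(len(syl_a))
matched = common_suffix_len(h_a, h_b, pw, len(syl_a))
```

Because the comparison works on whole syllables rather than individual sites, the per-haplotype data is shorter by roughly the syllable width, which is the intuition behind the smaller data structures described here.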

Main Results:

Syllable-Query reduced memory usage by over 100x compared to prior solutions.
Achieved faster runtimes, attributed to efficient iteration and better CPU cache usage with the smaller data structures.
Demonstrated effectiveness on UK Biobank and 1000 Genomes Project data.

Conclusions:

Syllable-PBWT offers a significant improvement in memory efficiency for large-scale haplotype matching.
The Syllable-Query algorithm provides faster and more scalable genetic genealogical search.
This method enables analysis of modern biobank-scale genetic datasets previously limited by memory.