Fast randomized approximate string matching with succinct hash data structures.

Alberto Policriti, Nicola Prezza

BMC Bioinformatics

June 9, 2015

Summary

This summary is machine-generated.

This study introduces dB-hash, a novel data structure for next-generation sequencing (NGS) data alignment. BW-ERNE, its implementation, achieves high sensitivity and speed with reduced memory usage, addressing key challenges in genomic data analysis.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Modern next-generation sequencing (NGS) demands efficient algorithms for aligning large genomic datasets.
Existing data structures for read alignment present a trade-off between memory usage, speed, and accuracy.
Burrows-Wheeler transform indexes use little memory but offer reduced sensitivity, while hash-based indexes provide high sensitivity at the cost of significant memory consumption.

Purpose of the Study:

To develop a novel data structure that combines the advantages of both Burrows-Wheeler transform and hash-based indexes for NGS read alignment.
To achieve high sensitivity and speed in sequence alignment while maintaining a significantly reduced memory footprint.

Main Methods:

Introduced Hamming-aware hash functions, which are homomorphisms on de Bruijn graphs.
Showed that the resulting hash index can be represented in linear space, with only a logarithmic slowdown for lookups.
The data structure, named dB-hash, does not require input compression.
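
The two properties named above can be illustrated with a small sketch. It assumes a simple block-XOR hash: the pattern is cut into blocks of w symbols and the blocks are XORed position by position. This summary does not spell out the construction, so the function below, its name `block_xor_hash`, and the 2-bit DNA encoding are illustrative assumptions and not necessarily the authors' exact definition. Under this assumption, hashes of consecutive text windows overlap by w-1 symbols (the de Bruijn homomorphism property), and k mismatches in the pattern change at most k symbols of the hash (the Hamming-aware property).

```python
from typing import Sequence, Tuple

# Hypothetical 2-bit encoding of DNA symbols (an assumption for this sketch).
ENC = {"A": 0, "C": 1, "G": 2, "T": 3}


def block_xor_hash(pattern: str, w: int) -> Tuple[int, ...]:
    """XOR the length-w blocks of `pattern` symbol by symbol.

    Illustrative only: an assumed construction, not taken from the summary.
    """
    if w <= 0 or len(pattern) % w != 0:
        raise ValueError("pattern length must be a positive multiple of w")
    digest = [0] * w
    for i, ch in enumerate(pattern):
        # Symbol i falls into digest position i mod w.
        digest[i % w] ^= ENC[ch]
    return tuple(digest)


def hamming(a: Sequence, b: Sequence) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    text = "ACGTTGCAACGTAGCTAGGATCCA"
    m, w = 8, 4

    # de Bruijn homomorphism: shifting the window by one symbol
    # shifts the hash by one symbol as well.
    for i in range(len(text) - m):
        h_cur = block_xor_hash(text[i:i + m], w)
        h_next = block_xor_hash(text[i + 1:i + 1 + m], w)
        assert h_cur[1:] == h_next[:-1]

    # Hamming awareness: k mismatches move the hash by at most k symbols.
    p = "ACGTTGCAACGTAGCT"
    q = "ACGTTGCTACGTAGCA"  # two substitutions relative to p
    assert hamming(block_xor_hash(p, w), block_xor_hash(q, w)) <= hamming(p, q)
```

Because consecutive hash values overlap in this way, the sequence of hashes over a text can itself be treated as a text and indexed compactly, which is one way to read the linear-space claim above.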

Main Results:

The implementation BW-ERNE maintains the high sensitivity and speed of its predecessor ERNE.
BW-ERNE drastically reduces space consumption compared to previous hash-based methods.
Extensive comparisons on simulated and real NGS data confirm BW-ERNE's ability to achieve both small space and high sensitivity.

Conclusions:

Combining hashing and succinct indexing techniques offers a solution for NGS data alignment where space and speed are critical.
BW-ERNE provides competitive performance and accuracy with a memory footprint comparable to popular compressed indexes.
This approach overcomes the typical trade-off between throughput, memory, and accuracy in genomic data analysis.