Fast and compact matching statistics analytics.

Fabio Cunial¹, Olgert Denas², Djamal Belazzougui³

¹Max Planck Institute for Molecular Cell Biology and Genetics (MPI-CBG and CSBD), Dresden 01307, Germany.

Bioinformatics (Oxford, England)

February 8, 2022

Summary

This summary is machine-generated.

New tools enable faster, memory-efficient genome sequence comparison using matching statistics. This facilitates whole-genome phylogenies and structural rearrangement detection for large-scale genomic data analysis.

Area of Science:

Computational Biology
Bioinformatics
Genomics

Background:

The increasing size of assembled genomes and the rise of pan-genome initiatives necessitate faster and more memory-efficient methods for sequence comparison.
Matching statistics are used to build whole-genome phylogenies and to detect structural rearrangements, largely because they lend themselves to fast implementations.
Existing matching statistics implementations are limited by single-core processing, high memory usage, and inefficient output analysis for local sequence similarities.
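
For readers new to the term: the matching statistics of a query string S against a text T is an array MS in which MS[i] is the length of the longest prefix of S[i..] that occurs somewhere in T. The sketch below is a direct, slow illustration of that definition. The function name and the use of Python's substring search are assumptions made for illustration; this is not the index-based implementation the paper describes.

```python
def matching_statistics(query: str, text: str) -> list[int]:
    """Return MS, where MS[i] is the length of the longest prefix of
    query[i:] that occurs as a substring of text (naive illustration)."""
    ms = []
    length = 0
    for i in range(len(query)):
        # If query[i-1 : i-1+L] occurs in text, so does query[i : i-1+L],
        # hence MS[i] >= MS[i-1] - 1 and we can resume from there.
        length = max(length - 1, 0)
        # Extend the match one character at a time while it still occurs.
        while i + length < len(query) and query[i:i + length + 1] in text:
            length += 1
        ms.append(length)
    return ms


if __name__ == "__main__":
    print(matching_statistics("banana", "ananas"))  # [0, 5, 4, 3, 2, 1]
```

Long runs of large values in MS mark regions of the query that are shared with the text, which is why the array is useful for locating local similarities.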

Purpose of the Study:

To develop practical tools for computing and analyzing matching statistics between large-scale strings with improved speed and reduced memory footprint.
To enable efficient exploration of local sequence similarities within large genomic datasets.

Main Methods:

Designed a parallel algorithm for shared-memory machines to accelerate matching statistics computation.
Developed a lossy compression scheme to reduce the memory requirements of the matching statistics array.
Implemented efficient range-maximum and range-sum queries for analyzing compact matching statistics representations.
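
To make the two query types concrete, here is a minimal sketch of range-sum and range-maximum queries over a plain, uncompressed matching statistics array. It swaps in two textbook structures, prefix sums and a sparse table, and the class and method names are hypothetical. The paper answers these queries on its compact representation, which this sketch does not attempt to reproduce.

```python
from itertools import accumulate


class RangeQueries:
    """Range-sum and range-max over a fixed integer array (illustrative)."""

    def __init__(self, ms: list[int]) -> None:
        # prefix[j] holds the sum of ms[0:j], so any range sum is a difference.
        self.prefix = [0] + list(accumulate(ms))
        # table[j][i] holds max(ms[i : i + 2**j]) (sparse table).
        self.table = [list(ms)]
        k = 1
        while 2 * k <= len(ms):
            prev = self.table[-1]
            self.table.append(
                [max(prev[i], prev[i + k]) for i in range(len(ms) - 2 * k + 1)]
            )
            k *= 2

    def range_sum(self, lo: int, hi: int) -> int:
        """Sum of ms[lo:hi] (half-open interval)."""
        return self.prefix[hi] - self.prefix[lo]

    def range_max(self, lo: int, hi: int) -> int:
        """Maximum of ms[lo:hi]; requires lo < hi."""
        level = (hi - lo).bit_length() - 1  # largest power of two <= hi - lo
        row = self.table[level]
        # Two overlapping power-of-two windows cover the whole interval.
        return max(row[lo], row[hi - (1 << level)])


if __name__ == "__main__":
    rq = RangeQueries([0, 5, 4, 3, 2, 1])
    print(rq.range_sum(1, 4))  # 12
    print(rq.range_max(2, 6))  # 4
```

A range maximum reports the longest match starting inside a window of the query, while a range sum, divided by the window length, gives the average match length there.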

Main Results:

The parallel algorithm achieved a 30-fold speedup using 48 cores on challenging datasets.
The compression scheme reduced the matching statistics array size to 0.2–0.8 bits per character, with variants reaching 0.04 bits per character.
Range queries on compact representations were performed in tens of milliseconds, enabling detailed local similarity analysis.
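
As a rough worked example of what these figures imply, the sketch below converts bits per character into a memory footprint and the reported speedup into parallel efficiency. The 3-billion-character query length and the 32-bit-per-entry baseline are assumptions chosen for illustration; neither comes from the paper.

```python
def footprint_bytes(n_chars: int, bits_per_char: float) -> float:
    """Size in bytes of an array that stores bits_per_char bits per character."""
    return n_chars * bits_per_char / 8


def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


if __name__ == "__main__":
    n = 3_000_000_000  # assumed query length, roughly one human-sized genome
    for label, bits in [("32-bit baseline (assumed)", 32.0),
                        ("0.8 bits/char", 0.8),
                        ("0.2 bits/char", 0.2),
                        ("0.04 bits/char", 0.04)]:
        print(f"{label}: {footprint_bytes(n, bits) / 1e6:,.0f} MB")
    print(f"efficiency: {parallel_efficiency(30, 48):.1%}")
```

Under these assumptions the array shrinks from about 12 GB to between roughly 75 MB and 300 MB, or about 15 MB for the most aggressive variant, and a 30-fold speedup on 48 cores corresponds to about 62.5% parallel efficiency.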

Conclusions:

The developed toolkit makes the construction, storage, and analysis of matching statistics practical for large-scale genome comparisons.
These advancements may unlock new applications in comparative genomics by facilitating the analysis of multiple large genome pairs.
The tools offer significant improvements over state-of-the-art methods in terms of speed and memory efficiency.