


Suffix sorting via matching statistics.

Zsuzsanna Lipták¹, Francesco Masillo¹, Simon J Puglisi²,³

¹Department of Computer Science, University of Verona, Verona, Italy.

Algorithms for Molecular Biology : AMB

March 13, 2024

Summary

This summary is machine-generated.

We developed a novel algorithm for constructing the generalized suffix array of a collection of highly similar strings. On such collections, its construction times are competitive with or better than those of existing methods.

Keywords:

Compressed representation, Data structures, Efficient algorithms, Generalized suffix array, Matching statistics, String collections


Area of Science:

Bioinformatics
Computational Biology
String Algorithms

Background:

Generalized suffix arrays are crucial for analyzing large biological sequence data.
Existing methods struggle with collections of highly similar strings, leading to inefficiencies.
Efficient construction of generalized suffix arrays is vital for bioinformatics research.
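
For readers unfamiliar with the data structure, the following is a minimal Python sketch of what a generalized suffix array contains. It is a naive construction for illustration only, not the algorithm from the study. The function name `generalized_suffix_array` is hypothetical, and the sketch assumes one common convention: each string ends with an implicit terminator smaller than every character, and equal suffixes are ordered by the index of the string they come from.

```python
def generalized_suffix_array(strings: list[str]) -> list[tuple[int, int]]:
    """Return all (string index, offset) pairs sorted by the suffix they denote.

    Naive illustration: materializes every suffix, so it is only suitable
    for tiny inputs. Python compares a proper prefix as smaller, which
    mimics an end-of-string terminator smaller than all characters.
    Ties between equal suffixes are broken by string index (an assumption).
    """
    entries = [
        (s[offset:], doc, offset)
        for doc, s in enumerate(strings)
        for offset in range(len(s))
    ]
    entries.sort()  # sorts by suffix text, then string index, then offset
    return [(doc, offset) for _, doc, offset in entries]


if __name__ == "__main__":
    print(generalized_suffix_array(["ab", "b"]))
```

Each entry identifies one suffix of one string in the collection. When the strings are highly similar, many suffixes share long prefixes, which is what makes comparison-based sorting slow.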

Purpose of the Study:

To introduce a new, efficient algorithm for constructing generalized suffix arrays.
To address the challenge of processing collections of highly similar strings.
To improve the speed and performance of suffix array construction for specific datasets.

Main Methods:

Constructing a compressed representation of matching statistics against a reference string.
Utilizing this data structure to create a partial order of suffixes.
Employing the partial order to accelerate suffix comparisons for final generalized suffix array construction.
Developing a heuristic for rapid computation of matching statistics between two strings.
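
As a rough illustration of the first step, the sketch below computes matching statistics of a string against a reference in Python. It is a quadratic brute-force version written for clarity; it is neither the compressed representation nor the heuristic described in the study. The function name `matching_statistics` and the `(position, length)` output format are assumptions made for this example.

```python
def matching_statistics(s: str, ref: str) -> list[tuple[int, int]]:
    """For each position i of s, return (pos, length) such that
    ref[pos:pos + length] is the longest prefix of s[i:] occurring in ref.

    Returns (-1, 0) for positions whose first character is absent from ref.
    Brute force: tries every start position in ref for every i.
    """
    result = []
    for i in range(len(s)):
        best_pos, best_len = -1, 0
        for p in range(len(ref)):
            length = 0
            # Extend the match while both strings agree.
            while (
                i + length < len(s)
                and p + length < len(ref)
                and s[i + length] == ref[p + length]
            ):
                length += 1
            if length > best_len:  # keep the leftmost longest match
                best_pos, best_len = p, length
        result.append((best_pos, best_len))
    return result


if __name__ == "__main__":
    print(matching_statistics("abc", "xabx"))
```

Intuitively, when two suffixes of the collection both match long stretches of the reference, the known order of the reference's suffixes can settle much of the comparison without rescanning characters. The sketch covers only the matching step, not the partial order built on top of it.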

Main Results:

On highly similar string collections, the proposed algorithm achieves construction times competitive with or superior to those of existing methods.
Experimental results with the sacamats tool validate the algorithm's efficiency.
The heuristic for matching statistics computation shows potential for independent application.

Conclusions:

The new algorithm offers a significant advancement in generalized suffix array construction for similar string collections.
This method provides a faster and more efficient alternative for specific bioinformatics applications.
The sacamats tool serves as a practical implementation of the proposed algorithm.