Efficient Maximal Repeat Finding Using the Burrows-Wheeler Transform and Wavelet Tree

M Oğuzhan Külekci¹, Jeffrey Scott Vitter, Bojian Xu

¹National Research Institute of Electronics and Cryptology, Gebze.

IEEE/ACM Transactions on Computational Biology and Bioinformatics

October 5, 2011

Summary

This summary is machine-generated.

This study introduces a novel, highly efficient method for identifying maximal repeats in large biological sequences. The new technique significantly reduces computational space and time, making genome analysis more accessible.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Identifying repetitive structures in genomes and proteins is crucial for understanding biological functions.
Maximal repeats are essential for data compression and sequence analysis.
Previous methods for finding maximal repeats are computationally expensive and require significant memory.
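
To make the term concrete, here is a minimal brute-force sketch. It assumes the standard definition of a maximal repeat, which this summary does not spell out: a substring that occurs at least twice and cannot be extended to the left or to the right without losing an occurrence. The function name `naive_maximal_repeats` is hypothetical, and this is not the paper's algorithm. It enumerates every substring, so it only suits very short inputs.

```python
from collections import defaultdict


def naive_maximal_repeats(text):
    """Brute-force maximal repeats (illustration only, not the paper's method).

    A substring qualifies if it occurs at least twice, its occurrences are
    preceded by at least two distinct characters, and they are followed by at
    least two distinct characters. Text boundaries count as unique characters.
    """
    n = len(text)

    # Record every start position of every substring (quadratic in n).
    occurrences = defaultdict(list)
    for i in range(n):
        for j in range(i + 1, n + 1):
            occurrences[text[i:j]].append(i)

    result = set()
    for sub, starts in occurrences.items():
        if len(starts) < 2:
            continue  # not a repeat
        # Character to the left of each occurrence; a boundary is encoded as
        # a tuple so that it never equals a real character.
        left = {text[s - 1] if s > 0 else ("^", s) for s in starts}
        # Character to the right of each occurrence, handled the same way.
        right = {
            text[s + len(sub)] if s + len(sub) < n else ("$", s)
            for s in starts
        }
        if len(left) > 1 and len(right) > 1:
            result.add(sub)
    return result


if __name__ == "__main__":
    print(sorted(naive_maximal_repeats("banana")))
```

In "banana", the repeat "an" is not maximal because both of its occurrences are followed by "a", so it can always be extended to "ana". The quadratic number of substrings in this sketch is the kind of cost that index-based methods are designed to avoid.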

Purpose of the Study:

To develop a novel, space-efficient, and faster algorithm for finding maximal repeats in large datasets.
To overcome the limitations of existing suffix tree and suffix array-based methods.

Main Methods:

Uses the Burrows-Wheeler Transform (BWT) and wavelet trees in place of suffix trees or suffix arrays.
Achieves significant reductions in space complexity compared to prior approaches.
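
The two building blocks named above can be sketched as follows. This is a teaching sketch under stated assumptions, not the authors' implementation. The names `bwt` and `WaveletTree` are hypothetical, the BWT is built by sorting all rotations (far too slow for genomes), and the wavelet tree stores plain Python lists where a practical implementation would use succinct bit vectors. It shows only the core operation a wavelet tree offers over the BWT: `rank(c, i)`, the number of occurrences of character `c` in the first `i` positions.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler Transform via sorted rotations (illustration only)."""
    assert sentinel not in text, "sentinel must not occur in the text"
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The BWT is the last column of the sorted rotation matrix.
    return "".join(rot[-1] for rot in rotations)


class WaveletTree:
    """Pointer-based wavelet tree over a sequence, supporting rank queries."""

    def __init__(self, seq, alphabet=None):
        self.alphabet = sorted(set(seq)) if alphabet is None else alphabet
        self.left = self.right = None
        self.prefix_ones = None  # stays None at leaves
        if len(self.alphabet) <= 1:
            return
        mid = len(self.alphabet) // 2
        self.left_syms = set(self.alphabet[:mid])
        # Bit 0: symbol belongs to the left half of the alphabet; bit 1: right.
        # prefix_ones[i] holds the number of 1-bits among the first i symbols.
        self.prefix_ones = [0]
        for ch in seq:
            bit = 0 if ch in self.left_syms else 1
            self.prefix_ones.append(self.prefix_ones[-1] + bit)
        self.left = WaveletTree(
            [c for c in seq if c in self.left_syms], self.alphabet[:mid]
        )
        self.right = WaveletTree(
            [c for c in seq if c not in self.left_syms], self.alphabet[mid:]
        )

    def rank(self, ch, i):
        """Return the number of occurrences of ch in the first i symbols."""
        if ch not in self.alphabet:
            return 0
        node = self
        while node.prefix_ones is not None:
            ones = node.prefix_ones[i]
            if ch in node.left_syms:
                i -= ones  # count of 0-bits before position i
                node = node.left
            else:
                i = ones  # count of 1-bits before position i
                node = node.right
        return i


if __name__ == "__main__":
    transformed = bwt("banana")
    tree = WaveletTree(transformed)
    print(transformed, tree.rank("a", len(transformed)))
```

Each rank query walks one root-to-leaf path, so its cost grows with the logarithm of the alphabet size rather than with the text length. That property is what makes the structure attractive for small alphabets such as DNA.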

Main Results:

Space usage is reduced to no more than three times the text size for natural language and less than double for genomic sequences.
The method is orders of magnitude faster than previous techniques, enabling analysis on standard hardware.
Successfully identified all maximal repeats in the human genome on a desktop computer in under 17 hours.

Conclusions:

The new method offers a practical and efficient solution for maximal repeat finding in massive datasets.
The open-source implementation facilitates broader application in bioinformatics and genomics research.
Enables advanced genomic analysis on readily available computing resources.