Efficient maximal repeat finding using the Burrows-Wheeler transform and wavelet tree.

M Oğuzhan Külekci¹, Jeffrey Scott Vitter, Bojian Xu

  • ¹ National Research Institute of Electronics and Cryptology, Gebze.

IEEE/ACM Transactions on Computational Biology and Bioinformatics | October 5, 2011
Summary
This summary is machine-generated.

This study introduces a novel, highly efficient method for identifying maximal repeats in large biological sequences. The new technique significantly reduces computational space and time, making genome analysis more accessible.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Identifying repetitive structures in genomes and proteins is crucial for understanding biological functions.
  • Maximal repeats are essential for data compression and sequence analysis.
  • Previous methods for finding maximal repeats are computationally expensive and require significant memory.
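To make the term concrete, here is a minimal brute-force sketch of what a maximal repeat is. It assumes a common definition: a substring that occurs at least twice and whose every one-character extension, to the left or to the right, occurs strictly fewer times. The paper's exact definition may differ in detail, and the function names `count_occurrences` and `naive_maximal_repeats` are hypothetical. This is not the method of the study; its cost grows far too quickly for genome-scale input, which is the kind of expense the efficient algorithms aim to avoid.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def naive_maximal_repeats(text):
    """Return the set of maximal repeats of text by brute force.

    Assumed definition: a substring occurring at least twice such that
    adding any single character on the left or on the right yields a
    string that occurs strictly fewer times.
    """
    alphabet = set(text)
    result = set()
    n = len(text)
    for i in range(n):
        for j in range(i + 1, n + 1):
            sub = text[i:j]
            occ = count_occurrences(text, sub)
            if occ < 2:
                # Any longer substring starting at i is also unique.
                break
            left_maximal = all(
                count_occurrences(text, c + sub) < occ for c in alphabet
            )
            right_maximal = all(
                count_occurrences(text, sub + c) < occ for c in alphabet
            )
            if left_maximal and right_maximal:
                result.add(sub)
    return result


if __name__ == "__main__":
    print(sorted(naive_maximal_repeats("banana")))
```

For the text "banana", the sketch reports "a" and "ana": "an" and "na" also repeat, but each can be extended to "ana" without losing an occurrence, so neither is maximal.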

Purpose of the Study:

  • To develop a novel, space-efficient, and faster algorithm for finding maximal repeats in large datasets.
  • To overcome the limitations of existing suffix tree and suffix array-based methods.

Main Methods:

  • Utilizes the Burrows-Wheeler Transform (BWT) and wavelet trees.
  • Achieves significant reductions in space complexity compared to prior approaches.
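The two building blocks named above can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: `bwt` builds the transform by sorting all rotations (simple but slow), and `WaveletTree` stores plain Python lists of prefix counts where a space-efficient implementation would use compressed bit vectors. The names `bwt`, `WaveletTree`, and `rank`, and the `$` sentinel, are choices made for this example.

```python
def bwt(text, sentinel="$"):
    """Burrows-Wheeler Transform via sorted rotations (illustration only).

    Assumes the sentinel does not occur in text and sorts before every
    other character, as "$" does for ASCII letters.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


class WaveletTree:
    """Pointer-based wavelet tree supporting rank queries.

    Each internal node splits its alphabet in half and records, for every
    prefix of its sequence, how many symbols were routed to the right child.
    """

    def __init__(self, seq, alphabet=None):
        self.alphabet = sorted(set(seq)) if alphabet is None else alphabet
        self.left = self.right = None
        if len(self.alphabet) <= 1:
            return  # leaf: every position holds the same symbol
        mid = len(self.alphabet) // 2
        self.left_syms = set(self.alphabet[:mid])
        # prefix_ones[i] = number of right-routed symbols among seq[:i]
        self.prefix_ones = [0]
        for ch in seq:
            self.prefix_ones.append(
                self.prefix_ones[-1] + (ch not in self.left_syms)
            )
        self.left = WaveletTree(
            [c for c in seq if c in self.left_syms], self.alphabet[:mid]
        )
        self.right = WaveletTree(
            [c for c in seq if c not in self.left_syms], self.alphabet[mid:]
        )

    def rank(self, symbol, i):
        """Number of occurrences of symbol among the first i positions."""
        if symbol not in self.alphabet:
            return 0
        if self.left is None:
            return i  # leaf: all i positions equal symbol
        ones = self.prefix_ones[i]
        if symbol in self.left_syms:
            return self.left.rank(symbol, i - ones)
        return self.right.rank(symbol, ones)


if __name__ == "__main__":
    transformed = bwt("banana")
    tree = WaveletTree(transformed)
    print(transformed, tree.rank("a", len(transformed)))
```

Rank queries of this kind are the basic operation for navigating a BWT without storing the original text. Each query walks one root-to-leaf path, so its cost depends on the alphabet size rather than on the text length.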

Main Results:

  • Space usage is reduced to no more than three times the text size for natural language and less than double for genomic sequences.
  • The method is orders of magnitude faster than previous techniques, enabling analysis on standard hardware.
  • Successfully identified all maximal repeats in the human genome on a desktop computer in under 17 hours.

Conclusions:

  • The new method offers a practical and efficient solution for maximal repeat finding in massive datasets.
  • The open-source implementation facilitates broader application in bioinformatics and genomics research.
  • Enables advanced genomic analysis on readily available computing resources.