Efficient tools for comparative substring analysis.

Alberto Apostolico¹, Olgert Denas, Andreas Dress

¹Accademia Nazionale dei Lincei and DEI, Università di Padova, Italy. axa@cc.gatech.edu

Journal of Biotechnology

August 5, 2010

Summary

This summary is machine-generated.

This study presents an efficient method for genome analysis using substring composition, offering a faster alternative to traditional sequence alignment. This approach enables rapid calculation of genome-wide distances and phylogenetic relationships.

Area of Science:

Bioinformatics
Computational Biology
Genomics

Background:

Comparative genome analysis traditionally relies on sequence alignment, which can be computationally intensive.
Substring composition methods offer an alternative for analyzing genomic distances and constructing phylogenies.
Previous work demonstrated success using 5- and 6-mers for prokaryotic phylogeny.
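
To make the notion of substring composition concrete, the following minimal sketch counts the overlapping k-mers of a sequence and normalizes the counts to relative frequencies. The function name `kmer_frequencies` and the toy sequence are illustrative assumptions and do not come from the study.

```python
from collections import Counter


def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Return the relative frequency of every overlapping k-mer in seq."""
    if k <= 0:
        raise ValueError("k must be positive")
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        # The sequence is shorter than k, so it contains no k-mers.
        return {}
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    return {kmer: count / n_windows for kmer, count in counts.items()}


# Toy example (hypothetical data): 2-mer composition of a short sequence.
freqs = kmer_frequencies("ACGTACGT", 2)
```

For a fixed k such as 5 or 6, a vector of this kind is the sort of composition profile that two genomes could be compared on.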

Purpose of the Study:

To introduce an efficient implementation for alignment-free comparative genome analysis using substring composition.
To extend the computation of composition-based distances to include all k-mers up to a maximum length K.
To demonstrate significant improvements in computational speed and resource efficiency.

Main Methods:

The implementation computes composition-based distances using all k-mers for any k up to a specified maximum length K.
Uses an algorithm with O(L) time and space complexity, independent of K.
Applies substring statistics for genome-wide distance calculations.
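
The idea of a distance built from all k-mers up to a maximum length K can be illustrated with a naive sketch. It assumes a Euclidean distance between k-mer frequency vectors, summed over k = 1..K; the summary does not specify the distance measure, so this choice is an assumption, as are all function names. This direct approach recounts the k-mers for every k and so costs roughly O(L·K) time. It is not the O(L) algorithm reported in the study, only a baseline that shows what is being computed.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(seq, k):
    """Relative frequencies of all overlapping k-mers in seq."""
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(n_windows))
    return {word: c / n_windows for word, c in counts.items()}


def composition_distance(a, b, max_k):
    """Sum over k = 1..max_k of the Euclidean distance between
    the k-mer frequency vectors of a and b (hypothetical measure)."""
    total = 0.0
    for k in range(1, max_k + 1):
        fa, fb = kmer_freqs(a, k), kmer_freqs(b, k)
        # A k-mer missing from one sequence has frequency 0 there.
        words = fa.keys() | fb.keys()
        total += sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))
    return total


# Toy sequences (hypothetical data).
d_same = composition_distance("ACGTACGT", "ACGTACGT", 3)
d_diff = composition_distance("ACGTACGT", "AAAAAAAA", 3)
```

Identical sequences give a distance of zero, and the value grows as the k-mer profiles diverge. Pairwise values of this kind could fill a distance matrix for phylogeny construction.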

Main Results:

Composition-based distances and other comparative genomics metrics can be computed in linear O(L) time and space.
The running time stays linear in L and does not depend on the maximum k-mer length K.
A practical test case showed a 1.5 million character comparison completed in approximately 2 seconds, drastically outperforming alignment-based methods.

Conclusions:

The developed implementation provides a highly efficient and scalable solution for alignment-free comparative genome analysis.
This approach significantly accelerates the computation of genome-based phylogenies and distances.
The method is broadly applicable for analyzing large genomic datasets and constructing evolutionary relationships.