Statistical distance between texts and filtration methods in sequence comparison.

P. A. Pevzner¹

  • ¹Department of Mathematics, University of Southern California, Los Angeles, CA 90089-1113.

Computer Applications in the Biosciences: CABIOS | April 1, 1992
Summary
This summary is machine-generated.

Rapid similarity search is essential for long sequences. This study introduces a filtration method based on the statistical distance between texts and shows that, for DNA database searches, its efficiency grows exponentially with the tuple length (l).

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Sequence Analysis

Background:

  • Dynamic programming algorithms for sequence similarity have quadratic complexity, limiting their use on long sequences.
  • Efficient filtration methods are necessary to quickly identify and discard sequences with low similarity levels.
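
To make the quadratic cost concrete, here is a minimal Python sketch of a textbook dynamic-programming comparison. It uses Levenshtein edit distance as a stand-in; the summary does not say which similarity measure the paper's dynamic programming algorithms compute, so the choice of measure and the function name `edit_distance` are assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence.

    Illustrative stand-in only: the nested loops visit every (i, j) pair,
    so the running time is proportional to len(a) * len(b).
    """
    # Row 0: turning the empty prefix of a into b[:j] takes j insertions.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        # Column 0: turning a[:i] into the empty string takes i deletions.
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

Because every cell of the len(a) by len(b) table is filled, doubling both sequence lengths roughly quadruples the work, which is why a cheap pre-filter is attractive before running a comparison of this kind.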

Purpose of the Study:

  • To provide theoretical foundations for a filtration method based on statistical text distance.
  • To introduce and evaluate the concept of filtration efficiency for sequence analysis.

Main Methods:

  • Development of a filtration method utilizing statistical distance between sequences.
  • Theoretical estimation and comparison of the efficiency of various filtration techniques.
  • Analysis of statistical l-tuple filtration for DNA database searching.
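
The idea of l-tuple filtration can be sketched in Python as follows. The summary does not give the paper's exact definition of statistical distance, so this sketch assumes one common choice: the sum, over all l-tuples, of the absolute difference between their occurrence counts in the two sequences. The function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter


def ltuple_counts(seq: str, l: int) -> Counter:
    """Count every overlapping l-tuple (substring of length l) in seq."""
    return Counter(seq[i:i + l] for i in range(len(seq) - l + 1))


def ltuple_distance(a: str, b: str, l: int) -> int:
    """Assumed distance: sum of absolute differences of l-tuple counts."""
    ca, cb = ltuple_counts(a, l), ltuple_counts(b, l)
    # A Counter returns 0 for tuples it has not seen, so the union of keys
    # covers every l-tuple that occurs in either sequence.
    return sum(abs(ca[w] - cb[w]) for w in ca.keys() | cb.keys())


def passes_filter(query: str, candidate: str, l: int, threshold: int) -> bool:
    """Keep a candidate only if its l-tuple distance to the query is small.

    `threshold` is a hypothetical tuning parameter, not a value from the paper.
    """
    return ltuple_distance(query, candidate, l) <= threshold
```

For a fixed l, the counts are built in a single pass over each sequence, so a filter of this kind costs far less than a quadratic comparison; only the candidates that pass would then go on to the expensive step.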

Main Results:

  • The efficiency of statistical l-tuple filtration is linked to extending the standard four-letter DNA alphabet.
  • Filtration efficiency demonstrates exponential growth with increasing tuple length (l).
  • A formula for estimating filtration parameters has been derived.
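
One way to picture the alphabet extension is to treat each l-tuple as a single letter of a larger alphabet. The sketch below does not reproduce the paper's efficiency formula, which the summary does not state; it only shows that the number of possible l-tuples over the four DNA letters is 4^l, so the extended alphabet grows exponentially with l. The names `extended_alphabet` and `encode` are hypothetical.

```python
from itertools import product

DNA = "ACGT"


def extended_alphabet(l: int) -> list:
    """All 4**l possible l-tuples over the DNA alphabet, in lexicographic order."""
    return ["".join(p) for p in product(DNA, repeat=l)]


def encode(seq: str, l: int) -> list:
    """Rewrite seq as the indices of its overlapping l-tuples in the extended alphabet."""
    index = {w: i for i, w in enumerate(extended_alphabet(l))}
    return [index[seq[i:i + l]] for i in range(len(seq) - l + 1)]


# Size of the extended alphabet for l = 1..6.
alphabet_sizes = [len(extended_alphabet(l)) for l in range(1, 7)]
```

Under the simplifying assumption of uniformly random, independent letters, two l-tuples chosen at random coincide with probability 4^(-l), which gives some intuition for why longer tuples discriminate more sharply between related and unrelated sequences.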

Conclusions:

  • Statistical distance-based filtration offers an efficient approach for rapid similarity searching in large sequence datasets.
  • The method's efficiency for DNA database searching grows exponentially with tuple length (l).
  • The derived formula aids in optimizing filtration strategies for sequence analysis.