Accelerating protein classification using suffix trees.

B Dorohonceanu (1), C G Nevill-Manning

  • (1) Computer Science Department, Rutgers University, Piscataway, NJ 07310, USA. dbogdan@cs.rutgers.edu

Proceedings. International Conference on Intelligent Systems for Molecular Biology · September 8, 2000
Summary
This summary is machine-generated.


Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Fast probabilistic analysis of sequence function using scoring matrices.

Bioinformatics (Oxford, England)·2000
Same author

Minimal-risk scoring matrices for sequence analysis.

Journal of computational biology : a journal of computational molecular cell biology·1999
Same author

Highly specific protein sequence motifs for genome analysis.

Proceedings of the National Academy of Sciences of the United States of America·1998
Same author

Enumerating and ranking discrete motifs.

Proceedings. International Conference on Intelligent Systems for Molecular Biology·1997
Same journal

Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB 2000). San Diego, California, USA. August 19-23, 2000.

Proceedings. International Conference on Intelligent Systems for Molecular Biology·2001
Same journal

Analysis of gene expression data with pathway scores.

Proceedings. International Conference on Intelligent Systems for Molecular Biology·2000
Same journal

Towards a complete map of the protein space based on a unified sequence and structure analysis of all known proteins.

Proceedings. International Conference on Intelligent Systems for Molecular Biology·2000
Same journal

Mining for putative regulatory elements in the yeast genome using gene expression data.

Proceedings. International Conference on Intelligent Systems for Molecular Biology·2000
Same journal

A multiple alignment algorithm for metabolic pathway analysis using enzyme hierarchy.

Proceedings. International Conference on Intelligent Systems for Molecular Biology·2000
Same journal

Sequence database search using jumping alignments.

Proceedings. International Conference on Intelligent Systems for Molecular Biology·2000

This study presents a suffix tree method for accelerating position-specific scoring matrix (PSSM) searches for conserved protein regions. The method excludes large numbers of protein segments from consideration at once, giving speedups of up to tenfold.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics and Proteomics

Background:

  • Position-specific scoring matrices (PSSMs) are crucial for identifying conserved protein regions.
  • Current methods for PSSM searches can be computationally intensive.
  • The need for faster algorithms to analyze large protein sequence datasets is growing.
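
To make the baseline cost concrete, here is a minimal sketch of a naive sliding-window PSSM scan, which scores every window of the sequence in full. The toy matrix, the three-letter alphabet, the `MISSING` penalty, and the threshold are all hypothetical values chosen for illustration; they do not come from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical toy PSSM: one dict of residue scores per motif position.
PSSM: List[Dict[str, float]] = [
    {"A": 2.0, "C": -1.0, "G": -1.0},
    {"A": -1.0, "C": 3.0, "G": 0.0},
    {"A": 0.0, "C": -2.0, "G": 1.0},
]
MISSING = -4.0  # assumed penalty for residues absent from a column


def naive_pssm_scan(
    seq: str, pssm: List[Dict[str, float]], threshold: float
) -> List[Tuple[int, float]]:
    """Score every window of len(pssm) residues; return (start, score) hits."""
    width = len(pssm)
    hits = []
    for start in range(len(seq) - width + 1):
        # Each window costs `width` lookups, regardless of how poorly it scores.
        score = sum(
            column.get(seq[start + i], MISSING) for i, column in enumerate(pssm)
        )
        if score >= threshold:
            hits.append((start, score))
    return hits


hits = naive_pssm_scan("GACGA", PSSM, threshold=5.0)
```

The work grows with sequence length times matrix width, and identical windows that occur in several places are rescored each time.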

Purpose of the Study:

  • To develop and present an accelerated method for searching protein sequences using PSSMs.
  • To leverage suffix tree data structures for enhanced search efficiency.
  • To reduce the computational resources required for identifying conserved protein regions.

Main Methods:

  • Implemented a search acceleration technique utilizing a suffix tree data structure.
  • Integrated early termination strategies for scoring matrix evaluation within the suffix tree framework.

  • Optimized suffix tree node storage to minimize memory usage (17 bytes per input symbol).
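
The pruning idea in the methods above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses an uncompressed trie truncated at the matrix width instead of a compact suffix tree, and the toy matrix, `MISSING` penalty, and all function names are hypothetical. A subtree is skipped as soon as the score so far plus the best possible score of the remaining columns cannot reach the threshold.

```python
from typing import Dict, List, Tuple

# Hypothetical toy PSSM: one dict of residue scores per motif position.
PSSM: List[Dict[str, float]] = [
    {"A": 2.0, "C": -1.0, "G": -1.0},
    {"A": -1.0, "C": 3.0, "G": 0.0},
    {"A": 0.0, "C": -2.0, "G": 1.0},
]
MISSING = -4.0  # assumed penalty for residues absent from a column


class Node:
    __slots__ = ("children", "starts")

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.starts: List[int] = []  # window start positions ending at this node


def build_trie(seq: str, depth: int) -> Node:
    """Insert every window of `depth` residues; shared prefixes share nodes."""
    root = Node()
    for start in range(len(seq) - depth + 1):
        node = root
        for ch in seq[start:start + depth]:
            node = node.children.setdefault(ch, Node())
        node.starts.append(start)
    return root


def best_remaining(pssm: List[Dict[str, float]]) -> List[float]:
    """bounds[d] = highest score obtainable from columns d..end."""
    bounds = [0.0] * (len(pssm) + 1)
    for d in range(len(pssm) - 1, -1, -1):
        bounds[d] = bounds[d + 1] + max(max(pssm[d].values()), MISSING)
    return bounds


def trie_pssm_search(
    root: Node, pssm: List[Dict[str, float]], threshold: float
) -> List[Tuple[int, float]]:
    bounds = best_remaining(pssm)
    hits: List[Tuple[int, float]] = []
    stack = [(root, 0, 0.0)]
    while stack:
        node, depth, score = stack.pop()
        if depth == len(pssm):
            hits.extend((s, score) for s in node.starts)
            continue
        for ch, child in node.children.items():
            new_score = score + pssm[depth].get(ch, MISSING)
            # Prune: no completion of this prefix can reach the threshold,
            # so every window under `child` is excluded in one step.
            if new_score + bounds[depth + 1] < threshold:
                continue
            stack.append((child, depth + 1, new_score))
    return sorted(hits)


hits = trie_pssm_search(build_trie("GACGACG", len(PSSM)), PSSM, threshold=5.0)
```

In this toy run the branches starting with G and C are discarded after a single lookup, and the two occurrences of ACG are scored once because they share a path in the trie.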

Main Results:

  • The suffix tree-based method significantly accelerates PSSM searches, achieving speedups of up to tenfold.
  • The method efficiently prunes large portions of the sequence space, reducing unnecessary computations.
  • Memory requirements for the suffix tree were optimized, making the approach practical for large datasets.

Conclusions:

  • The proposed suffix tree method offers a substantial improvement in the speed of PSSM searches.
  • This technique provides a computationally efficient solution for identifying conserved protein regions.
  • The optimized memory footprint makes this method suitable for large-scale bioinformatics analyses.
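
As rough arithmetic on the memory figure, the sketch below multiplies the reported 17 bytes per input symbol by a database size. The 30-million-residue database is a hypothetical example, not a dataset from the paper, and the helper names are illustrative.

```python
BYTES_PER_SYMBOL = 17  # figure reported in the summary above


def suffix_tree_bytes(num_symbols: int) -> int:
    """Estimated suffix tree size for a database of `num_symbols` residues."""
    return BYTES_PER_SYMBOL * num_symbols


def format_mib(num_bytes: int) -> str:
    return f"{num_bytes / 2**20:.1f} MiB"


# Hypothetical database of 30 million residues.
estimate = suffix_tree_bytes(30_000_000)
print(format_mib(estimate))  # prints "486.4 MiB"
```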