An efficient algorithm for identifying matches with errors in multiple long molecular sequences.

M Y Leung¹, B E Blaisdell, C Burge

  • ¹ Division of Mathematics, Computer Science and Statistics, University of Texas, San Antonio 78249-0664.

Journal of Molecular Biology, October 20, 1991
Summary
This summary is machine-generated.

This study introduces an efficient algorithm for identifying patterns and repeats in large molecular sequence data, even with errors. The method uses hashing and linked lists, showing near-linear scaling for memory and run time with sequence length.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Analyzing large molecular sequence datasets is computationally intensive.
  • Identifying sequence repeats and similarities is crucial for understanding biological function.

Purpose of the Study:

  • To develop an efficient algorithm for detecting word relations, including matches and repeats, in long molecular sequences.
  • To accommodate errors within the sequence data during analysis.
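
To make "matches with errors" concrete, here is a minimal sketch in Python. The summary does not state which error model the paper uses, so this assumes substitutions only (no insertions or deletions) between two words of equal length. The function names `mismatches` and `matches_with_errors` and the parameter `max_errors` are hypothetical and chosen for illustration.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length words differ.

    Assumption: errors are substitutions only, so this is a Hamming distance.
    """
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def matches_with_errors(a: str, b: str, max_errors: int) -> bool:
    """Return True if the two words differ in at most max_errors positions."""
    return mismatches(a, b) <= max_errors
```

Under this assumption, `matches_with_errors("ACGTAC", "ACCTAC", 1)` accepts the pair because the words differ at a single position, while an exact-match test would reject it.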

Main Methods:

  • The algorithm employs hashing on fixed-size words.
  • A linked list structure connects all occurrences of identical words.
  • The approach is designed for large-scale data analysis.
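
The two structures named above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: a Python `dict` stands in for the hash table, and an integer array plays the role of the linked list, with each position pointing to the previous occurrence of the same word. The names `build_word_index`, `occurrences`, `head`, and `prev_pos` are hypothetical.

```python
def build_word_index(seq: str, k: int):
    """Index every k-letter word of seq in a single left-to-right pass.

    head maps each word to the start of its most recent occurrence.
    prev_pos[i] is the start of the previous occurrence of the word that
    begins at i, or -1 if there is none (the end of that word's chain).
    """
    n_words = max(len(seq) - k + 1, 0)
    head = {}
    prev_pos = [-1] * n_words
    for i in range(n_words):
        word = seq[i:i + k]
        prev_pos[i] = head.get(word, -1)  # link to the earlier occurrence
        head[word] = i                    # this position is now the chain head
    return head, prev_pos


def occurrences(head, prev_pos, word):
    """Walk the chain for one word and return its start positions in order."""
    positions = []
    i = head.get(word, -1)
    while i != -1:
        positions.append(i)
        i = prev_pos[i]
    return positions[::-1]
```

For example, indexing `"ACGTACGA"` with `k = 3` links the two occurrences of `ACG` at positions 0 and 4. Building the index takes one pass over the sequence and stores one integer per position, which is consistent with the roughly linear memory and run time reported in the summary.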

Main Results:

  • The algorithm demonstrates efficiency in finding sequence patterns and repeats.
  • Average memory and run time scale almost linearly with total sequence length.
  • Performance was evaluated on an Escherichia coli DNA sequence database.

Conclusions:

  • The developed algorithm provides an efficient solution for sequence analysis.
  • The near-linear scaling makes it suitable for very large genomic datasets.
  • This method facilitates the study of molecular sequence relationships.