Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships

S E Brenner¹, C Chothia, T J Hubbard

  • ¹ MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, United Kingdom. brenner@hyper.stanford.edu

Proceedings of the National Academy of Sciences of the United States of America
May 30, 1998
Summary
This summary is machine-generated.

Statistical scores significantly improve the accuracy of protein sequence comparison. E-values from SSEARCH and FASTA are reliable, whereas P-values from BLAST and WU-BLAST2 overestimate significance.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Structural Biology

Background:

  • Protein sequence comparison is crucial for understanding protein function and evolution.
  • Reliable assessment of pairwise sequence comparison methods requires well-characterized protein relationships.
  • The Structural Classification of Proteins (SCOP) database provides a reliable benchmark for protein relationships based on structure and function.

Purpose of the Study:

  • To evaluate the performance of common pairwise sequence comparison algorithms and their scoring schemes.
  • To determine the accuracy of statistical scores in assessing protein sequence similarity.
  • To identify the most effective methods for detecting evolutionary relationships between proteins.

Main Methods:

  • Assessed protein sequence comparison programs: BLAST, WU-BLAST2, FASTA, and SSEARCH.
  • Utilized proteins with known relationships from the SCOP database for evaluation.
  • Compared performance based on statistical scores (E-value, P-value) versus raw scores and percentage identity.
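
The two statistical scores compared above are closely related. Under the usual assumption that chance hits follow a Poisson distribution, the probability of seeing at least one chance hit is P = 1 - exp(-E), so the two values nearly coincide when E is small. The Python sketch below illustrates only that conversion; the function names are hypothetical, and it is not the code that any of the four programs uses internally.

```python
import math


def p_value_from_e_value(e_value: float) -> float:
    """Return P = 1 - exp(-E), assuming chance hits are Poisson-distributed."""
    if e_value < 0:
        raise ValueError("E-value must be non-negative")
    # expm1 keeps precision when E is very small.
    return -math.expm1(-e_value)


def e_value_from_p_value(p_value: float) -> float:
    """Invert the relation above: E = -ln(1 - P), valid for 0 <= P < 1."""
    if not 0.0 <= p_value < 1.0:
        raise ValueError("P-value must lie in [0, 1)")
    return -math.log1p(-p_value)


if __name__ == "__main__":
    # Illustrative E-values only; not taken from the study.
    for e in (1e-10, 0.01, 1.0, 10.0):
        print(f"E = {e:g}  ->  P = {p_value_from_e_value(e):.6g}")
```

For small values the two scores are interchangeable in practice; they diverge as E approaches and exceeds 1, where P saturates toward 1.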

Main Results:

  • Using statistical scores rather than raw scores or percentage identity significantly reduces the error rate of all tested algorithms.
  • SSEARCH and FASTA E-values accurately reflect the number of false positives.
  • BLAST and WU-BLAST2 P-values overestimate significance.
  • SSEARCH, FASTA (ktup=1), and WU-BLAST2 perform best, detecting most relationships between proteins with more than 30% sequence identity.
  • Performance drops for more distant relationships: only half of those at 20-30% identity are detected.
  • Many distant homologs with low sequence similarity remain undetectable by pairwise methods.
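
One way to check results like these against a structure-based benchmark is to rank all reported hits by E-value and track, at each threshold, the fraction of known relationships recovered and the number of false positives per query. The Python sketch below is a minimal illustration under stated assumptions: the function name, the "errors per query" normalization, and the toy data are hypothetical and are not taken from the summary. If E-values are well calibrated, the observed false positives per query at a threshold should be close to the threshold itself.

```python
from typing import List, Tuple


def coverage_vs_errors(
    hits: List[Tuple[float, bool]],
    num_queries: int,
    total_true: int,
) -> List[Tuple[float, float, float]]:
    """Return (e_value, coverage, errors_per_query) at each successive hit.

    hits        -- (e_value, is_true_relationship) pairs pooled over all queries
    num_queries -- number of query sequences searched
    total_true  -- number of true relationships in the benchmark
    """
    curve = []
    true_pos = 0
    false_pos = 0
    # Walk from the most significant hit (smallest E-value) to the least.
    for e_value, is_true in sorted(hits, key=lambda h: h[0]):
        if is_true:
            true_pos += 1
        else:
            false_pos += 1
        curve.append((e_value, true_pos / total_true, false_pos / num_queries))
    return curve


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy_hits = [(1e-30, True), (1e-5, True), (0.5, False), (2.0, True), (8.0, False)]
    for e, cov, epq in coverage_vs_errors(toy_hits, num_queries=2, total_true=4):
        print(f"E <= {e:g}: coverage = {cov:.2f}, errors per query = {epq:.2f}")
```

Ties between equal E-values are processed in input order here, which is adequate for a sketch but would need an explicit rule in a real evaluation.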

Conclusions:

  • Statistical scores, particularly E-values from SSEARCH and FASTA, are more reliable for evaluating protein sequence similarity than raw scores or percentage identity.
  • While current pairwise methods are effective for closely related proteins, they have limitations in detecting distant evolutionary relationships.
  • Relationships identified by these methods can be used with confidence, but the inability to detect all distant homologs highlights the need for complementary approaches.