Fast probabilistic analysis of sequence function using scoring matrices.

T D Wu¹, C G Nevill-Manning, D L Brutlag

  • ¹Department of Biochemistry, Stanford University School of Medicine, Stanford, CA, USA. twu@gene.com

Bioinformatics (Oxford, England) | June 27, 2000
Summary
This summary is machine-generated.

This study presents three techniques that speed up sequence analysis with scoring matrices. By computing the quantile function of each matrix, they let users set a probability (p) threshold that trades sensitivity for speed, which makes scoring matrices more practical for large-scale sequencing projects.

Area of Science:

  • Bioinformatics
  • Computational Biology
  • Genomics

Background:

  • Sequence analysis relies on scoring matrices, but speed limitations hinder large-scale applications.
  • Calculating the quantile function for scoring matrices provides a probability (p) value for segmental scores.
  • User-defined p thresholds enable a balance between sensitivity and speed in sequence analysis.
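
The link between a p threshold and a score threshold can be illustrated with a minimal Python sketch. It assumes integer scores and independent residues drawn from fixed background frequencies; the function names, the dictionary-based matrix layout, and the two-letter toy alphabet are illustrative assumptions, not taken from the paper or from EMATRIX.

```python
from collections import defaultdict


def score_distribution(matrix, background):
    """Exact distribution of a segment score, assuming independent residues.

    matrix: list of dicts, one per column, mapping residue -> integer score.
    background: dict mapping residue -> probability (values sum to 1).
    Returns a dict mapping each reachable total score to its probability.
    """
    dist = {0: 1.0}
    for column in matrix:
        nxt = defaultdict(float)
        # Convolve the running distribution with this column's scores.
        for score, prob in dist.items():
            for residue, freq in background.items():
                nxt[score + column[residue]] += prob * freq
        dist = dict(nxt)
    return dist


def score_threshold(dist, p):
    """Smallest score T with P(score >= T) <= p, or None if none exists."""
    tail = 0.0
    threshold = None
    for score in sorted(dist, reverse=True):
        tail += dist[score]
        if tail > p:
            break
        threshold = score
    return threshold


# Toy example (hypothetical two-letter alphabet, two-column matrix).
toy_background = {"A": 0.5, "B": 0.5}
toy_matrix = [{"A": 1, "B": 0}, {"A": 2, "B": 0}]
toy_dist = score_distribution(toy_matrix, toy_background)
```

A looser p threshold maps to a lower score threshold, so more segments pass the filter; a stricter one raises the score threshold and discards more segments early.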

Purpose of the Study:

  • To present novel techniques for accelerating sequence analysis using scoring matrices.
  • To enable wider application of scoring matrices in large-scale sequencing and annotation.
  • To offer a tunable trade-off between analysis speed and sensitivity.

Main Methods:

  • Developed three speed-enhancing techniques: probability filtering, lookahead scoring, and permuted lookahead scoring.

  • Probability filtering uses a score threshold derived from the p threshold to reduce segments.
  • Lookahead scoring techniques test intermediate scores and optimize segment scoring order for early termination.

Main Results:

  • Achieved significant reductions in examined residues, ranging from 62% to 6% based on p threshold.
  • Demonstrated sequence analysis speeds several times faster than existing programs, reaching 225 residues/s (p=10^-6) and 541 residues/s (p=10^-20).
  • Evaluated the impact of independence and Markov assumptions on p-value calculations, with Markov assumptions generally increasing p-values.

Conclusions:

  • The developed techniques substantially increase sequence analysis speed with scoring matrices.
  • These methods facilitate the broader use of scoring matrices in large-scale bioinformatics.
  • The EMATRIX software package implements these techniques and is available for academic and commercial use.
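
The lookahead idea summarized under Main Methods can be sketched in Python as follows. This is an illustration under stated assumptions, not the EMATRIX implementation: the function names and data layout are hypothetical, and the column ordering used for the permuted variant (largest gap between a column's maximum and expected score first) is a simple heuristic chosen for the example, not necessarily the ordering used in the paper.

```python
def lookahead_thresholds(matrix, order, threshold):
    """Minimum partial score needed after each column, visited in `order`.

    After a column, the partial score must be at least the final threshold
    minus the best score still obtainable from the remaining columns.
    """
    col_max = [max(matrix[i].values()) for i in order]
    remaining = sum(col_max)
    cutoffs = []
    for best in col_max:
        remaining -= best
        cutoffs.append(threshold - remaining)
    return cutoffs


def lookahead_score(matrix, segment, threshold, order=None):
    """Score a segment, stopping as soon as the threshold is unreachable.

    Returns (score, residues_examined); score is None on early termination.
    """
    if order is None:
        order = list(range(len(matrix)))
    cutoffs = lookahead_thresholds(matrix, order, threshold)
    partial = 0
    examined = 0
    for i, cutoff in zip(order, cutoffs):
        partial += matrix[i][segment[i]]
        examined += 1
        if partial < cutoff:
            return None, examined
    return partial, examined


def permuted_order(matrix, background):
    """Assumed heuristic: visit the most discriminating columns first."""
    def gap(column):
        expected = sum(background[r] * column[r] for r in background)
        return max(column.values()) - expected
    return sorted(range(len(matrix)), key=lambda i: gap(matrix[i]), reverse=True)


# Toy example (hypothetical two-letter alphabet, three-column matrix).
toy_background = {"A": 0.5, "B": 0.5}
toy_matrix = [{"A": 1, "B": 0}, {"A": 5, "B": -5}, {"A": 1, "B": 0}]
toy_order = permuted_order(toy_matrix, toy_background)
natural = lookahead_score(toy_matrix, "ABA", 7)
permuted = lookahead_score(toy_matrix, "ABA", 7, toy_order)
```

In this toy case the failing segment is rejected after two residues in the natural column order and after one residue in the permuted order, which is the kind of saving the permuted variant aims for.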