
Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Integrated flexible DNA methylation-chromatin segmentation modeling enhances epigenomic state annotation.

Nucleic acids research·2026
Same author

Numerical estimation of limiting large-deviation rate functions.

Physical review. E·2026
Same author

Rare events of host switching for diseases using a susceptible-infected-recovered model with mutations.

Physical review. E·2026
Same author

Cleanifier: contamination removal from microbial sequences using spaced seeds of a human pangenome index.

Bioinformatics (Oxford, England)·2025
Same author

Distribution of the Number of Paths in Two-Dimensional Directed Percolation.

Entropy (Basel, Switzerland)·2025
Same author

Diffusion with stochastic resetting on a lattice.

Physical review. E·2025
Same journal

OpenIMC: an open-source platform for analyzing single-cell and spatial proteomics by imaging mass cytometry.

BMC bioinformatics·2026
Same journal

NAP: an open source pipeline for cross-domain microbiome profiling using Nanopore sequencing-derived amplicon data.

BMC bioinformatics·2026
Same journal

SurvGME: an R package for survival analysis with graphical and measurement error models.

BMC bioinformatics·2026
Same journal

SimMapNet: a Bayesian framework for gene regulatory network inference using gene ontology similarities as external hint.

BMC bioinformatics·2026
Same journal

Dual channel drug-drug interactions extraction based on cross attention.

BMC bioinformatics·2026
Same journal

FeSseqdb: a curated sequence-level database and interpretable machine learning framework for identifying iron-sulfur proteins.

BMC bioinformatics·2026

Related Experiment Video

Updated: Jun 4, 2026

Rare Event Detection Using Error-corrected DNA and RNA Sequencing
10:36

Published on: August 3, 2018

Accurate statistics for local sequence alignment with position-dependent scoring by rare-event sampling.

Stefan Wolfsheimer (1), Inke Herms, Sven Rahmann

  • (1) Laboratoire MAP5 (UMR CNRS 8145), Université Paris Descartes, Paris, France. stefan.wolfsheimer@googlemail.com

BMC Bioinformatics | February 5, 2011
Summary
This summary is machine-generated.

This study introduces a computational method for accurately calculating the score distributions that arise in molecular database searches. The new approach improves sensitivity and specificity, especially for complex protein families like transmembrane proteins.

More Related Videos

Detection of Rare Genomic Variants from Pooled Sequencing Using SPLINTER
14:06

Published on: June 23, 2012

Demonstration of the Sequence Alignment to Predict Across Species Susceptibility Tool for Rapid Assessment of Protein Conservation
16:02

Published on: February 10, 2023


Area of Science:

  • Bioinformatics
  • Computational Biology
  • Statistical Modeling

Background:

  • Classical statistical models for molecular database searches assume independent and identically distributed (i.i.d.) sequences, an assumption that is often inappropriate for real-world applications.
  • Existing models struggle with position-dependent scoring schemes, Hidden Markov Models (HMMs), and non-i.i.d. sequence properties, limiting search sensitivity and specificity.
  • The statistical properties of these more complex scenarios remain underexplored, hindering advancements in homology search tools.

Purpose of the Study:

  • To develop an efficient and general method for computing score distributions in molecular database searches with high accuracy.
  • To evaluate the performance of this method for various sequence models and similarity measures, particularly for non-i.i.d. sequences like transmembrane proteins.
  • To compare the effectiveness of position-dependent scoring and HMMs against classical approaches for improved search sensitivity and specificity.

Main Methods:

  • Utilized rare-event simulation techniques, including Markov chain Monte Carlo (MCMC) simulations, importance sampling, and generalized ensembles.
  • Developed a method to accurately compute the score distribution, focusing on the tail region relevant for practical applications.
  • Applied the method to score statistics of fixed and random queries against random sequences, and extended it to a transmembrane protein model.
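
The rare-event idea listed above can be illustrated with a small Python sketch. It is not the authors' implementation: the DNA alphabet, the match/mismatch/gap values, the single fixed temperature, and the function names `local_score`, `tilted_mcmc`, and `reweight` are all assumptions chosen for illustration. A Metropolis chain samples random sequences with weight exp(S/T), which favors high local alignment scores S against a fixed query, and the visited scores are then reweighted by exp(-S/T) to estimate the unbiased score distribution, including its tail.

```python
import math
import random
from collections import Counter

ALPHABET = "ACGT"  # assumed toy alphabet; letters are uniform and i.i.d.

def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap cost."""
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best

def tilted_mcmc(query, length, temperature, steps, rng):
    """Metropolis sampling of sequences with weight exp(score / temperature)."""
    seq = [rng.choice(ALPHABET) for _ in range(length)]
    score = local_score(query, seq)
    samples = []
    for _ in range(steps):
        pos = rng.randrange(length)
        old = seq[pos]
        seq[pos] = rng.choice(ALPHABET)  # symmetric single-letter proposal
        new_score = local_score(query, seq)
        # Accept with probability min(1, exp((new - old) / T)).
        if rng.random() < math.exp(min(0.0, (new_score - score) / temperature)):
            score = new_score
        else:
            seq[pos] = old  # reject: restore the previous letter
        samples.append(score)
    return samples

def reweight(samples, temperature):
    """Undo the exponential bias and normalize over the visited scores."""
    counts = Counter(samples)
    weights = {s: c * math.exp(-s / temperature) for s, c in counts.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in sorted(weights.items())}

if __name__ == "__main__":
    rng = random.Random(1)
    scores = tilted_mcmc("ACGTACGTACGT", length=30, temperature=1.0,
                         steps=3000, rng=rng)
    for s, p in reweight(scores, temperature=1.0).items():
        print(s, f"{p:.3e}")
```

A single temperature only covers part of the score range, and the normalization is only as good as the coverage of the low-score bulk. Combining several temperatures, or a generalized ensemble as named in the methods, is the usual way to cover the whole distribution.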

Main Results:

  • Successfully computed score distributions to desired accuracy, providing access to the low-probability region of significant scores.
  • Demonstrated the method's applicability to different sequence models and similarity measures under weak assumptions.
  • Showcased improved statistical analysis for transmembrane proteins using position-dependent scoring and HMMs compared to classical methods.

Conclusions:

  • Sensitivity and specificity in molecular database searches are highly dependent on the chosen scoring and sequence models.
  • The developed method offers a robust framework for analyzing score distributions in complex biological sequence data.
  • ROC analysis confirmed the superior performance of advanced models for transmembrane protein searches.
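
As a rough illustration of the ROC analysis mentioned above, the following Python sketch computes ROC points and the area under the curve from two lists of search scores. The function names `roc_points` and `auc` and the example scores are hypothetical and are not taken from the study; the sketch only shows how one scoring model could be compared with another on the same set of true and false hits.

```python
def roc_points(pos_scores, neg_scores):
    """Return (false positive rate, true positive rate) pairs.

    One point is produced per distinct score threshold, from the
    strictest threshold to the most permissive one.
    """
    thresholds = sorted(set(pos_scores) | set(neg_scores), reverse=True)
    points = [(0.0, 0.0)]
    for t in thresholds:
        tpr = sum(s >= t for s in pos_scores) / len(pos_scores)
        fpr = sum(s >= t for s in neg_scores) / len(neg_scores)
        points.append((fpr, tpr))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

if __name__ == "__main__":
    # Hypothetical scores: true family members versus unrelated sequences.
    true_hits = [42, 37, 35, 28, 25]
    false_hits = [30, 22, 21, 19, 15]
    print(f"AUC = {auc(roc_points(true_hits, false_hits)):.2f}")
```

An area close to 1 means that the scores separate true from false hits well, while an area near 0.5 means that the scoring model does no better than chance.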