Using evolutionary Expectation Maximization to estimate indel rates.

Ian Holmes¹

  • ¹Department of Statistics, 1 South Parks Road, Oxford OX1 3TG, UK. ihh@berkeley.edu

Bioinformatics (Oxford, England) | February 26, 2005
Summary
This summary is machine-generated.

This study introduces an Expectation Maximization (EM) algorithm for maximum-likelihood estimation of insertion and deletion (indel) rates from multiple sequence alignments. On simulated data, its estimates improve on parsimony-based estimates, which tend to underestimate indel rates.

Area of Science:

  • Computational Biology
  • Bioinformatics
  • Evolutionary Biology

Background:

  • The Expectation Maximization (EM) algorithm, in the form of the Baum-Welch algorithm (for hidden Markov models) or the Inside-Outside algorithm (for stochastic context-free grammars), is widely used to estimate the parameters of stochastic grammars in biological sequence analysis.
  • To use stochastic grammars based on phylogenetic trees (an approach known as Statistical Alignment) for evolutionary modeling, EM must be extended to estimate the mutation rates of the underlying evolutionary model.
  • Previous work extended EM to substitution rates; this study addresses the indel process.

Purpose of the Study:

  • To develop and present an algorithm for maximum-likelihood estimation of insertion and deletion (indel) rates from multiple sequence alignments.
  • To apply the Expectation Maximization (EM) algorithm to the Thorne, Kishino, and Felsenstein (TKF91) single-residue indel model.
  • To extend evolutionary modeling capabilities to include indel processes alongside substitution processes.
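
For orientation, TKF91 models single-residue insertions at rate λ and deletions at rate μ along a branch of length t. The sketch below computes the three standard TKF91 pairwise quantities (commonly written α, β, γ, here in the parameterization where β includes the factor λ) from those rates. The function name `tkf91_probs` and the example rate values are illustrative assumptions, not taken from the paper, and this is not the paper's estimation algorithm.

```python
import math


def tkf91_probs(lam: float, mu: float, t: float) -> dict:
    """Standard TKF91 quantities for insertion rate lam, deletion rate mu, time t.

    Hypothetical helper for illustration only.
    """
    if not (0.0 < lam < mu):
        raise ValueError("TKF91 requires 0 < lam < mu")
    if t <= 0.0:
        raise ValueError("t must be positive")

    # alpha: probability that an ancestral residue survives to time t.
    alpha = math.exp(-mu * t)

    # beta: governs the geometric number of extra residues inserted
    # next to a surviving residue.
    e = math.exp((lam - mu) * t)
    beta = lam * (1.0 - e) / (mu - lam * e)

    # gamma: probability that a deleted residue nevertheless leaves
    # at least one inserted descendant.
    gamma = 1.0 - mu * beta / (lam * (1.0 - alpha))

    return {"alpha": alpha, "beta": beta, "gamma": gamma}


if __name__ == "__main__":
    # Example rates are arbitrary assumptions.
    print(tkf91_probs(lam=0.03, mu=0.04, t=1.0))
```

As t grows, β approaches λ/μ, the parameter of the geometric equilibrium length distribution; as t shrinks, β is approximately λt.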

Main Methods:

  • Developed a novel algorithm for maximum-likelihood estimation of indel rates using the EM algorithm.
  • Applied the algorithm to the Thorne, Kishino, and Felsenstein (TKF91) single-residue indel model.
  • Validated the algorithm on simulated data and on experimental biological sequence data.
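
The idea of EM for a rate parameter can be illustrated on a much simpler model than TKF91: a pure-death process in which each ancestral residue is deleted independently at rate μ, and only its survival at time t is observed. The E-step fills in the expected time each residue was at risk of deletion; the M-step divides the number of deletions by that total time. The toy model, the function name `em_death_rate`, and the counts used are assumptions for illustration; this is not the paper's TKF91 algorithm.

```python
import math


def em_death_rate(n_survived, n_deleted, t, mu0=1.0, tol=1e-12, max_iter=1000):
    """EM estimate of a deletion rate in a toy pure-death model.

    Returns (estimated rate, iterations used). Hypothetical helper.
    """
    if n_survived <= 0 or n_deleted <= 0 or t <= 0:
        raise ValueError("need positive counts and positive t")

    mu = mu0
    for iteration in range(1, max_iter + 1):
        # E-step: expected deletion time of a residue known to have
        # been deleted before t (mean of a truncated exponential).
        q = math.exp(-mu * t)
        mean_wait_deleted = 1.0 / mu - t * q / (1.0 - q)
        # Survivors were at risk for the whole interval.
        total_wait = n_survived * t + n_deleted * mean_wait_deleted

        # M-step: events divided by expected time at risk.
        new_mu = n_deleted / total_wait
        if abs(new_mu - mu) < tol:
            return new_mu, iteration
        mu = new_mu
    return mu, max_iter


if __name__ == "__main__":
    mu_hat, n_iter = em_death_rate(n_survived=80, n_deleted=20, t=1.0)
    print(mu_hat, n_iter)
```

For 80 surviving and 20 deleted residues at t = 1, the iteration converges to the closed-form maximum-likelihood value -ln(0.8)/t, which is about 0.223.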

Main Results:

  • The algorithm demonstrates extremely rapid convergence and provides accurate maximum-likelihood estimates of indel rates.
  • On simulated data, the estimates improve on parsimony-based estimates, which tend to underestimate indel rates.
  • Plausible results were obtained on experimental data, such as coronavirus envelope domains.
  • Because the algorithm closely resembles the Baum-Welch algorithm, it can be used in an unsupervised fashion to estimate rates for unaligned sequences, or to estimate several sets of rates for sequences with heterogeneous rates.
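
One way to see why counting only visible changes can underestimate indel rates is to simulate a TKF91-style links process and compare the events that actually occurred with those still detectable in the final sequence. In the sketch below, an inserted residue that is later deleted leaves no trace, so the visible counts can never exceed the true counts. The function name `simulate_links`, the rates, and the sequence length are assumptions for illustration; this is not the paper's simulation setup, and the visible counts are only a rough stand-in for a parsimony estimate.

```python
import random


def simulate_links(length, lam, mu, t, rng):
    """Gillespie simulation of single-residue insertions and deletions.

    Each residue is deleted at rate mu; each of the (n + 1) links
    inserts a new residue at rate lam. Hypothetical helper.
    """
    ancestral, inserted = length, 0   # residues currently alive, by origin
    true_insertions = true_deletions = 0
    clock = 0.0
    while True:
        n = ancestral + inserted
        rate_ins = lam * (n + 1)
        rate_del = mu * n
        clock += rng.expovariate(rate_ins + rate_del)
        if clock > t:
            break
        if rng.random() * (rate_ins + rate_del) < rate_ins:
            inserted += 1
            true_insertions += 1
        else:
            true_deletions += 1
            # The deleted residue is chosen uniformly among living residues.
            if rng.random() * n < ancestral:
                ancestral -= 1
            else:
                inserted -= 1
    return {
        "true_insertions": true_insertions,
        "true_deletions": true_deletions,
        # Events still detectable by comparing ancestor and descendant.
        "visible_insertions": inserted,
        "visible_deletions": length - ancestral,
    }


if __name__ == "__main__":
    result = simulate_links(200, lam=0.3, mu=0.4, t=2.0, rng=random.Random(0))
    print(result)
```

The gap between true and visible counts widens as t increases, because more inserted residues have time to be deleted again.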

Conclusions:

  • The presented EM-based algorithm effectively estimates indel rates from multiple sequence alignments.
  • On simulated data the method improves on parsimony-based estimates, and it converges rapidly to accurate maximum-likelihood estimates.
  • Its similarity to the Baum-Welch algorithm enables unsupervised estimation from unaligned sequences and estimation of several rate sets for sequences with heterogeneous rates, broadening its applicability in bioinformatics.