Combining multiple data sets in a likelihood analysis: which models are the best?

Tal Pupko¹, Dorothée Huchon, Ying Cao

  • ¹ The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106-8569, Japan. tal@ism.ac.jp

Molecular Biology and Evolution | November 26, 2002
Summary
This summary is machine-generated.

The choice of statistical model for combining multiple gene sequences matters in phylogenetic analysis. The study found that separate or proportional branch-length models, together with a separate gamma parameter per gene for among-site rate variation, represent the molecular data best, and that model choice changes the inferred maximum likelihood tree topology.

Area of Science:

  • Evolutionary Biology
  • Bioinformatics
  • Computational Biology

Background:

  • Phylogenetic analyses traditionally used single gene sequences.
  • The availability of vast gene sequence data necessitates multi-gene analyses.
  • Combining multiple molecular datasets requires robust statistical methods.

Purpose of the Study:

  • To compare statistical models for combining different genes in phylogenetic analyses.
  • To evaluate the likelihood of tree topologies using various branch length and rate variation models.
  • To determine the impact of model choice on maximum likelihood phylogenetic inference.

Main Methods:

  • Compared three branch length estimation methods: concatenate, proportional, and separate models.
  • Compared three models of among-site rate variation: homogeneous, single gamma parameter, and separate gamma parameters per gene.

  • Utilized two nuclear and one mitochondrial amino acid data sets for analysis.
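
The nine model combinations listed above differ mainly in how many free parameters they fit, which is what any likelihood-based comparison has to penalize. The following is a minimal sketch of that bookkeeping, not the authors' implementation. It assumes a fixed, fully resolved unrooted tree, a fixed empirical amino acid substitution matrix (so only branch lengths and gamma shape parameters are counted), and a proportional model with one rate multiplier per gene, one of which is fixed. The function names and the use of AIC are illustrative assumptions; this summary does not state which comparison criterion the study used.

```python
def n_branches(n_taxa: int) -> int:
    """Number of branches in an unrooted, fully resolved tree."""
    return 2 * n_taxa - 3


def n_params(n_taxa: int, n_genes: int, branch_model: str, rate_model: str) -> int:
    """Count free parameters for one branch-length / rate-variation combination.

    Assumption: topology and substitution matrix are fixed, so only branch
    lengths, per-gene rate multipliers, and gamma shapes are free.
    """
    b = n_branches(n_taxa)
    branch_params = {
        "concatenate": b,                    # one shared set of branch lengths
        "proportional": b + (n_genes - 1),   # shared set plus a rate multiplier per gene
        "separate": b * n_genes,             # an independent set for every gene
    }[branch_model]
    rate_params = {
        "homogeneous": 0,           # no among-site rate variation
        "single_gamma": 1,          # one gamma shape shared by all genes
        "separate_gamma": n_genes,  # one gamma shape per gene
    }[rate_model]
    return branch_params + rate_params


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * k - 2 * log_likelihood


# Hypothetical example sizes: 10 taxa and 3 genes.
for branch_model in ("concatenate", "proportional", "separate"):
    for rate_model in ("homogeneous", "single_gamma", "separate_gamma"):
        k = n_params(10, 3, branch_model, rate_model)
        print(f"{branch_model:12s} {rate_model:14s} k = {k}")
```

With these hypothetical sizes the simplest combination (concatenate, homogeneous) has 17 free parameters and the richest (separate, separate gamma) has 54. A richer model therefore has to raise the log-likelihood enough to offset its extra parameters before a criterion such as AIC prefers it.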

Main Results:

  • The separate or proportional models for branch lengths were most appropriate, depending on the dataset.
  • A model with one gamma parameter for each gene was optimal for among-site rate variation across all datasets.
  • Model choice significantly influenced the resulting maximum likelihood tree topology.

Conclusions:

  • The selection of appropriate statistical models for combining molecular data is critical for accurate phylogenetic reconstruction.
  • Specific models for branch length estimation and among-site rate variation improve the reliability of evolutionary trees.
  • Understanding model effects is essential for interpreting phylogenetic results, particularly for complex datasets like mammalian phylogenies.