Updated: Jun 10, 2025

Predicting Phylogenetic Bootstrap Values via Machine Learning
Julius Wiegert¹, Dimitri Höhler¹, Julia Haag¹
¹Computational Molecular Evolution Group, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany.
Molecular Biology and Evolution | October 17, 2024
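Summary
We introduce the Educated Bootstrap Guesser (EBG), a machine learning tool that rapidly predicts branch support values for phylogenetic trees. EBG offers a faster, more accurate alternative to standard bootstrapping methods, improving the efficiency of phylogenetic analyses.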
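Scientific fields:
- Phylogenetics and evolutionary biology.
- Computational biology and bioinformatics.
- Applications of machine learning in bioinformatics.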
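Background:
- Estimating the statistical robustness of phylogenetic trees is essential for reliable evolutionary inference.
- The standard nonparametric Felsenstein bootstrap support (SBS) is computationally intensive, which has driven the development of faster approximate methods.
- Existing faster methods, such as Rapid Bootstrap (RB), SH-aLRT, and Ultrafast Bootstrap 2 (UFBoot2), have limitations, including computational cost, the need to assess model violations, or instability in the low-support range.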
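To make the cost of the standard bootstrap concrete, here is a minimal Python sketch of its two bookkeeping steps: resampling alignment columns with replacement, and counting how often a branch recurs across replicate trees. The toy alignment, the helper names `resample_columns` and `branch_support`, and the hand-written replicate trees are all hypothetical. The expensive step, inferring a tree from every replicate, is deliberately left out.

```python
import random

def resample_columns(alignment, rng):
    """Build one bootstrap replicate by sampling alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[i] for i in cols) for taxon, seq in alignment.items()}

def branch_support(branch, replicate_trees):
    """Percentage of replicate trees that contain the given branch (bipartition)."""
    hits = sum(branch in tree for tree in replicate_trees)
    return 100.0 * hits / len(replicate_trees)

# Hypothetical toy alignment: four taxa, six sites.
alignment = {"A": "ACGTAC", "B": "ACGTTC", "C": "TCGAAC", "D": "TCGATC"}
rng = random.Random(0)
replicate = resample_columns(alignment, rng)

# In a real analysis a tree would be inferred from each replicate.
# Here each replicate tree is a hand-written set of bipartitions,
# with a bipartition encoded as one side of the split (an assumption).
AB = frozenset({"A", "B"})
AC = frozenset({"A", "C"})
replicate_trees = [{AB}, {AB}, {AC}, {AB}]
support_AB = branch_support(AB, replicate_trees)
```

Because a full tree search runs once per replicate, the total cost grows with the number of replicates, which is why faster approximations are attractive.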
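Purpose of the study:
- Develop a machine-learning-based tool, the Educated Bootstrap Guesser (EBG), to predict standard nonparametric Felsenstein bootstrap support (SBS) values.
- Provide a computationally efficient and accurate method for assessing phylogenetic branch support.
- Provide uncertainty measures for branch support predictions to improve interpretability.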
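Main methods:
- Developed a machine learning model (EBG) and trained it to predict SBS values from phylogenetic data.
- Compared EBG with existing methods such as UFBoot2 in terms of speed and accuracy.
- Assessed prediction accuracy using the median absolute error and an evaluation of the uncertainty quantification.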
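The median absolute error used as the accuracy metric above can be sketched in a few lines of Python. The support values below are made up for illustration and are not taken from the study. The function name `median_absolute_error` is likewise an illustrative choice.

```python
from statistics import median

def median_absolute_error(true_values, predicted_values):
    """Median of the absolute differences between paired true and predicted values."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    return median(abs(t - p) for t, p in zip(true_values, predicted_values))

# Hypothetical SBS values (0 to 100) for five branches, with made-up predictions.
sbs_true = [100, 85, 40, 12, 67]
sbs_pred = [97, 90, 31, 14, 70]
mdae = median_absolute_error(sbs_true, sbs_pred)
```

Unlike the mean absolute error, the median is not pulled upward by a few badly predicted branches, so it describes the typical per-branch deviation.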
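Main results:
- On average, EBG is 9.4 (σ = 5.5) times faster than UFBoot2.
- For SBS values between 0 and 100, EBG achieves a median absolute error of 5.
- On standard hardware, EBG can predict support values for large phylogenies (e.g., 1,654 sequences) within a few hours.
- EBG provides an uncertainty estimate for each predicted branch support value.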
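One plausible reading of an average speedup reported with a standard deviation is the mean and spread of per-dataset runtime ratios. The sketch below assumes that reading. The runtimes are hypothetical, and the choice of the population standard deviation (`pstdev`) is also an assumption, since the summary does not say how σ was computed.

```python
from statistics import mean, pstdev

# Hypothetical wall-clock runtimes in seconds for three datasets.
ufboot2_seconds = [120.0, 300.0, 90.0]
ebg_seconds = [20.0, 30.0, 30.0]

# Speedup per dataset: how many times faster EBG finished than UFBoot2.
speedups = [u / e for u, e in zip(ufboot2_seconds, ebg_seconds)]

average_speedup = mean(speedups)
speedup_sigma = pstdev(speedups)  # population standard deviation (assumed)
```

A σ that is large relative to the mean, as in the reported 9.4 and 5.5, indicates that the speedup varies considerably from dataset to dataset.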
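Conclusions:
- EBG delivers a significant gain in computational efficiency for phylogenetic analysis.
- The tool provides accurate SBS predictions together with useful uncertainty quantification, supporting more robust interpretation.
- By enabling fast analyses on accessible computational resources, EBG democratizes robust phylogenetic inference.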

