Bayesian Additive Regression Trees using Bayesian Model Averaging.

Belinda Hernández¹, Adrian E Raftery², Stephen R Pennington³

¹School of Mathematics and Statistics, University College Dublin, Ireland.

Statistics and Computing

November 20, 2018

Summary

This summary is machine-generated.

We introduce BART-BMA, an efficient Bayesian Additive Regression Trees algorithm for high-dimensional data. This method improves computational efficiency for large datasets, making it suitable for bioinformatics applications.

Area of Science:

Computational Statistics
Bioinformatics
Machine Learning

Background:

Bayesian Additive Regression Trees (BART) is a powerful statistical model but can be computationally expensive for high-dimensional datasets (large number of variables, *p*).
Random forests are popular for high-dimensional data but lack probabilistic predictions.
There is a need for efficient algorithms that can handle high-dimensional data and provide probabilistic estimates.
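
For orientation, the sum-of-trees model that BART is usually described with can be written as below. This is the standard formulation from the general BART literature, not an equation quoted from this summary; the symbols (*m* trees, tree structure *T_j*, terminal-node parameters *M_j*, noise variance σ²) are the conventional ones and are assumptions here.

```latex
y_i = \sum_{j=1}^{m} g(x_i;\, T_j, M_j) + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
\qquad i = 1, \dots, n
```

Each function g maps the *p*-dimensional predictor vector x_i to the terminal-node value of tree j, so the cost of searching over split variables grows with *p*.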

Purpose of the Study:

To propose BART-BMA, a novel fitting algorithm for BART that enhances efficiency in high-dimensional settings.
To leverage Bayesian Model Averaging and greedy search for faster posterior distribution estimation.
To offer a robust model-based approach for analyzing small *n* large *p* datasets.

Main Methods:

Developed BART-BMA, integrating Bayesian Model Averaging with a greedy search strategy.
Employed a combination of BART and random forest principles for improved performance.
Utilized R and Rcpp for implementation, ensuring accessibility and efficiency.
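
The averaging step behind Bayesian Model Averaging can be illustrated with a minimal sketch. This is not the authors' BART-BMA code (which the summary says is written in R and Rcpp). It only shows the generic idea of weighting candidate models by an approximate posterior probability, here using the common BIC approximation with equal prior model probabilities. The function names `bma_weights` and `bma_predict`, the optional `window` cutoff, and the toy numbers are all hypothetical and chosen for illustration.

```python
import math


def bma_weights(bics, window=None):
    """Approximate posterior model weights from BIC values.

    Assumes equal prior model probabilities, so that
    p(model k | data) is roughly proportional to exp(-BIC_k / 2).
    If `window` is given (an illustrative cutoff, not a value from the
    paper), models whose BIC exceeds the best BIC by more than `window`
    are given weight zero.
    """
    best = min(bics)
    raw = []
    for bic in bics:
        delta = bic - best  # subtracting the best BIC avoids underflow
        if window is not None and delta > window:
            raw.append(0.0)
        else:
            raw.append(math.exp(-0.5 * delta))
    total = sum(raw)
    return [r / total for r in raw]


def bma_predict(predictions, weights):
    """Weighted average of per-model predictions.

    `predictions` is a list with one entry per model, each a list of
    predicted values for the same observations.
    """
    n_obs = len(predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, predictions))
        for i in range(n_obs)
    ]


# Toy example: three hypothetical candidate models.
example_bics = [100.0, 102.0, 120.0]
example_preds = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
example_weights = bma_weights(example_bics, window=10.0)
example_average = bma_predict(example_preds, example_weights)
```

In this toy example the third model falls outside the cutoff and contributes nothing, so the averaged prediction is a blend of the first two models only. A greedy search, as named in the summary, would decide which candidate models are generated in the first place; that part is not shown here.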

Main Results:

BART-BMA demonstrates significant computational efficiency gains over standard BART for datasets with large *p*.
The algorithm runs in a reasonable time on standard hardware, addressing the "small *n* large *p*" challenge.
Successful application in simulated data and real-world proteomic experiments for disease classification.

Conclusions:

BART-BMA provides an efficient and effective solution for analyzing high-dimensional data in bioinformatics.
The method offers a valuable alternative to existing algorithms like BART and random forests.
Open-source code is available, facilitating further research and application.