It's all relative: Regression analysis with compositional predictors | JoVE Visualize

Area of Science:

Statistics
Bioinformatics
Computational Biology

Background:

Compositional data, representing fractions of a whole, are common in various scientific fields.
Existing regression methods based on log-ratio transformations struggle with high-dimensional data, excessive zeros (the logarithm of zero is undefined), and hierarchical structures.
These traditional models are often hard to interpret because the parts are interdependent: raising one proportion necessarily lowers others.
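
The zero problem can be seen in a small sketch. The centered log-ratio (CLR) transform is used here as one common log-ratio transformation; the summary does not say which transformations the study compares against, and the two compositions are made up for illustration.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform of one composition (a vector summing to 1)."""
    log_x = np.log(x)
    return log_x - log_x.mean()

# Hypothetical 4-part compositions, e.g. relative abundances of four taxa.
dense = np.array([0.5, 0.3, 0.15, 0.05])
sparse = np.array([0.6, 0.4, 0.0, 0.0])    # two parts were not observed

clr_dense = clr(dense)                      # finite values that sum to zero

# log(0) is -inf, which contaminates the mean and therefore every entry.
with np.errstate(divide="ignore", invalid="ignore"):
    clr_sparse = clr(sparse)

print(clr_dense)
print(clr_sparse)
```

In practice zeros are often replaced by a small pseudo-count before transforming, but the choice of pseudo-count is arbitrary and can influence the fit, which is part of the motivation for working with proportions directly.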

Purpose of the Study:

To develop a novel relative-shift regression framework for analyzing compositional data directly.
To provide a more interpretable model for understanding the impact of shifting part concentrations on a response variable.
To introduce advanced regularization techniques and efficient algorithms for feature selection and dimension reduction in compositional regression.
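
A minimal sketch of the "shifting concentrations" interpretation, assuming the simplest form such a model could take: a linear regression on the proportions with no intercept. The simulated data, the coefficient values, and the helper `shift_effect` are hypothetical and not taken from the study. Under this assumption, moving an amount `delta` of the composition from part k to part j changes the predicted response by `delta * (beta[j] - beta[k])`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4

# Simulated compositions: each row is non-negative and sums to 1.
X = rng.dirichlet(np.ones(p), size=n)
beta_true = np.array([2.0, 0.5, 0.5, -1.0])       # hypothetical coefficients
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# No intercept column: the rows of X sum to 1, so a constant term is
# already spanned by the predictors.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

def shift_effect(beta, j, k, delta):
    """Predicted change in response when `delta` moves from part k to part j."""
    return delta * (beta[j] - beta[k])

# Move some mass from part 3 to part 0 in the first sample.
x_old = X[0].copy()
delta = min(0.05, x_old[3])        # cannot move more than part 3 holds
x_new = x_old.copy()
x_new[0] += delta
x_new[3] -= delta

change = x_new @ beta_hat - x_old @ beta_hat
print(change, shift_effect(beta_hat, 0, 3, delta))
```

Only differences between coefficients enter the shift effect, so the interpretation never requires holding "all other parts fixed", which is impossible for data that sum to one.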

Main Methods:

Developed a relative-shift regression framework that uses the proportions themselves as predictors, without a log-ratio transformation.
Introduced equi-sparsity and tree-guided regularization methods for feature aggregation and dimension reduction.
Implemented an efficient smoothing proximal gradient algorithm for model estimation.
Derived a unified finite-sample prediction error bound for the proposed regularized estimators.
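
The idea of aggregating features by penalizing coefficient differences can be sketched as follows. This is a simplified stand-in, not the study's algorithm: it assumes the equi-sparsity penalty is a sum of absolute pairwise differences between coefficients, replaces that penalty with a Nesterov-smoothed version controlled by a hypothetical parameter `mu`, and runs plain gradient descent rather than a full smoothing proximal gradient method. All function names and parameter values are illustrative.

```python
import numpy as np
from itertools import combinations

def pairwise_difference_matrix(p):
    """Each row encodes beta[j] - beta[k] for one pair j < k."""
    pairs = list(combinations(range(p), 2))
    C = np.zeros((len(pairs), p))
    for row, (j, k) in enumerate(pairs):
        C[row, j], C[row, k] = 1.0, -1.0
    return C

def smoothed_objective(beta, X, y, C, lam, mu):
    """Squared-error loss plus a smooth surrogate for lam * sum |beta_j - beta_k|."""
    resid = y - X @ beta
    alpha = np.clip(C @ beta / mu, -1.0, 1.0)
    penalty = alpha @ (C @ beta) - 0.5 * mu * (alpha @ alpha)
    return 0.5 * (resid @ resid) / len(y) + lam * penalty

def fit_smoothed_fusion(X, y, lam, mu=1e-2, n_iter=2000):
    n, p = X.shape
    C = pairwise_difference_matrix(p)
    # Lipschitz constant of the gradient of the smoothed objective.
    L = np.linalg.norm(X, 2) ** 2 / n + lam * np.linalg.norm(C, 2) ** 2 / mu
    beta = np.full(p, y.mean())            # start from the fully aggregated fit
    history = [smoothed_objective(beta, X, y, C, lam, mu)]
    for _ in range(n_iter):
        alpha = np.clip(C @ beta / mu, -1.0, 1.0)
        grad = X.T @ (X @ beta - y) / n + lam * (C.T @ alpha)
        beta = beta - grad / L             # fixed step of size 1/L
        history.append(smoothed_objective(beta, X, y, C, lam, mu))
    return beta, np.array(history)

# Simulated example with two groups of parts sharing a coefficient.
rng = np.random.default_rng(1)
X = rng.dirichlet(np.ones(6), size=200)
beta_true = np.array([2.0, 2.0, 2.0, -1.0, -1.0, -1.0])
y = X @ beta_true + 0.5 * rng.standard_normal(200)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_fused, history = fit_smoothed_fusion(X, y, lam=0.5)

print("unpenalized spread:", np.ptp(beta_ols))
print("penalized spread:  ", np.ptp(beta_fused))
```

The value `lam=0.5` is deliberately large so that the shrinkage of coefficient differences is easy to see; smaller values penalize differences less and leave the coefficients further apart. A tree-guided variant would restrict or weight the pairs according to a hierarchy such as a taxonomic tree, which this sketch does not attempt.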

Main Results:

The proposed framework was more interpretable than traditional log-ratio methods.
Simulation studies confirmed the efficacy and robustness of the new regression approach.
Application to a gut microbiome dataset successfully identified key taxa associated with preterm infant neurodevelopment at various taxonomic levels.

Conclusions:

The relative-shift regression framework advances the analysis of compositional data, particularly for high-dimensional and complex biological systems.
The developed regularization methods and algorithm facilitate practical application and reliable inference.
The gut microbiome application links microbiome composition to preterm infant neurodevelopment and illustrates the framework's potential in biomedical research.