It's all relative: Regression analysis with compositional predictors.

Gen Li¹, Yan Li¹, Kun Chen²

  • ¹Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA.

Biometrics | May 26, 2022
Summary
This summary is machine-generated.

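This study introduces a relative-shift regression framework for compositional data that uses the proportions directly as predictors. Compared with log-ratio approaches, it is easier to interpret and better suited to high-dimensional, zero-heavy, hierarchically structured datasets such as gut microbiome profiles. The model describes how shifting concentration between parts affects an outcome, illustrated here with preterm infant neurodevelopment.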

Keywords:
equi-sparsity, feature aggregation, microbiome, relative shift, tree-guided regularization

Area of Science:

  • Statistics
  • Bioinformatics
  • Computational Biology

Background:

  • Compositional data, representing fractions of a whole, are common in various scientific fields.
  • Existing regression methods using log-ratio transformations struggle with high-dimensional data, excessive zeros, and hierarchical structures.
  • These traditional models often lack clear interpretability due to the inherent interrelations within compositional parts.
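The difficulty with excessive zeros can be seen in a few lines. The sketch below uses the centered log-ratio (CLR) transform as one representative log-ratio transformation; the choice of CLR, the function name `clr`, and the toy compositions are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform of one composition (1-D array summing to 1)."""
    # Suppress the warnings NumPy emits for log(0) and inf - inf so the
    # breakdown is visible in the returned values instead.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_x = np.log(x)
        return log_x - log_x.mean()

# Toy compositions (hypothetical values): one strictly positive, one with a zero part.
dense = np.array([0.5, 0.3, 0.2])
sparse = np.array([0.7, 0.3, 0.0])

clr_dense = clr(dense)    # finite values that sum to zero
clr_sparse = clr(sparse)  # no usable values: every entry is inf or nan

print(clr_dense)
print(clr_sparse)
```

A single zero part makes the whole transformed vector unusable, which is why log-ratio pipelines typically need zero replacement or pseudo-counts before any model can be fitted.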

Purpose of the Study:

  • To develop a novel relative-shift regression framework for analyzing compositional data directly.
  • To provide a more interpretable model for understanding the impact of shifting part concentrations on a response variable.
  • To introduce advanced regularization techniques and efficient algorithms for feature selection and dimension reduction in compositional regression.
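The "shifting" interpretation can be made concrete with a short worked equation. This is a sketch that assumes a linear model with the proportions $x_1, \dots, x_p$ as predictors; the symbols $\beta_j$, $\delta$, $e_j$ (the $j$-th unit vector), and $\varepsilon$ are notation introduced here for illustration and may not match the paper's.

```latex
\begin{align*}
y &= \sum_{j=1}^{p} x_j \beta_j + \varepsilon,
  \qquad x_j \ge 0, \quad \sum_{j=1}^{p} x_j = 1, \\
\tilde{x} &= x + \delta\,(e_k - e_j)
  \quad \text{(move concentration } \delta \text{ from part } j \text{ to part } k\text{)}, \\
\mathbb{E}[y \mid \tilde{x}] - \mathbb{E}[y \mid x] &= \delta\,(\beta_k - \beta_j).
\end{align*}
```

Under this assumed form, only differences between coefficients carry meaning, and a separate intercept would be redundant because the proportions sum to one.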

Main Methods:

  • Developed a relative-shift regression framework that utilizes proportions as direct predictors.
  • Introduced equi-sparsity and tree-guided regularization methods for feature aggregation and dimension reduction.
  • Implemented an efficient smoothing proximal gradient algorithm for model estimation.
  • Derived a unified finite-sample prediction error bound for the proposed regularized estimators.
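The ideas in this list can be illustrated on simulated data. The sketch below is not the authors' estimator: it swaps in unpenalized least squares for the regularized fit and does not implement the smoothing proximal gradient algorithm or the tree-guided penalty. The pairwise-difference penalty is one plausible form of an equi-sparsity penalty and is an assumption here, as are all names (`shift_effect`, `pairwise_fusion_penalty`) and the simulated coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated compositions: each row is non-negative and sums to 1.
n, p = 200, 4
X = rng.dirichlet(np.ones(p), size=n)

# Hypothetical true coefficients; parts 0 and 1 share the same value.
beta_true = np.array([1.0, 1.0, -0.5, 2.0])
y = X @ beta_true + rng.normal(scale=0.01, size=n)

# Proportions enter directly as predictors. No intercept column is added,
# because the rows of X already sum to one.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

def shift_effect(beta, j, k, delta):
    """Change in the fitted mean when `delta` of concentration moves from part j to part k."""
    return delta * (beta[k] - beta[j])

def pairwise_fusion_penalty(beta):
    """Sum of |beta_j - beta_k| over all pairs j < k (assumed equi-sparsity penalty)."""
    diffs = beta[:, None] - beta[None, :]
    return np.abs(diffs[np.triu_indices(len(beta), k=1)]).sum()

# Feature aggregation: when two parts share a coefficient, they can be merged
# into one summed part without changing the linear predictor.
X_agg = np.column_stack([X[:, 0] + X[:, 1], X[:, 2], X[:, 3]])
beta_agg = beta_true[[0, 2, 3]]
same_fit = np.allclose(X @ beta_true, X_agg @ beta_agg)

print("estimated coefficients:", np.round(beta_hat, 2))
print("effect of moving 0.05 from part 0 to part 3:", shift_effect(beta_hat, 0, 3, 0.05))
print("aggregated model gives identical predictions:", same_fit)
```

A penalty that drives coefficient differences to exactly zero therefore performs dimension reduction by merging parts, which is the sense in which equal coefficients correspond to feature aggregation in this sketch.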

Main Results:

  • The proposed framework demonstrated superior interpretability compared to traditional log-ratio methods.
  • Simulation studies confirmed the efficacy and robustness of the new regression approach.
  • Application to a gut microbiome dataset successfully identified key taxa associated with preterm infant neurodevelopment at various taxonomic levels.

Conclusions:

  • The relative-shift regression framework offers a significant advancement for analyzing compositional data, particularly in high-dimensional and complex biological systems.
  • The developed regularization methods and algorithm facilitate practical application and reliable inference.
  • This approach provides valuable insights into the relationship between microbiome composition and infant neurodevelopment, highlighting its potential in biomedical research.