Prediction-based variable selection for component-wise gradient boosting.

Sophie Potts¹, Elisabeth Bergherr¹, Constantin Reinke²

¹Chair of Spatial Data Science and Statistical Learning, University of Goettingen, Goettingen, Germany.

The International Journal of Biostatistics

November 24, 2023

Summary

This summary is machine-generated.

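This study introduces prediction-based variable selection mechanisms for model-based component-wise gradient boosting, based on Akaike's Information Criterion (AIC) and on a cross-validated component-wise test error. Compared with existing approaches, the methods improved variable selection in simulation studies and reduced prediction error in a real-world application to COVID-19 incidence rates.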

Keywords:

gradient boosting, high-dimensional data, prediction analysis, sparse models, variable selection

Area of Science:

Statistics
Machine Learning
Computational Statistics

Background:

Model-based component-wise gradient boosting is widely used for data-driven variable selection.
Existing modifications primarily address stopping criteria, not the core variable selection mechanism.
There is a need for improved prediction and selection qualities in gradient boosting algorithms.
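
For orientation, the standard component-wise scheme referred to above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes squared-error loss (the Gaussian special case), centered predictor columns, and one univariate linear base learner per variable. The function name `componentwise_l2_boost` and the parameters `n_iter` and `nu` (step length) are hypothetical names chosen for this sketch.

```python
import numpy as np


def componentwise_l2_boost(X, y, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with univariate linear base learners.

    Assumes the columns of X are centered, so no per-learner intercept
    is needed. Returns the intercept and the coefficient vector.
    """
    n, p = X.shape
    intercept = y.mean()
    coef = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)  # sum of squares of each column

    for _ in range(n_iter):
        # Negative gradient of the squared-error loss = current residuals.
        resid = y - intercept - X @ coef
        # Least-squares slope of each variable fitted alone to the residuals.
        beta = X.T @ resid / col_ss
        # Training residual sum of squares of each candidate base learner.
        rss = ((resid[:, None] - X * beta) ** 2).sum(axis=0)
        # Standard selection rule: pick the best-fitting component.
        j = int(np.argmin(rss))
        # Weak update: move only a fraction nu towards the selected fit.
        coef[j] += nu * beta[j]

    return intercept, coef
```

Variables whose coefficient is still zero after the last iteration were never selected, which is how the procedure performs data-driven variable selection. Note that the selection step uses the fit on the training data only.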

Purpose of the Study:

To investigate and implement novel prediction-based variable selection mechanisms for model-based component-wise gradient boosting.
To evaluate the efficacy of Akaike's Information Criterion (AIC) and cross-validation for variable selection.
To assess the impact of these methods on both variable selection properties and predictive performance.

Main Methods:

Implementation of Akaike's Information Criterion (AIC) for variable selection.
Development and application of a component-wise test error selection rule using cross-validation.
Evaluation using Generalized Linear Models (GLMs) within the gradient boosting framework.
Extensive simulation studies and a real-world data application.
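
One way to read the cross-validation rule listed above is sketched below. The details are assumptions made for illustration, since this summary does not specify them: squared-error loss, centered predictor columns, univariate linear base learners, and K-fold cross-validation repeated in every boosting iteration. The names `cv_error_per_component` and `boost_cv_selection` are hypothetical. The only change from the standard scheme is that the component is chosen by estimated test error rather than by training fit.

```python
import numpy as np


def cv_error_per_component(X, resid, n_folds=5, seed=0):
    """Cross-validated squared error of each univariate fit to resid."""
    n, p = X.shape
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    err = np.zeros(p)
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        X_tr, r_tr = X[train], resid[train]
        # Slope of each variable estimated on the training folds only.
        beta = X_tr.T @ r_tr / (X_tr ** 2).sum(axis=0)
        # Squared prediction error on the held-out fold, per variable.
        held_out = resid[test_idx, None] - X[test_idx] * beta
        err += (held_out ** 2).sum(axis=0)
    return err / n


def boost_cv_selection(X, y, n_iter=100, nu=0.1, n_folds=5, seed=0):
    """Component-wise L2 boosting with test-error-based selection."""
    n, p = X.shape
    intercept = y.mean()
    coef = np.zeros(p)
    col_ss = (X ** 2).sum(axis=0)
    for m in range(n_iter):
        resid = y - intercept - X @ coef
        # Select the component with the lowest estimated test error.
        cv_err = cv_error_per_component(X, resid, n_folds, seed + m)
        j = int(np.argmin(cv_err))
        # Update with the fit on all observations, shrunk by nu.
        coef[j] += nu * (X[:, j] @ resid) / col_ss[j]
    return intercept, coef
```

In this sketch the folds are redrawn in each iteration (`seed + m`); keeping them fixed across iterations would be an equally plausible design. An AIC-based rule would replace `cv_error_per_component` with a criterion that trades off fit against model complexity, but its exact form is not given here and is therefore not sketched.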

Main Results:

The proposed prediction-based methods demonstrated improved variable selection properties compared to existing approaches.
A reduction in prediction error was observed in a real-world application involving COVID-19 incidence rates.
The cross-validation approach showed particular promise in enhancing model performance.

Conclusions:

Prediction-based variable selection mechanisms offer a significant advancement for model-based component-wise gradient boosting.
The integration of AIC and cross-validation can lead to more accurate and parsimonious models.
These methods are relevant wherever boosting is used for data-driven variable selection and prediction, as the COVID-19 incidence application illustrates.