



Prediction-based variable selection for component-wise gradient boosting.

Sophie Potts¹, Elisabeth Bergherr¹, Constantin Reinke²

  • ¹ Chair of Spatial Data Science and Statistical Learning, University of Goettingen, Goettingen, Germany.

The International Journal of Biostatistics | November 24, 2023
Summary
This summary is machine-generated.

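This study introduces prediction-based variable selection mechanisms for model-based component-wise gradient boosting, built on Akaike's Information Criterion (AIC) and on a cross-validated component-wise test error. In simulation studies and an application to COVID-19 incidence rates, the new selection rules improved variable selection and reduced prediction error relative to existing approaches.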

Keywords:
gradient boosting; high-dimensional data; prediction analysis; sparse models; variable selection


Area of Science:

  • Statistics
  • Machine Learning
  • Computational Statistics

Background:

  • Model-based component-wise gradient boosting is widely used for data-driven variable selection.
  • Existing modifications primarily address stopping criteria, not the core variable selection mechanism.
  • There is a need for improved prediction and selection qualities in gradient boosting algorithms.
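
As background, the standard selection step in component-wise gradient boosting can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes squared-error loss, one simple linear base-learner per covariate, and a fixed number of iterations. The function names `componentwise_l2_boost` and `simulate`, the step length `nu`, and the simulated data are all hypothetical.

```python
import random


def componentwise_l2_boost(X, y, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with one linear base-learner per column.

    X is a list of rows (lists of floats), y a list of floats.
    Returns (intercept, coefficients, selection_path).
    """
    n, p = len(X), len(X[0])
    # Center each column so the base-learners need no intercept of their own.
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    offset = sum(y) / n
    coef = [0.0] * p
    fit = [offset] * n
    path = []
    for _ in range(n_iter):
        # For squared-error loss the negative gradient is the residual vector.
        resid = [yi - fi for yi, fi in zip(y, fit)]
        best = None  # (training RSS, column index, slope)
        for j, xj in enumerate(cols):
            sxx = sum(v * v for v in xj)
            if sxx == 0.0:
                continue  # constant column carries no information
            beta = sum(v * r for v, r in zip(xj, resid)) / sxx
            rss = sum((r - beta * v) ** 2 for v, r in zip(xj, resid))
            # Classical rule: pick the component with the best TRAINING fit.
            if best is None or rss < best[0]:
                best = (rss, j, beta)
        if best is None:
            break
        _, j, beta = best
        # Update only the selected component, shrunk by the step length nu.
        coef[j] += nu * beta
        fit = [f + nu * beta * v for f, v in zip(fit, cols[j])]
        path.append(j)
    intercept = offset - sum(c * m for c, m in zip(coef, means))
    return intercept, coef, path


def simulate(n=200, p=10, seed=1):
    """Hypothetical data: only columns 0 and 3 influence the response."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.5) for row in X]
    return X, y


X, y = simulate()
intercept, coef, path = componentwise_l2_boost(X, y)
print("selected columns:", sorted(set(path)))
print("coefficients:", [round(c, 2) for c in coef])
```

Because only one component is updated per iteration, covariates that are never selected keep a coefficient of exactly zero, which is what makes the procedure a variable selection method. The selection rule in this sketch relies on training error alone.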

Purpose of the Study:

  • To investigate and implement novel prediction-based variable selection mechanisms for model-based component-wise gradient boosting.
  • To evaluate the efficacy of Akaike's Information Criterion (AIC) and cross-validation for variable selection.
  • To assess the impact of these methods on both variable selection properties and predictive performance.
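
For reference, the two criteria named above have standard textbook definitions, given here as a sketch. The notation is introduced for illustration only: $k$ is the number of estimated parameters (or effective degrees of freedom), $\hat{L}$ the maximized likelihood, $F_1, \dots, F_K$ the cross-validation folds, $\ell$ a loss function, and $\hat{f}^{(-\kappa)}$ the model fitted without fold $F_\kappa$. How the study adapts these quantities to individual boosting updates is not stated in this summary.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}
\qquad
\mathrm{CV}_K = \frac{1}{n}\sum_{\kappa=1}^{K}\,\sum_{i \in F_\kappa}
  \ell\bigl(y_i,\ \hat{f}^{(-\kappa)}(x_i)\bigr)
```

Both criteria estimate out-of-sample performance: AIC by penalizing the in-sample fit for model complexity, cross-validation by evaluating the loss on held-out observations.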

Main Methods:

  • Implementation of Akaike's Information Criterion (AIC) for variable selection.
  • Development and application of a component-wise test error selection rule using cross-validation.
  • Evaluation using Generalized Linear Models (GLMs) within the gradient boosting framework.
  • Extensive simulation studies and a real-world data application.
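
The idea of a component-wise test error selection rule can be illustrated with a small sketch. It is an assumption-laden illustration rather than the authors' algorithm: it uses squared-error loss, simple linear base-learners, and a fixed K-fold split of the current residuals. All names (`slope`, `heldout_error`, `boost_with_cv_selection`), the parameters `nu`, `k`, and `seed`, and the simulated data are hypothetical.

```python
import random


def slope(x, r):
    """Least-squares slope of r on x (no intercept; x is assumed centered)."""
    sxx = sum(v * v for v in x)
    return 0.0 if sxx == 0.0 else sum(v * t for v, t in zip(x, r)) / sxx


def heldout_error(x, r, folds):
    """Summed held-out squared error of a one-variable fit across the folds."""
    total = 0.0
    for test_idx in folds:
        held = set(test_idx)
        train_idx = [i for i in range(len(x)) if i not in held]
        beta = slope([x[i] for i in train_idx], [r[i] for i in train_idx])
        total += sum((r[i] - beta * x[i]) ** 2 for i in test_idx)
    return total


def boost_with_cv_selection(X, y, n_iter=60, nu=0.1, k=5, seed=0):
    """Component-wise L2 boosting whose selection step uses held-out error."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    # One fixed random partition of the observations into k folds.
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    offset = sum(y) / n
    coef = [0.0] * p
    fit = [offset] * n
    path = []
    for _ in range(n_iter):
        resid = [yi - fi for yi, fi in zip(y, fit)]
        # Prediction-based rule: score every component by its held-out error.
        # Simplification: the residuals come from the full-data fit, so the
        # folds are not fully independent of earlier iterations.
        errors = [heldout_error(cols[j], resid, folds) for j in range(p)]
        j = min(range(p), key=lambda c: errors[c])
        # Refit the winning component on all observations, then update.
        beta = slope(cols[j], resid)
        coef[j] += nu * beta
        fit = [f + nu * beta * v for f, v in zip(fit, cols[j])]
        path.append(j)
    intercept = offset - sum(c * m for c, m in zip(coef, means))
    return intercept, coef, path


# Hypothetical data: only columns 0 and 3 influence the response.
rng = random.Random(7)
X = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(150)]
y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0.0, 0.5) for row in X]

intercept, coef, path = boost_with_cv_selection(X, y)
print("selected columns:", sorted(set(path)))
print("coefficients:", [round(c, 2) for c in coef])
```

The only change from the classical algorithm is the score used to pick the component: held-out error instead of training error. The update itself still uses the slope fitted on all observations.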

Main Results:

  • The proposed prediction-based methods demonstrated improved variable selection properties compared to existing approaches.
  • A reduction in prediction error was observed in a real-world application involving COVID-19 incidence rates.
  • The cross-validation approach showed particular promise in enhancing model performance.

Conclusions:

  • Prediction-based variable selection mechanisms offer a significant advancement for model-based component-wise gradient boosting.
  • The integration of AIC and cross-validation can lead to more accurate and parsimonious models.
  • These enhanced methods have practical implications for improving statistical modeling and prediction in various fields.