



A Random Forest Approach for Bounded Outcome Variables.

Leonie Weinhold1, Matthias Schmid1, Richard Mitchell2

  • 1Department of Medical Biometry, Informatics and Epidemiology, University of Bonn, Bonn, Germany.

Journal of Computational and Graphical Statistics : a Joint Publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America
June 14, 2021
Summary
This summary is machine-generated.

This study introduces a random forest method for bounded outcome variables that uses the likelihood of the beta distribution for split selection. The approach accounts for heteroscedasticity and outperformed random forests based on mean squared error loss, as well as parametric models, in simulations and in an application to U.S. National Lakes Assessment Survey data.

Keywords:
Beta distribution; Bounded outcome variables; Random forests; Regression modeling


Area of Science:

  • Statistics
  • Machine Learning
  • Environmental Science

Background:

  • Random forests are effective for high-dimensional data but struggle with bounded outcomes and heteroscedasticity.
  • Classical regression models based on mean squared error loss are inadequate for data restricted to the unit interval.
  • Existing methods do not effectively handle heteroscedasticity in bounded outcome variables.
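
The link between bounded outcomes and heteroscedasticity can be made concrete with the beta density. The block below is a sketch that assumes the mean-precision parametrization common in beta regression, with mean \mu and precision \phi; the summary does not state which parametrization the authors use, so the symbols are assumptions made for illustration.

```latex
f(y \mid \mu, \phi)
  = \frac{\Gamma(\phi)}{\Gamma(\mu\phi)\,\Gamma\bigl((1-\mu)\phi\bigr)}\,
    y^{\mu\phi - 1}\,(1 - y)^{(1-\mu)\phi - 1},
  \qquad 0 < y < 1,

\mathbb{E}[Y] = \mu,
  \qquad
\operatorname{Var}(Y) = \frac{\mu(1-\mu)}{1+\phi}.
```

Under this parametrization the variance is a function of the mean and shrinks as \mu approaches 0 or 1, so a squared-error criterion, which implicitly treats the variance as constant, is poorly matched to data on the unit interval.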

Purpose of the Study:

  • To develop a random forest approach tailored for beta-distributed outcome variables.
  • To address the limitations of traditional methods in modeling bounded outcomes with heteroscedasticity.
  • To improve the accuracy and applicability of random forests for outcomes restricted to the unit interval.

Main Methods:

  • Proposed a novel random forest algorithm utilizing the beta distribution's likelihood function.
  • Split selection in the tree-building process maximizes the log-likelihood of the beta distribution.
  • Incorporated parameter estimates derived from tree nodes into the likelihood maximization.
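
The split-selection idea in the bullets above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`beta_loglik`, `node_params`, `node_loglik`, `best_split`), the mean-precision parametrization, the method-of-moments node estimates, and the `min_leaf` parameter are all hypothetical choices made for this example. The summary does not say how node parameters are estimated.

```python
import math


def beta_loglik(y, mu, phi):
    """Log-likelihood of values y in (0, 1) under Beta(mu*phi, (1-mu)*phi)."""
    a = mu * phi
    b = (1.0 - mu) * phi
    # Log of the normalizing constant Gamma(a+b) / (Gamma(a) * Gamma(b)), with a+b = phi.
    const = math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
    return sum(
        const + (a - 1.0) * math.log(v) + (b - 1.0) * math.log(1.0 - v)
        for v in y
    )


def node_params(y):
    """Method-of-moments estimates of mean and precision (an assumption of this sketch)."""
    n = len(y)
    mu = sum(y) / n
    var = max(sum((v - mu) ** 2 for v in y) / n, 1e-8)  # guard against zero variance
    # Var(Y) = mu(1-mu)/(1+phi)  =>  phi = mu(1-mu)/var - 1
    phi = max(mu * (1.0 - mu) / var - 1.0, 1e-3)
    return mu, phi


def node_loglik(y):
    """Beta log-likelihood of a node evaluated at its own parameter estimates."""
    mu, phi = node_params(y)
    return beta_loglik(y, mu, phi)


def best_split(x, y, min_leaf=3):
    """Return (threshold, loglik) maximizing the summed log-likelihood of the two children."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    best_threshold, best_ll = None, -math.inf
    for k in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[k - 1] == xs[k]:
            continue  # no threshold separates tied feature values
        ll = node_loglik(ys[:k]) + node_loglik(ys[k:])
        if ll > best_ll:
            best_threshold, best_ll = (xs[k - 1] + xs[k]) / 2.0, ll
    return best_threshold, best_ll


if __name__ == "__main__":
    x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    y = [0.18, 0.22, 0.20, 0.21, 0.79, 0.82, 0.80, 0.78]
    print(best_split(x, y, min_leaf=2))
```

For a single feature, `best_split(x, y)` returns a candidate threshold and the summed log-likelihood of the two child nodes. A full forest would repeat this search over bootstrap samples and random feature subsets, which is omitted here.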

Main Results:

  • The proposed beta regression random forest outperformed standard random forests based on mean squared error loss, and it also outperformed parametric models.
  • These results were obtained in simulation studies and in an application to U.S. National Lakes Assessment Survey data.

Conclusions:

  • The beta regression random forest is a valuable tool for analyzing bounded outcome variables.
  • The method effectively handles heteroscedasticity, offering improved modeling capabilities.
  • This approach provides a robust alternative to existing methods for specific ecological and environmental data analysis.