Unsupervised Tree Boosting for Learning Probability Distributions.

Naoki Awaya¹, Li Ma²

¹School of Political Science and Economics, Waseda University, Shinjuku City, Tokyo 169-8050, Japan.

Journal of Machine Learning Research : JMLR

June 11, 2026

Summary

This summary is machine-generated.

We developed an unsupervised tree boosting algorithm to infer sampling distributions. This method uses novel distribution addition and residualization, achieving competitive performance in density estimation.

Keywords:

additive models, density estimation, ensemble methods, generative models, normalizing flows, recursive partitioning

Area of Science:

Machine Learning
Statistical Modeling
Probability Theory

Background:

Supervised tree boosting is a powerful ensemble method.
Density estimation is crucial for understanding data distributions.
Existing methods for unsupervised density estimation have limitations.

Purpose of the Study:

To propose a novel unsupervised tree boosting algorithm for inferring sampling distributions.
To introduce new concepts of distribution addition and residualization.
To enable accurate multivariate density estimation without labeled data.

Main Methods:

Developed an unsupervised tree boosting algorithm using additive tree ensembles.
Introduced a new definition of multivariate cumulative distribution function (CDF) to enable distribution addition and residualization.
Employed forward-stagewise fitting to minimize Kullback-Leibler divergence.
Incorporated scale-dependent shrinkage and a two-stage marginal-copula fitting strategy.
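
The methods above can be written out for the univariate case. This is a sketch under stated assumptions, not the paper's exact formulation: it assumes that adding distributions corresponds to composing their CDFs and that residualizing an observation means applying a CDF to it. The symbols $f^{*}$ (true density), $G_k$ and $g_k$ (CDF and density of the $k$-th tree-based learner), and $r^{(k)}$ (residual after $k$ learners) are notation introduced here for illustration.

```latex
\begin{align*}
\mathrm{KL}\bigl(f^{*} \,\|\, f\bigr)
  &= \mathbb{E}_{f^{*}}\bigl[\log f^{*}(X)\bigr] - \mathbb{E}_{f^{*}}\bigl[\log f(X)\bigr], \\
F_K &= G_K \circ G_{K-1} \circ \cdots \circ G_1, \\
r^{(0)} &= x, \qquad r^{(k)} = G_k\bigl(r^{(k-1)}\bigr), \quad k = 1, \dots, K, \\
\log f_K(x) &= \sum_{k=1}^{K} \log g_k\bigl(r^{(k-1)}\bigr).
\end{align*}
```

The first term of the KL divergence does not depend on $f$, so minimizing it is equivalent to maximizing the expected log-density, which is estimated by the average log-likelihood of the sample. The last line follows from the chain rule applied to $F_K$. Under these assumptions, a forward-stagewise step keeps $G_1, \dots, G_{k-1}$ fixed and chooses $g_k$ to increase $\sum_i \log g_k\bigl(r_i^{(k-1)}\bigr)$ over the current residuals.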

Main Results:

The algorithm successfully infers underlying sampling distributions from i.i.d. samples.
Demonstrated the effectiveness of distribution addition and residualization for univariate and multivariate settings.
Achieved competitive performance against state-of-the-art deep learning methods in multivariate density estimation.
The algorithm provides analytic density evaluation and a generative model.
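
The following minimal Python sketch illustrates how an additive ensemble of this kind can give both an analytic density and a sampler in one dimension. It is not the authors' implementation. It assumes data on [0, 1], uses fixed-width histograms as stand-ins for tree learners, applies a constant shrinkage toward the uniform distribution instead of the scale-dependent shrinkage, and does not attempt the multivariate CDF or the marginal-copula stage. All function names and parameters (`fit_learner`, `boost`, `n_bins`, `shrink`, and so on) are hypothetical.

```python
import numpy as np

def fit_learner(r, n_bins=4, shrink=0.2):
    """Histogram density on [0, 1], with bin masses shrunk toward uniform."""
    counts, edges = np.histogram(r, bins=n_bins, range=(0.0, 1.0))
    masses = counts / counts.sum()
    masses = (1.0 - shrink) / n_bins + shrink * masses  # every bin keeps positive mass
    return edges, masses

def _bin_index(edges, r):
    return np.clip(np.searchsorted(edges, r, side="right") - 1, 0, len(edges) - 2)

def pdf(learner, r):
    edges, masses = learner
    idx = _bin_index(edges, r)
    return masses[idx] / np.diff(edges)[idx]

def cdf(learner, r):
    """Piecewise-linear CDF of the histogram density."""
    edges, masses = learner
    idx = _bin_index(edges, r)
    cum = np.concatenate(([0.0], np.cumsum(masses)))
    out = cum[idx] + masses[idx] * (r - edges[idx]) / np.diff(edges)[idx]
    return np.clip(out, 0.0, 1.0)

def inv_cdf(learner, u):
    """Inverse of the piecewise-linear CDF (well defined because masses > 0)."""
    edges, masses = learner
    cum = np.concatenate(([0.0], np.cumsum(masses)))
    idx = np.clip(np.searchsorted(cum, u, side="right") - 1, 0, len(masses) - 1)
    out = edges[idx] + (u - cum[idx]) / masses[idx] * np.diff(edges)[idx]
    return np.clip(out, 0.0, 1.0)

def boost(x, n_rounds=30, n_bins=4, shrink=0.2):
    """Forward-stagewise fitting: fit a learner to the residuals, then residualize."""
    learners, r = [], np.asarray(x, dtype=float)
    for _ in range(n_rounds):
        learner = fit_learner(r, n_bins, shrink)
        learners.append(learner)
        r = cdf(learner, r)  # assumed residualization step: apply the fitted CDF
    return learners

def log_density(learners, x):
    """Analytic log-density: sum of learner log-densities along the residual path."""
    r = np.asarray(x, dtype=float)
    total = np.zeros_like(r)
    for learner in learners:
        total += np.log(pdf(learner, r))
        r = cdf(learner, r)
    return total

def sample(learners, n, rng):
    """Generative model: push uniform draws back through the inverse CDFs."""
    x = rng.uniform(size=n)
    for learner in reversed(learners):
        x = inv_cdf(learner, x)
    return x

# Usage on synthetic data (the Beta(2, 5) target is an arbitrary choice).
rng = np.random.default_rng(0)
x = rng.beta(2.0, 5.0, size=2000)
learners = boost(x)
train_loglik = log_density(learners, x).mean()
draws = sample(learners, 1000, rng)
```

Because each learner is shrunk toward the uniform distribution, a single round changes the fit only slightly, and the ensemble builds up structure over many rounds. The same list of learners serves both for density evaluation (forward pass) and for sampling (reverse pass), which mirrors the connection to normalizing flows named in the keywords.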

Conclusions:

The proposed unsupervised tree boosting algorithm offers a novel approach to density estimation.
The new multivariate CDF definition facilitates advanced distributional operations.
This method shows promise for complex data analysis and generative modeling.