Search research articles

ABOUT JoVE

Overview Leadership Blog JoVE Help Center

AUTHORS

Publishing Process Editorial Board Scope & Policies Peer Review FAQ Submit

LIBRARIANS

Testimonials Subscriptions Access Resources Library Advisory Board FAQ

RESEARCH

JoVE Journal Methods Collections JoVE Encyclopedia of Experiments Archive

EDUCATION

JoVE Core JoVE Business JoVE Science Education JoVE Lab Manual Faculty Resource Center Faculty Site

Terms & Conditions of Use

Related Concept Videos

Survival Tree

Survival Tree

Survival trees are a non-parametric method used in survival analysis to model the relationship between a set of covariates and the time until an event of interest occurs, often referred to as the "time-to-event" or "survival time." This method is particularly useful when dealing with censored data, where the event has not occurred for some individuals by the end of the study period, or when the exact time of the event is unknown.
Building a Survival Tree
Constructing a...

Cluster Sampling Method

Cluster Sampling Method

Appropriate sampling methods ensure that samples are drawn without bias and accurately represent the population. Because measuring the entire population in a study is not practical, researchers use samples to represent the population of interest.
To choose a cluster sample, divide the population into clusters (groups) and then randomly select some of the clusters. All the members from these clusters are in the cluster sample. For example, if you randomly sample four departments from your...

One-Compartment Open Model: Wagner-Nelson and Loo Riegelman Method for ka Estimation

One-Compartment Open Model: Wagner-Nelson and Loo Riegelman Method for k_a Estimation

This lesson introduces two critical methods in pharmacokinetics, the Wagner-Nelson and Loo-Riegelman methods, used for estimating the absorption rate constant (ka) for drugs administered via non-intravenous routes. The Wagner-Nelson method relates ka to the plasma concentration derived from the slope of a semilog percent unabsorbed time plot. However, it is limited to drugs with one-compartment kinetics and can be impacted by factors like gastrointestinal motility or enzymatic degradation.
On...

Probability Histograms

Probability Histograms

A probability histogram is a visual representation of a probability distribution. Similar a typical histogram, the probability histogram consists of contiguous (adjoining) boxes. It has both a horizontal axis and a vertical axis. The horizontal axis is labeled with what the data represents. The vertical axis is labeled with probability. Each rectangular bar in the histogram is 1 unit wide, which suggests that the area under each bar equals the probability, P(x), where x is 1, 2, 3, and so on.

Multiple Regression

Multiple Regression

Multiple regression assesses a linear relationship between one response or dependent variable and two or more independent variables. It has many practical applications.
Farmers can use multiple regression to determine the crop yield based on more than one factor, such as water availability, fertilizer, soil properties, etc. Here, the crop yield is the response or dependent variable as it depends on the other independent variables. The analysis requires the construction of a scatter plot...

Distributions to Estimate Population Parameter

Distributions to Estimate Population Parameter

The accurate values of population parameters such as population proportion, population mean, and population standard deviation (or variance) are usually unknown. These are fixed values that can only be estimated from the data collected from the samples. The estimates of each of these parameters are sample proportion, the sample mean, and sample standard deviation (or variance). To obtain the values of these sample statistics, data are required that have particular distribution and central...

You might also read

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Sort by

Same author

MulticlusterKDE: a new algorithm for clustering based on multivariate kernel density estimation.

Journal of applied statistics·2022

Same journal

Elastic functional Cox regression model with shape predictors.

Journal of applied statistics·2026

Same journal

An improved two-stage binary relevance method for multilabel classification.

Journal of applied statistics·2026

Same journal

Classification of multivariate functional data with an application to ADHD fMRI data.

Journal of applied statistics·2026

Same journal

Assessing the performance of longitudinal T-lymphocytes as biomarkers of immune recovery in HIV-infected children with or without TB co-infection.

Journal of applied statistics·2026

Same journal

Sparse long-only Markowitz portfolio optimization.

Journal of applied statistics·2026

Same journal

Homogeneity of multinomial populations when data are classified into a large number of groups.

Journal of applied statistics·2026

See all related articles

Search research articles

Related Experiment Video

Updated: Jul 2, 2025

Large-scale Reconstructions and Independent, Unbiased Clustering Based on Morphological Metrics to Classify Neurons in Selective Populations

Large-scale Reconstructions and Independent, Unbiased Clustering Based on Morphological Metrics to Classify Neurons in Selective Populations

Published on: February 15, 2017

TreeKDE: clustering multivariate data based on decision tree and using one-dimensional kernel density estimation.

D Scaldelai¹, L C Matioli², S R Santos¹

¹Colegiado de Matemática, Universidade Estadual do Paraná-UNESPAR, Campo Mourão, Brazil.

Journal of Applied Statistics

|February 28, 2024

Summary

This summary is machine-generated.

We introduce TreeKDE, a novel algorithm for multidimensional data clustering. This method efficiently identifies data clusters and their boundaries using decision trees and kernel density estimation.

Keywords:

Decision tree Gaussian kernel clustering data kernel density estimation optimization method

More Related Videos

Trajectory Data Analyses for Pedestrian Space-time Activity Study

Trajectory Data Analyses for Pedestrian Space-time Activity Study

Published on: February 25, 2013

Basics of Multivariate Analysis in Neuroimaging Data

Basics of Multivariate Analysis in Neuroimaging Data

Published on: July 24, 2010

Related Experiment Videos

Last Updated: Jul 2, 2025

Large-scale Reconstructions and Independent, Unbiased Clustering Based on Morphological Metrics to Classify Neurons in Selective Populations

Large-scale Reconstructions and Independent, Unbiased Clustering Based on Morphological Metrics to Classify Neurons in Selective Populations

Published on: February 15, 2017

Trajectory Data Analyses for Pedestrian Space-time Activity Study

Trajectory Data Analyses for Pedestrian Space-time Activity Study

Published on: February 25, 2013

Basics of Multivariate Analysis in Neuroimaging Data

Basics of Multivariate Analysis in Neuroimaging Data

Published on: July 24, 2010

Area of Science:

Computer Science
Data Science
Machine Learning

Background:

Clustering multidimensional data is a fundamental task in data analysis.
Existing algorithms may struggle with automatic cluster number determination and defining cluster boundaries.
There is a need for efficient and robust clustering methods.

Purpose of the Study:

To present a new algorithm, TreeKDE, for clustering multidimensional data.
To demonstrate the algorithm's capability for automatic cluster number determination.
To showcase its efficiency and competitiveness against existing methods.

Main Methods:

The TreeKDE algorithm utilizes a decision tree structure.
It optimizes a one-dimensional kernel density estimator (KDE).
Orthogonal projections of data onto coordinate axes are employed.

Main Results:

TreeKDE automatically determines the number of clusters.
It effectively defines cluster boundaries within rectangular regions.
Comparative experiments show TreeKDE is efficient and competitive.

Conclusions:

TreeKDE offers a simple and efficient approach to data clustering.
The algorithm shows promise for further research and development.
It provides a foundation for new clustering algorithms combining decision trees and KDE.