Clustering compositional data using Dirichlet mixture model.

Samyajoy Pal¹, Christian Heumann¹

¹Department of Statistics, LMU Munich, Munich, Bayern, Germany.

May 18, 2022

Summary

This summary is machine-generated.

This study introduces a novel Dirichlet mixture model for compositional data analysis, avoiding data transformations. The method effectively clusters complex datasets, outperforming existing techniques in simulations and real-world applications.

Area of Science:

Statistics
Data Mining
Machine Learning

Background:

Compositional data analysis (CoDa) often requires data transformations for standard clustering methods.
Existing clustering algorithms may struggle with the unique constraints of compositional data, such as the unit sum property.
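
As a rough illustration of the unit sum property mentioned above, the sketch below rescales a vector of positive parts so that it sums to 1. The `closure` helper and the spending figures are hypothetical and are not taken from the study; they only show that a composition keeps relative information (shares) and discards the overall scale.

```python
def closure(parts):
    """Rescale positive parts so they sum to 1 (the unit sum constraint)."""
    total = sum(parts)
    if total <= 0:
        raise ValueError("parts must have a positive total")
    return [p / total for p in parts]


# Hypothetical spending by category, in two different units of account.
shares_a = closure([200.0, 300.0, 500.0])
shares_b = closure([2.0, 3.0, 5.0])

# Both inputs describe the same composition: only the ratios survive.
print(shares_a)
print(shares_b)
```

Because the parts must sum to 1, raising one share necessarily lowers at least one other, which is one reason distance-based methods designed for unconstrained data can behave poorly on compositions.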

Purpose of the Study:

To propose and evaluate a model-based clustering method specifically designed for compositional data.
To address the limitations of existing methods by directly handling the unit sum constraint without transformations.

Main Methods:

Development of a mixture model utilizing the Dirichlet distribution to accommodate compositional data.
Implementation of a modified hard Expectation-Maximization (EM) algorithm to prevent empty clusters and ensure convergence.
Rigorous simulation studies across various dimensions, cluster numbers, and overlap levels.
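
The general idea of hard EM for a Dirichlet mixture can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the summary does not describe their modification, so the empty-cluster repair here (moving the best-fitting point from a larger cluster) is an assumed stand-in, the parameters are fitted by the method of moments rather than maximum likelihood, and the farthest-point seeding is a simple heuristic. All function names are hypothetical.

```python
import math
import random


def dirichlet_logpdf(x, alpha):
    """Log-density of Dirichlet(alpha) at x (positive parts summing to 1)."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(xi) for a, xi in zip(alpha, x))


def fit_alpha_moments(points):
    """Method-of-moments estimate of alpha (assumed stand-in for the MLE)."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    total_var = sum((p[j] - mean[j]) ** 2 for p in points for j in range(d)) / n
    total_var = max(total_var, 1e-9)  # guard for single-point clusters
    # Uses sum_j Var(X_j) = sum_j m_j (1 - m_j) / (alpha_0 + 1).
    precision = max(sum(m * (1.0 - m) for m in mean) / total_var - 1.0, 1e-3)
    return [m * precision for m in mean]


def repair_empty(labels, fit, k):
    """Assumed repair: give each empty cluster its best-fitting point,
    taken from a cluster that has more than one member."""
    counts = [labels.count(c) for c in range(k)]
    for c in range(k):
        if counts[c] == 0:
            donors = [i for i in range(len(labels)) if counts[labels[i]] > 1]
            i = max(donors, key=lambda idx: fit[idx][c])
            counts[labels[i]] -= 1
            labels[i] = c
            counts[c] = 1
    return labels


def hard_em(data, k, max_iter=100):
    """Hard EM: alternate hard assignments and per-cluster refits."""
    n = len(data)

    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Farthest-point seeding, then nearest-seed assignment (a heuristic).
    seeds = [data[0]]
    while len(seeds) < k:
        seeds.append(max(data, key=lambda p: min(dist2(p, s) for s in seeds)))
    fit = [[-dist2(p, s) for s in seeds] for p in data]
    labels = repair_empty([max(range(k), key=r.__getitem__) for r in fit], fit, k)

    for _ in range(max_iter):
        # M-step: refit each cluster from its currently assigned points.
        groups = [[data[i] for i in range(n) if labels[i] == c] for c in range(k)]
        alphas = [fit_alpha_moments(g) for g in groups]
        log_w = [math.log(len(g) / n) for g in groups]
        # Hard E-step: each point goes to its highest-scoring cluster.
        fit = [[log_w[c] + dirichlet_logpdf(x, alphas[c]) for c in range(k)]
               for x in data]
        new = repair_empty([max(range(k), key=r.__getitem__) for r in fit], fit, k)
        if new == labels:  # assignments stable: stop
            break
        labels = new
    return labels, alphas


def sample_dirichlet(alpha, rng):
    """Draw one composition by normalising independent gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]


# Synthetic demo with two well-separated clusters (hypothetical parameters).
rng = random.Random(0)
data = ([sample_dirichlet([30, 5, 5], rng) for _ in range(50)]
        + [sample_dirichlet([5, 5, 30], rng) for _ in range(50)])
labels, alphas = hard_em(data, k=2)
```

The sketch works directly on the compositions, with no log-ratio or other transformation applied before clustering. It assumes strictly positive parts, since the log-density is undefined when a part equals zero.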

Main Results:

The proposed Dirichlet mixture model demonstrated robust performance in clustering simulated compositional data.
Comparative analysis showed that the new method outperformed popular algorithms such as k-means, Gaussian Mixture Models (GMM), and Partitioning Around Medoids (PAM).
The method was successfully applied to real-world datasets from business/marketing and the physical sciences, highlighting its practical utility.

Conclusions:

The Dirichlet mixture model offers a powerful and effective approach for clustering compositional data.
The modified hard EM algorithm successfully overcomes convergence issues, making the method reliable.
This approach provides a valuable alternative for analyzing compositional data without prior transformations.