Variable selection for clustering with Gaussian mixture models.

Cathy Maugis¹, Gilles Celeux, Marie-Laure Martin-Magniette

  • ¹ Department of Mathematics, University Paris-Sud 11, Orsay, France. Cathy.Maugis@math.u-psud.fr

Biometrics | February 13, 2009 | PubMed
Summary
This summary is machine-generated.

This study introduces a method for variable selection in model-based cluster analysis. It proposes a generalized model that specifies the role of each variable and compares candidate models with the Bayesian information criterion (BIC), using stepwise selection for both the clustering and the linear regression parts.

Area of Science:

  • Statistics
  • Machine Learning
  • Bioinformatics

Background:

  • Variable selection is crucial for effective cluster analysis, particularly in model-based approaches.
  • Existing methods often require prior assumptions about how discarded variables relate to the selected ones, limiting their applicability.
  • Identifying the role of each variable enhances the interpretability and performance of clustering models.

Purpose of the Study:

  • To propose a generalized model for variable selection in model-based cluster analysis.
  • To develop a method that requires no prior assumptions about the linear link between selected and discarded variables.
  • To provide a statistically sound procedure for determining variable roles in clustering.
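One way to picture the question of a variable's role is to ask whether a candidate variable is better described with or without a linear link to another variable. The sketch below is only an illustration under simplifying assumptions, not the authors' procedure: it uses a single predictor, toy deterministic data, and the convention BIC = 2 log L − (number of parameters) × log n, where larger is better. The function names are hypothetical.

```python
import math

def gaussian_bic(rss, n, n_params):
    """BIC = 2*logL - n_params*log(n) for Gaussian errors with ML variance rss/n."""
    sigma2 = rss / n
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return 2 * loglik - n_params * math.log(n)

def bic_without_predictor(y):
    """Model y ~ constant: parameters are the mean and the variance."""
    n = len(y)
    mean = sum(y) / n
    rss = sum((v - mean) ** 2 for v in y)
    return gaussian_bic(rss, n, 2)

def bic_with_predictor(x, y):
    """Model y ~ intercept + slope*x: intercept, slope and variance."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return gaussian_bic(rss, n, 3)

# Toy data (hypothetical): x plays the part of an already selected variable.
x = [float(i) for i in range(20)]
wiggle = [0.5 if i % 2 == 0 else -0.5 for i in range(20)]
y_linked = [2 * a + 1 + e for a, e in zip(x, wiggle)]  # depends linearly on x
y_unlinked = [2 * e for e in wiggle]                   # no real link to x

linked_keeps_x = bic_with_predictor(x, y_linked) > bic_without_predictor(y_linked)
unlinked_keeps_x = bic_with_predictor(x, y_unlinked) > bic_without_predictor(y_unlinked)
print("keep x for y_linked:", linked_keeps_x)
print("keep x for y_unlinked:", unlinked_keeps_x)
```

The extra slope parameter is only kept when the gain in fit outweighs the log n penalty, so the link is decided from the data rather than fixed in advance.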

Main Methods:

  • A generalized model is proposed, extending previous work by Raftery and Dean.
  • Bayesian Information Criterion (BIC) is employed for model comparison and selection.
  • An algorithm embeds two backward stepwise procedures, one for variable selection in clustering and one for linear regression, to determine each variable's role.
  • Model identifiability is established, and consistency of the resulting criterion is proved under regularity conditions.
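As a rough illustration of BIC-based comparison of Gaussian mixture models, the sketch below fits one-component and two-component mixtures to a single variable with a basic EM loop and keeps the model with the larger BIC. It is a minimal sketch under stated assumptions, not the paper's algorithm: the data are a toy deterministic sample, the helper names are hypothetical, the initialisation is a simple quantile rule, and the convention BIC = 2 log L − (number of parameters) × log n (larger is better) is assumed.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def component_logs(x, weights, means, variances):
    """log(weight * density) of each mixture component at point x."""
    return [math.log(w) + log_normal_pdf(x, m, v)
            for w, m, v in zip(weights, means, variances)]

def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def fit_gmm_loglik(data, k, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture by EM; return the final log-likelihood."""
    n = len(data)
    ordered = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared overall variance.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    variances = [sum((x - centre) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            logs = component_logs(x, weights, means, variances)
            norm = log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: update weights, means and variances (with a small variance floor).
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, 1e-6)
    return sum(log_sum_exp(component_logs(x, weights, means, variances)) for x in data)

def bic(loglik, n_params, n):
    """BIC in the 'larger is better' convention (an assumption of this sketch)."""
    return 2 * loglik - n_params * math.log(n)

# Toy data (hypothetical): two well-separated groups around 0 and 10.
offsets = [i * 0.1 for i in range(-10, 11)]
data = offsets + [10 + d for d in offsets]
n = len(data)

# A k-component 1-D mixture has (k - 1) weights + k means + k variances = 3k - 1 parameters.
bic_by_k = {k: bic(fit_gmm_loglik(data, k), 3 * k - 1, n) for k in (1, 2)}
best_k = max(bic_by_k, key=bic_by_k.get)
print(bic_by_k, "selected k:", best_k)
```

The same comparison principle extends to choosing among variable subsets: each candidate model contributes a maximized log-likelihood and a parameter count, and the penalty term discourages models that add parameters without a matching gain in fit.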

Main Results:

  • The proposed method effectively identifies relevant variables for cluster analysis without prior assumptions on variable relationships.
  • Numerical experiments on simulated datasets demonstrate the procedure's efficacy.
  • Application to genomic data highlights the practical utility and performance of the variable selection technique.

Conclusions:

  • The developed variable selection procedure offers a flexible and robust approach for model-based cluster analysis.
  • The method enhances the interpretability of clustering results by specifying variable roles.
  • This technique shows promise for applications in diverse fields, including bioinformatics and data mining.