Optimal variable clustering for high-dimensional matrix valued data.

Inbeom Lee¹, Siyi Deng², Yang Ning³

¹Booth School of Business, University of Chicago, 5807 S. Woodlawn Ave., Chicago, IL 60637, USA.

Information and Inference: A Journal of the IMA | March 14, 2025

Summary

This summary is machine-generated.

This study introduces a latent variable model and a class of hierarchical clustering algorithms for matrix-valued data that exploit the feature dependence structure. The method is shown to be clustering-consistent in high-dimensional settings and minimax rate-optimal under an optimal weight, and it attains a higher adjusted Rand index than existing methods in simulations.

Keywords:

clustering; high-dimensional estimation; latent variable model; matrix data; minimax optimality

Area of Science:

Statistics
Machine Learning
Data Science

Background:

Matrix-valued data is increasingly common in various applications.
Existing clustering methods often focus on the mean model and neglect the informative dependence structure among features.
This limitation matters most in high-dimensional settings or when the mean alone carries too little information to separate the clusters.

Purpose of the Study:

To develop a new latent variable model for matrix-valued data that utilizes the feature dependence structure for clustering.
To propose hierarchical clustering algorithms based on a weighted covariance matrix dissimilarity measure.
To theoretically analyze the clustering consistency and optimality of the proposed method.

Main Methods:

A novel latent variable model for matrix-valued data is proposed, incorporating row and column membership matrices.
A class of hierarchical clustering algorithms is developed using the difference of a weighted covariance matrix as a dissimilarity measure.
Theoretical analysis includes establishing clustering consistency in high-dimensional settings and deriving minimax lower bounds.
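
To make the dissimilarity idea in the methods above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' algorithm: it clusters rows only, uses diagonal column weights, assumes mean-zero data, compares two rows by the largest gap between their entries in the weighted covariance matrix (skipping the two rows' own columns), and merges clusters with complete linkage. The function names, the toy data generator, and the equal weights are all hypothetical choices made for this example.

```python
import random


def weighted_row_covariance(samples, weights):
    """S[a][b] = (1/n) * sum_i sum_j weights[j] * X_i[a][j] * X_i[b][j].

    Assumes mean-zero data and diagonal column weights (an assumption of
    this sketch, not a detail taken from the paper).
    """
    n, p = len(samples), len(samples[0])
    S = [[0.0] * p for _ in range(p)]
    for X in samples:
        for a in range(p):
            for b in range(p):
                S[a][b] += sum(
                    w * xa * xb for w, xa, xb in zip(weights, X[a], X[b])
                ) / n
    return S


def cov_difference(S, a, b):
    """Largest gap between rows a and b of S, skipping columns a and b.

    Requires at least three rows so that some column remains to compare.
    """
    return max(abs(S[a][c] - S[b][c]) for c in range(len(S)) if c not in (a, b))


def cluster_rows(samples, weights, k):
    """Agglomerative clustering of rows down to k clusters (complete linkage)."""
    S = weighted_row_covariance(samples, weights)
    clusters = [[a] for a in range(len(S))]

    def linkage(c1, c2):
        return max(cov_difference(S, a, b) for a in c1 for b in c2)

    while len(clusters) > k:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        # Merge the two clusters whose members are most alike.
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]
    return sorted(clusters)


def make_toy_samples(n=40, q=3, seed=0):
    """Hypothetical toy data: 5 rows, rows 0-2 share one latent row, 3-4 another."""
    rng = random.Random(seed)
    membership = [0, 0, 0, 1, 1]
    samples = []
    for _ in range(n):
        latent = [[rng.gauss(0, 1) for _ in range(q)] for _ in range(2)]
        samples.append(
            [[latent[g][j] + rng.gauss(0, 0.05) for j in range(q)] for g in membership]
        )
    return samples


if __name__ == "__main__":
    toy = make_toy_samples()
    print(cluster_rows(toy, weights=[1 / 3] * 3, k=2))
```

Rows that load on the same latent variable have nearly identical covariances with every other row, so their dissimilarity is close to zero even when their own noise levels differ. That is why the comparison skips the two rows' own columns.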

Main Results:

The proposed algorithm demonstrates clustering consistency under mild conditions in high-dimensional settings.
An optimal weight for the covariance matrix is identified, ensuring minimax rate-optimality.
Simulation studies show superior performance compared to existing methods, evidenced by a higher adjusted Rand index (ARI).
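
For readers unfamiliar with the adjusted Rand index mentioned above, the following Python sketch computes it from pair counts using the standard Hubert-Arabie formula. It is a generic illustration, not code from the study; the function name and the example labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items.

    1.0 means identical partitions (up to relabeling); values near 0 mean
    agreement no better than chance; negative values are possible.
    """
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("labelings must have the same length")
    if n < 2:
        return 1.0  # no pairs to disagree on

    # Pairs of items placed together in both, in the first, and in the second.
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values()
    )
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / comb(n, 2)
    max_index = (together_true + together_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings are a single cluster
    return (together_both - expected) / (max_index - expected)


if __name__ == "__main__":
    # Same partition with swapped label names.
    print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
    # Partitions that never agree on a pair.
    print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Because the index corrects for chance agreement, it allows a fair comparison between methods that return different numbers or sizes of clusters.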

Conclusions:

The developed latent variable model and hierarchical clustering algorithm effectively utilize the dependence structure of matrix-valued data.
The method comes with theoretical guarantees for high-dimensional clustering and, with the optimal weight, attains the minimax rate.
The approach offers practical advantages and yields meaningful interpretations, as demonstrated on a genomic dataset.