



Optimal variable clustering for high-dimensional matrix valued data.

Inbeom Lee¹, Siyi Deng², Yang Ning³

  • ¹ Booth School of Business, University of Chicago, 5807 S. Woodlawn Ave., Chicago, IL 60637, USA.

Information and Inference: A Journal of the IMA | March 14, 2025
Summary
This summary is machine-generated.

This study introduces a latent variable model and a class of hierarchical clustering algorithms for matrix-valued data that exploit the dependence structure among features. The method is shown to be consistent for clustering in high-dimensional settings and, with an optimally chosen weight, minimax rate-optimal. In simulations it attains a higher adjusted Rand index (ARI) than existing methods.

Keywords:
clustering, high-dimensional estimation, latent variable model, matrix data, minimax optimality


Area of Science:

  • Statistics
  • Machine Learning
  • Data Science

Background:

  • Matrix-valued data is increasingly common in various applications.
  • Existing clustering methods often focus on the mean model and neglect the informative feature dependence structure.
  • This limitation matters most in high-dimensional settings and when the means alone carry too little information to separate the clusters.

Purpose of the Study:

  • To develop a new latent variable model for matrix-valued data that utilizes the feature dependence structure for clustering.
  • To propose hierarchical clustering algorithms based on a weighted covariance matrix dissimilarity measure.
  • To theoretically analyze the clustering consistency and optimality of the proposed method.

Main Methods:

  • A novel latent variable model for matrix-valued data is proposed, incorporating row and column membership matrices.
  • A class of hierarchical clustering algorithms is developed using the difference of a weighted covariance matrix as a dissimilarity measure.
  • Theoretical analysis includes establishing clustering consistency in high-dimensional settings and deriving minimax lower bounds.
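
The summary does not spell out the paper's dissimilarity measure or its weight, so the following Python sketch is only one illustrative reading of the idea, not the authors' algorithm. It assumes the data are `n` matrices of shape `p × q`, that the rows are the variables to cluster, that the weighted covariance is the average of `X_i W X_i^T` for a hypothetical `q × q` weight matrix `W`, and that the dissimilarity between rows `a` and `b` is the largest absolute difference between their covariances with every other row. All function names are made up for this example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform


def weighted_row_covariance(X, W):
    """Average of X_i @ W @ X_i.T over samples.

    X has shape (n, p, q); W is a (q, q) weight matrix (an assumption of
    this sketch, not a quantity defined in the summary).
    """
    return np.einsum("nij,jk,nlk->il", X, W, X) / X.shape[0]


def covariance_difference(S):
    """Pairwise dissimilarity d(a, b) = max over c not in {a, b} of |S[a, c] - S[b, c]|."""
    p = S.shape[0]
    D = np.zeros((p, p))
    for a in range(p):
        for b in range(a + 1, p):
            others = np.ones(p, dtype=bool)
            others[[a, b]] = False  # exclude the two rows being compared
            D[a, b] = D[b, a] = np.max(np.abs(S[a, others] - S[b, others]))
    return D


def cluster_rows(X, W, n_clusters):
    """Average-linkage hierarchical clustering of the p rows into n_clusters groups."""
    S = weighted_row_covariance(X, W)
    D = covariance_difference(S)
    # linkage expects the condensed (upper-triangular) form of the distance matrix
    Z = linkage(squareform(D, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```

Calling `cluster_rows(X, np.eye(q) / q, 2)` on data where rows in the same group share a latent factor should return one label per row, with rows driven by the same factor receiving the same label. The choice of average linkage and of an identity-based weight is arbitrary here; the paper identifies an optimal weight that this sketch does not attempt to reproduce.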

Main Results:

  • The proposed algorithm demonstrates clustering consistency under mild conditions in high-dimensional settings.
  • An optimal weight for the covariance matrix is identified, ensuring minimax rate-optimality.
  • Simulation studies show superior performance compared to existing methods, evidenced by a higher adjusted Rand index (ARI).
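
For readers unfamiliar with the adjusted Rand index, here is a minimal standard-library sketch of how it is computed from two label sequences. It follows the usual pair-counting formula (agreement between two partitions, corrected for chance); the function name is chosen for this example and the code is not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as perfect agreement

    # Pairs of items placed together in both partitions (contingency cells)
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs placed together within each partition separately
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (sum_cells - expected) / (max_index - expected)
```

An ARI of 1 means the two partitions agree exactly up to relabeling, values near 0 indicate chance-level agreement, and negative values indicate agreement worse than chance. For example, `adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])` returns `1.0` because only the label names differ.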

Conclusions:

  • The developed latent variable model and hierarchical clustering algorithm effectively utilize the dependence structure of matrix-valued data.
  • The method comes with theoretical guarantees: clustering consistency in high-dimensional settings and, with the optimal weight, minimax rate-optimality.
  • The approach offers practical advantages and yields meaningful interpretations, as demonstrated on a genomic dataset.