Two-sample statistics based on anisotropic kernels.

Xiuyuan Cheng¹, Alexander Cloninger², Ronald R Coifman³

¹Department of Mathematics, Duke University, Durham, NC, USA 27708.

Information and Inference : a Journal of the IMA

September 15, 2020

Summary

This summary is machine-generated.

Researchers developed a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the difference between two distributions from finite samples. The method improves statistical power for detecting distribution differences, particularly when the distributions are locally low-dimensional.

Keywords:

anisotropic kernel; maximum mean discrepancy; two-sample statistics

Area of Science:

Statistics
Machine Learning
Computational Biology

Background:

Comparing probability distributions is crucial in various scientific fields.
Existing methods like Maximum Mean Discrepancy (MMD) have limitations in detecting subtle differences, especially with limited data or complex distributions.
The need for more powerful and flexible statistical tests for distribution comparison is evident.
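
For orientation, the baseline that the study builds on can be sketched as follows. This is a minimal illustration of the standard (biased) squared-MMD estimate with an isotropic Gaussian kernel, not the method proposed in the paper. The function names, the `bandwidth` parameter, and the synthetic data are assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth):
    """Isotropic Gaussian kernel matrix between rows of a and rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of squared MMD: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

# Synthetic data (illustrative only): same distribution vs. a shifted one.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))
y_same = rng.normal(0.0, 1.0, size=(200, 2))
y_shift = rng.normal(1.0, 1.0, size=(200, 2))

mmd_same = mmd2_biased(x, y_same)
mmd_shift = mmd2_biased(x, y_shift)
print(f"same distribution: {mmd_same:.4f}, shifted distribution: {mmd_shift:.4f}")
```

The estimate is near zero when both samples come from the same distribution and grows as the distributions separate. Because the kernel is isotropic, a single bandwidth is used in every direction, which is the limitation that anisotropic kernels aim to address.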

Purpose of the Study:

To introduce a novel kernel-based Maximum Mean Discrepancy (MMD) statistic for comparing multivariate distributions from finite samples.
To enhance the statistical power of distribution comparison tests by incorporating local geometric information (covariance matrices) and anisotropic kernels.
To establish theoretical guarantees for the proposed test, including consistency and finite-sample power bounds.

Main Methods:

Development of a new kernel-based MMD statistic utilizing local covariance matrices to construct an anisotropic kernel.
The kernel computes affinity between data points and a potentially smaller set of reference points.
Theoretical analysis to prove the consistency of the proposed test under mild kernel assumptions and derive finite-sample power bounds.
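
The construction described above can be sketched roughly as follows. This is a hedged illustration, not the authors' implementation: the Mahalanobis-type Gaussian form of the affinity, the k-nearest-neighbor covariance estimate, and all names and parameters (`local_covariances`, `anisotropic_affinity`, `reference_statistic`, `n_neighbors`, `ridge`, `scale`) are assumptions chosen for the example.

```python
import numpy as np

def local_covariances(refs, data, n_neighbors=20, ridge=1e-3):
    """Estimate one covariance matrix per reference point from its nearest data points."""
    d = data.shape[1]
    covs = np.empty((refs.shape[0], d, d))
    for i, r in enumerate(refs):
        dists = np.sum((data - r) ** 2, axis=1)
        nearest = data[np.argsort(dists)[:n_neighbors]]
        # The ridge term keeps the matrix invertible (an assumption of this sketch).
        covs[i] = np.cov(nearest, rowvar=False) + ridge * np.eye(d)
    return covs

def anisotropic_affinity(refs, covs, points, scale=1.0):
    """Affinity a(r, x) = exp(-(x - r)^T C_r^{-1} (x - r) / (2 scale^2)).

    Returns an array of shape (n_refs, n_points); it is not symmetric,
    because rows index reference points and columns index data points.
    """
    diffs = points[None, :, :] - refs[:, None, :]   # (n_refs, n_points, d)
    inv_covs = np.linalg.inv(covs)                  # (n_refs, d, d)
    maha = np.einsum("rnd,rde,rne->rn", diffs, inv_covs, diffs)
    return np.exp(-maha / (2.0 * scale**2))

def reference_statistic(x, y, refs, covs, scale=1.0):
    """Mean squared difference of the two samples' average affinities at the references."""
    h_x = anisotropic_affinity(refs, covs, x, scale).mean(axis=1)
    h_y = anisotropic_affinity(refs, covs, y, scale).mean(axis=1)
    return float(np.mean((h_x - h_y) ** 2))

# Synthetic example: two thin strips in the plane, offset perpendicular to the strip.
rng = np.random.default_rng(0)
x = np.hstack([rng.uniform(-1, 1, size=(300, 1)), 0.05 * rng.normal(size=(300, 1))])
y = np.hstack([rng.uniform(-1, 1, size=(300, 1)), 0.15 + 0.05 * rng.normal(size=(300, 1))])
pooled = np.vstack([x, y])

# A small reference set (30 points) drawn from the 600 pooled samples.
refs = pooled[rng.choice(len(pooled), size=30, replace=False)]
covs = local_covariances(refs, pooled)
stat = reference_statistic(x, y, refs, covs)
print(f"statistic: {stat:.6f}")
```

With 30 reference points and 600 data points, the affinity matrix has 30 rows rather than 600, which illustrates why a smaller reference set can reduce computation.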

Main Results:

The proposed kernel-based MMD statistic provides a powerful method for distinguishing between distributions, especially when they are locally low-dimensional.
The test demonstrates consistency, ensuring reliable results as sample size increases.
A finite-sample lower bound for the testing power was obtained, quantifying the test's effectiveness with limited data.
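
To make the notion of a test concrete, a statistic has to be compared with a threshold. One standard way to calibrate that threshold is a permutation test, sketched below. The summary does not say how the authors set their threshold, so this is a swapped-in, generic technique. `permutation_p_value` and the placeholder `mean_shift_statistic` are hypothetical names, and any two-sample statistic could be passed in its place.

```python
import numpy as np

def permutation_p_value(x, y, statistic, n_permutations=200, seed=0):
    """Permutation p-value: how often a relabeled split matches or exceeds the observed statistic."""
    rng = np.random.default_rng(seed)
    observed = statistic(x, y)
    pooled = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        if statistic(pooled[perm[:n]], pooled[perm[n:]]) >= observed:
            exceed += 1
    # The +1 terms count the observed split itself, so the p-value is never zero.
    return (exceed + 1) / (n_permutations + 1)

def mean_shift_statistic(x, y):
    """Placeholder statistic (squared distance between sample means), for illustration only."""
    return float(np.sum((x.mean(axis=0) - y.mean(axis=0)) ** 2))

# Synthetic data: the second sample is shifted by one unit in each coordinate.
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, size=(100, 2))
y = rng.normal(1.0, 1.0, size=(100, 2))

p_value = permutation_p_value(x, y, mean_shift_statistic)
print(f"permutation p-value: {p_value:.4f}")
```

A consistent test is one whose rejection probability under a fixed alternative tends to one as the sample size grows. In this procedure that corresponds to the p-value falling below any fixed significance level.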

Conclusions:

The novel anisotropic kernel-based MMD statistic offers a statistically sound and powerful approach for comparing distributions.
The method is particularly effective when the distributions are locally low-dimensional, and it allows an asymmetric kernel, since affinities are computed between data points and a separate, potentially smaller set of reference points.
Demonstrated applications in flow cytometry and diffusion MRI highlight the practical utility of the proposed method for distribution comparison.