High-dimensional principal component analysis with heterogeneous missingness.

Ziwei Zhu1,2, Tengyao Wang1,3, Richard J Samworth1

  • 1 Statistical Laboratory, University of Cambridge, Cambridge, UK.

Journal of the Royal Statistical Society. Series B, Statistical Methodology | April 17, 2023
Summary
This summary is machine-generated.

We introduce primePCA, a novel method for high-dimensional Principal Component Analysis (PCA) with missing data. It outperforms existing estimators, especially with heterogeneous missingness, achieving accurate principal component recovery.

Keywords:
heterogeneous missingness, high-dimensional statistics, iterative projections, missing data, principal component analysis

Area of Science:

  • Statistics
  • Machine Learning
  • Data Science

Background:

  • High-dimensional Principal Component Analysis (PCA) is crucial for dimensionality reduction.
  • Missing observations pose a significant challenge in PCA, degrading estimator performance.
  • Existing methods like observed-proportion weighted (OPW) estimators struggle with heterogeneous missingness.

Purpose of the Study:

  • To develop a robust method for high-dimensional PCA that effectively handles heterogeneous missing data.
  • To improve the accuracy and reliability of principal component estimation in the presence of missing observations.
  • To address the limitations of current estimators, particularly in realistic, non-uniform missing data scenarios.

Main Methods:

  • Introduced primePCA, an iterative imputation and singular space estimation method.
  • Leveraged the observed-proportion weighted (OPW) estimator as a starting point.
  • Utilized projection and singular value decomposition on imputed data matrices.
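
The loop described above (initialize, impute by projection, re-estimate by singular value decomposition) can be sketched roughly as follows. This is a simplified illustration and not the authors' primePCA implementation: the function names, the pairwise-count weighting used as a stand-in for the OPW initializer, the fixed iteration count, and the rule for skipping rows with too few observed entries are all assumptions made for this example.

```python
import numpy as np


def init_subspace(Y, mask, K):
    """Initial d x K basis from a pairwise-weighted Gram matrix.

    Assumption: each entry of the zero-filled Gram matrix is rescaled by
    the number of rows in which both columns are observed. This is only
    an illustrative stand-in for the OPW estimator named in the text.
    """
    Y0 = np.where(mask, Y, 0.0)                 # zero-fill missing entries
    M = mask.astype(float)
    counts = M.T @ M                            # co-observation counts per column pair
    G = (Y0.T @ Y0) / np.maximum(counts, 1.0)   # avoid division by zero
    _, eigvecs = np.linalg.eigh(G)              # eigenvalues in ascending order
    return eigvecs[:, -K:]                      # eigenvectors of the K largest


def refine_subspace(Y, mask, V):
    """One refinement step: impute by projection, then take an SVD."""
    K = V.shape[1]
    Y_imp = np.where(mask, Y, 0.0)
    for i in range(Y.shape[0]):
        obs = mask[i]
        # Assumption: fully observed rows need no imputation, and rows with
        # fewer than K observed entries are left zero-filled.
        if obs.all() or obs.sum() < K:
            continue
        # Least-squares scores of row i using only its observed coordinates.
        u, *_ = np.linalg.lstsq(V[obs], Y[i, obs], rcond=None)
        Y_imp[i, ~obs] = V[~obs] @ u            # fill in the missing coordinates
    _, _, Vt = np.linalg.svd(Y_imp, full_matrices=False)
    return Vt[:K].T                             # top-K right singular vectors


def prime_pca_sketch(Y, mask, K, n_iter=20):
    """Run the initializer followed by n_iter refinement steps."""
    V = init_subspace(Y, mask, K)
    for _ in range(n_iter):
        V = refine_subspace(Y, mask, V)
    return V


def subspace_distance(A, B):
    """Frobenius sin-theta distance between the column spaces of A and B."""
    return np.linalg.norm(A @ A.T - B @ B.T) / np.sqrt(2)
```

Here `Y` is an n x d data matrix whose missing entries may hold any placeholder (for example NaN), and `mask` is a boolean array of the same shape that is True where an entry is observed. In a noiseless low-rank setting, a step started at the true subspace should return that same subspace, which gives a quick sanity check on the imputation step.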

Main Results:

  • primePCA achieves a geometric rate of convergence in the noiseless case when the signal strength is sufficient.
  • The method shows improved empirical performance over OPW, especially with heterogeneous missing data.
  • Theoretical guarantees depend on average missingness properties, not worst-case scenarios.

Conclusions:

  • primePCA offers a significant advancement for high-dimensional PCA with missing data, particularly in heterogeneous settings.
  • The method provides accurate principal component recovery where previous approaches failed.
  • Numerical studies confirm primePCA's effectiveness on both simulated and real-world datasets, even when data are not Missing Completely At Random.