HIGH-DIMENSIONAL FACTOR REGRESSION FOR HETEROGENEOUS SUBPOPULATIONS.

Peiyao Wang¹, Quefeng Li¹, Dinggang Shen²,³,⁴

¹University of North Carolina at Chapel Hill.

Statistica Sinica

October 19, 2023

Summary

This summary is machine-generated.

This study introduces a novel factor regression model to effectively analyze complex, heterogeneous data by balancing global and group-specific approaches. The model demonstrates improved estimation and prediction consistency, offering a competitive and interpretable solution for diverse datasets.

Keywords:

Factor models; heterogeneity; penalized regression; prediction

Area of Science:

Statistics
Biostatistics
Machine Learning

Background:

Scientific research frequently encounters data heterogeneity due to complex data structures.
Existing models often fail to adequately address this heterogeneity, either by ignoring it (global models) or by over-fitting (group-specific models).
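
The trade-off between the two baseline approaches can be illustrated with a small simulation. The sketch below is not taken from the paper: the data-generating process, the function names (`make_coefficients`, `draw_data`, `fit_global`, `fit_group_specific`), and all settings are hypothetical, and ordinary least squares stands in for whatever estimator a real analysis would use. A single global fit pools all subpopulations and ignores their differences, while separate per-group fits use few observations per coefficient and can fit the training data much more closely than new data.

```python
import numpy as np

def make_coefficients(rng, n_groups, p, shift):
    """Hypothetical setup: a shared coefficient vector plus a group-specific shift."""
    common = rng.normal(size=p)
    return [common + shift * rng.normal(size=p) for _ in range(n_groups)]

def draw_data(rng, betas, n_per_group, noise=1.0):
    """Draw (X, y, group labels) with n_per_group rows for each subpopulation."""
    X, y, g = [], [], []
    for k, beta_k in enumerate(betas):
        Xk = rng.normal(size=(n_per_group, len(beta_k)))
        X.append(Xk)
        y.append(Xk @ beta_k + noise * rng.normal(size=n_per_group))
        g.append(np.full(n_per_group, k))
    return np.vstack(X), np.concatenate(y), np.concatenate(g)

def fit_global(X, y):
    """One least-squares fit on the pooled data (ignores heterogeneity)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fit_group_specific(X, y, groups):
    """A separate least-squares fit for each subpopulation."""
    return {int(k): fit_global(X[groups == k], y[groups == k])
            for k in np.unique(groups)}

def mse_global(beta, X, y):
    return float(np.mean((y - X @ beta) ** 2))

def mse_group_specific(betas, X, y, groups):
    pred = np.empty_like(y)
    for k, beta_k in betas.items():
        mask = groups == k
        pred[mask] = X[mask] @ beta_k
    return float(np.mean((y - pred) ** 2))

rng = np.random.default_rng(0)
true_betas = make_coefficients(rng, n_groups=3, p=10, shift=0.5)
X_tr, y_tr, g_tr = draw_data(rng, true_betas, n_per_group=15)   # small training set
X_te, y_te, g_te = draw_data(rng, true_betas, n_per_group=200)  # held-out data

b_global = fit_global(X_tr, y_tr)
b_groups = fit_group_specific(X_tr, y_tr, g_tr)

print("train MSE  global:", mse_global(b_global, X_tr, y_tr),
      " group-specific:", mse_group_specific(b_groups, X_tr, y_tr, g_tr))
print("test MSE   global:", mse_global(b_global, X_te, y_te),
      " group-specific:", mse_group_specific(b_groups, X_te, y_te, g_te))
```

On the training data the group-specific fit can never do worse than the global fit, because it minimizes the same squared error over a larger set of models. The held-out comparison is the informative one, and its outcome depends on the assumed `shift`, noise level, and sample sizes.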

Purpose of the Study:

To propose a novel factor regression model designed to handle data heterogeneity across subpopulations.
To provide a balanced approach that integrates both common and subpopulation-specific variations.

Main Methods:

Developed a factor regression model decomposing data into heterogeneous (latent factor-driven) and homogeneous (common variation) terms.
Proved estimation and prediction consistency of the proposed estimators.
Analyzed convergence rates compared to global and group-specific models.
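
The idea of splitting data into a latent factor-driven part and a remainder can be sketched with a generic low-rank decomposition. The code below is an illustration under stated assumptions and is not the estimator proposed in the paper: it uses principal components (a truncated SVD) as a stand-in for factor estimation, the number of factors is assumed known, and the function name `factor_decompose` and the simulated data are hypothetical.

```python
import numpy as np

def factor_decompose(X, n_factors):
    """Split column-centered X into a rank-n_factors part F @ loadings.T
    and a residual, using principal components (an assumed stand-in)."""
    Xc = X - X.mean(axis=0)                      # center each column
    n = Xc.shape[0]
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    F = np.sqrt(n) * U[:, :n_factors]            # factor scores, F.T @ F / n = I
    loadings = Vt[:n_factors].T * s[:n_factors] / np.sqrt(n)
    residual = Xc - F @ loadings.T               # what the factors do not explain
    return Xc, F, loadings, residual

# Hypothetical data: two latent factors drive 20 observed variables.
rng = np.random.default_rng(0)
n, p, r = 100, 20, 2
F_true = rng.normal(size=(n, r))
L_true = rng.normal(size=(p, r))
X = F_true @ L_true.T + 0.3 * rng.normal(size=(n, p))

Xc, F_hat, L_hat, U_hat = factor_decompose(X, n_factors=r)

# Share of the total variation in centered X captured by the factor part.
explained = 1.0 - np.sum(U_hat ** 2) / np.sum(Xc ** 2)
print("share of variation captured by the factor part:", explained)
```

By construction the two parts add back up to the centered data and the estimated factors are orthogonal to the residual. How the paper's model assigns heterogeneous and homogeneous roles to such terms, and how it penalizes the regression, is not specified in this summary and is not reproduced here.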

Main Results:

The proposed model achieves better convergence rates than traditional global and group-specific models.
The extra cost of estimating the latent factors is asymptotically negligible, so the minimax rate remains attainable.
The method is robust to model mis-specification and shows superior performance on real-world datasets (Alzheimer's Disease Neuroimaging Initiative data and microarray data).

Conclusions:

The factor regression model offers a competitive and interpretable solution for analyzing heterogeneous data.
It effectively balances the trade-offs between global and group-specific modeling approaches.
The method shows promise for applications in various scientific research fields dealing with complex data structures.