Integrative High Dimensional Multiple Testing with Heterogeneity under Data Sharing Constraints

Area of Science:

Genetics
Biostatistics
Computational Biology

Background:

High-dimensional regression requires identifying informative predictors, but signal detection is often limited by small sample sizes.
Meta-analysis of multiple studies can improve power but faces challenges with between-study heterogeneity and data sharing constraints.
Existing methods struggle with integrative analysis of high-dimensional data when only summary-level data, rather than individual-level records, can be shared across studies.

Purpose of the Study:

To propose a novel data shielding integrative large-scale testing (DSILT) approach for signal detection in high-dimensional regression.
To address challenges of between-study heterogeneity and data sharing constraints in multi-site analyses.
To develop a robust method for identifying significant covariate effects while controlling false discovery rates.

Main Methods:

Proposed a data shielding integrative large-scale testing (DSILT) approach designed for high-dimensional regression with between-study heterogeneity.
Developed integrative estimation and debiasing procedures to construct test statistics for overall covariate effects without individual data sharing.
Implemented a multiple testing procedure that identifies significant effects while controlling the false discovery rate (FDR) and the false discovery proportion (FDP).
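
The general workflow described above (site-level summaries in, overall-effect tests out, then multiplicity control) can be illustrated with a minimal sketch. This is not the DSILT procedure itself. It assumes each site shares only a debiased coefficient estimate and a standard error per covariate, that sites are independent, and that the standardized estimates are approximately normal under the null. The function names, the sum-of-squares statistic, and the toy numbers are all hypothetical, and the Benjamini-Hochberg step-up rule is swapped in as a standard stand-in for the paper's own multiple testing procedure.

```python
import math


def chi2_sf(x, df):
    """Survival function P(X > x) of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: normal two-sided tail plus a finite correction series.
    chi = math.sqrt(x)
    tail = math.erfc(chi / math.sqrt(2))
    phi = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * phi * total


def overall_effect_pvalues(estimates, std_errors):
    """Combine site-level summaries; estimates[m][j] is site m, covariate j.

    Assumption: the sum of squared z-scores across independent sites is
    approximately chi-square with (number of sites) degrees of freedom
    when covariate j has no effect in any site.
    """
    n_sites = len(estimates)
    pvals = []
    for j in range(len(estimates[0])):
        stat = sum((estimates[m][j] / std_errors[m][j]) ** 2 for m in range(n_sites))
        pvals.append(chi2_sf(stat, n_sites))
    return pvals


def benjamini_hochberg(pvals, alpha=0.05):
    """Return sorted indices rejected by the Benjamini-Hochberg step-up rule."""
    p = len(pvals)
    order = sorted(range(p), key=lambda j: pvals[j])
    k = 0
    for rank, j in enumerate(order, start=1):
        if pvals[j] <= alpha * rank / p:
            k = rank  # largest rank whose p-value clears its threshold
    return sorted(order[:k])


# Toy summaries from two sites and four covariates (made-up numbers).
# Effect sizes differ between sites, mimicking between-study heterogeneity.
estimates = [[0.9, 0.05, -0.6, 0.02],
             [0.4, -0.10, -0.8, 0.01]]
std_errors = [[0.2, 0.2, 0.2, 0.2],
              [0.2, 0.2, 0.2, 0.2]]

pvals = overall_effect_pvalues(estimates, std_errors)
print(benjamini_hochberg(pvals, alpha=0.05))  # prints [0, 2]
```

Because the statistic sums squared z-scores, a covariate is flagged when its effect is nonzero in any site, regardless of sign or magnitude differences between sites. Only the per-covariate summaries cross site boundaries in this sketch.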

Main Results:

DSILT allows for between-study heterogeneity and does not require individual-level data sharing.
The method constructs test statistics for the overall effects of covariates under the assumption that the study-specific regression models share support, meaning the same covariates have nonzero effects across studies.
Simulation studies confirmed the procedure's effectiveness in controlling false discoveries and achieving high power.
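
For readers unfamiliar with how such simulation results are scored, the following sketch shows one common way to compute the empirical FDP and power of a set of rejections when the true signals are known, and to average them over replicates as estimates of FDR and power. The function name and the replicate data are hypothetical and are not taken from the study.

```python
def fdp_and_power(rejected, true_signals):
    """Empirical false discovery proportion and power for one replicate.

    rejected: indices of covariates declared significant.
    true_signals: indices of covariates with a truly nonzero effect.
    """
    rejected, true_signals = set(rejected), set(true_signals)
    false_discoveries = len(rejected - true_signals)
    # FDP is defined as 0 when nothing is rejected.
    fdp = false_discoveries / max(len(rejected), 1)
    # Power is undefined when there are no true signals.
    if true_signals:
        power = len(rejected & true_signals) / len(true_signals)
    else:
        power = float("nan")
    return fdp, power


# Three made-up replicates: (rejected indices, true signal indices).
replicates = [
    ([0, 2, 5], [0, 2, 3]),  # one false discovery, one missed signal
    ([0, 2, 3], [0, 2, 3]),  # perfect recovery
    ([], [0, 2, 3]),         # nothing rejected
]

results = [fdp_and_power(rej, truth) for rej, truth in replicates]
empirical_fdr = sum(fdp for fdp, _ in results) / len(results)
average_power = sum(power for _, power in results) / len(results)
print(round(empirical_fdr, 3), round(average_power, 3))  # prints 0.111 0.556
```

Averaging the FDP over replicates estimates the FDR, while the spread of the per-replicate FDP values indicates how tightly the proportion itself is controlled.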

Conclusions:

The DSILT approach offers a powerful solution for signal detection in high-dimensional regression meta-analysis under data sharing constraints.
The method performs well in controlling false discoveries and maintaining statistical power, outperforming other distributed inference methods.
Applied to a real-world example, DSILT effectively detected interaction effects of genetic variants on type II diabetes risk.