INTEGRATING INCOMPLETE DATA FOR MEDIATION ANALYSIS.

Andriy Derkach1, Joshua N Sampson2, Ruth M Pfeiffer2

  • 1Department of Epidemiology and Biostatistics, MSKCC, New York, NY 10017, USA.

Statistica Sinica | April 6, 2026
Summary
This summary is machine-generated.

This study introduces novel semiparametric methods for mediation analysis, enabling causal parameter estimation from incomplete datasets. These techniques efficiently combine information from multiple sources, even when only summary statistics are available.

Keywords:
Data integration, direct and indirect effects, semiparametric likelihood, summary level information

Area of Science:

  • Biostatistics
  • Epidemiology
  • Causal Inference

Background:

  • Mediation analysis typically requires a single, complete dataset with exposure, mediator, and outcome variables.
  • Existing methods are limited by the need for complete data, hindering analysis when data is fragmented or only summary statistics are available.

Purpose of the Study:

  • To develop semiparametric methods for mediation analysis that can utilize incomplete datasets.
  • To enable the estimation of direct and indirect causal effects by combining information from multiple data sources, including summary statistics.
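
For readers unfamiliar with the terms, the following is a textbook illustration of what "direct" and "indirect" effects mean in the simplest linear setting. It is an assumption made here for orientation only; the summary does not state the model the authors use, and their semiparametric framework is more general. The symbols a, b, and c' are illustrative names, not notation from the paper.

```latex
\begin{align}
M &= \alpha_0 + a\,X + \varepsilon_M, \\
Y &= \beta_0 + c'\,X + b\,M + \varepsilon_Y, \\
\underbrace{c}_{\text{total effect}} &= \underbrace{c'}_{\text{direct effect}} + \underbrace{a\,b}_{\text{indirect effect}} .
\end{align}
```

Here X is the exposure, M the mediator, and Y the outcome. For example, with a = 0.5, b = 0.8, and c' = 0.3, the indirect effect is 0.5 × 0.8 = 0.4 and the total effect is 0.3 + 0.4 = 0.7.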

Main Methods:

  • Proposed semiparametric approach to estimate causal parameters (direct and indirect effects).
  • Methodology designed to integrate data from several incomplete datasets, each containing only two of the three key variables (exposure, mediator, outcome).
  • Capability to handle analyses using only summary statistics derived from these incomplete datasets.
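
To give a feel for why datasets that each contain only two of the three variables can still inform a mediation analysis, here is a minimal sketch under a strong simplifying assumption: a linear model with no confounding, where the pairwise covariances are enough to recover the coefficients. This is a plain moment-based estimator, not the semiparametric likelihood method described in the paper. The function names, true parameter values, and simulated data are all hypothetical.

```python
import numpy as np


def simulate_pairwise_datasets(n, a=0.5, b=0.8, direct=0.3, seed=0):
    """Draw three independent samples from a linear mediation model and
    keep only two of the three variables in each (hypothetical setup)."""
    rng = np.random.default_rng(seed)

    def draw():
        x = rng.normal(size=n)                       # exposure
        m = a * x + rng.normal(size=n)               # mediator model
        y = direct * x + b * m + rng.normal(size=n)  # outcome model
        return x, m, y

    x1, m1, _ = draw()  # source 1: exposure and mediator only
    x2, _, y2 = draw()  # source 2: exposure and outcome only
    _, m3, y3 = draw()  # source 3: mediator and outcome only
    return (x1, m1), (x2, y2), (m3, y3)


def mediation_from_pairs(xm, xy, my):
    """Estimate direct and indirect effects from pairwise moments alone."""
    (x1, m1), (x2, y2), (m3, y3) = xm, xy, my

    # Each variance is pooled over the two sources that observe the variable.
    var_x = np.var(np.concatenate([x1, x2]), ddof=1)
    var_m = np.var(np.concatenate([m1, m3]), ddof=1)
    # Each covariance comes from the single source that observes the pair.
    cov_xm = np.cov(x1, m1)[0, 1]
    cov_xy = np.cov(x2, y2)[0, 1]
    cov_my = np.cov(m3, y3)[0, 1]

    # Slope of the mediator on the exposure.
    a_hat = cov_xm / var_x
    # Normal equations of Y ~ X + M, built from the assembled moments.
    gram = np.array([[var_x, cov_xm], [cov_xm, var_m]])
    direct_hat, b_hat = np.linalg.solve(gram, np.array([cov_xy, cov_my]))

    indirect_hat = a_hat * b_hat
    return {
        "direct": direct_hat,
        "indirect": indirect_hat,
        "total": direct_hat + indirect_hat,
    }


if __name__ == "__main__":
    estimates = mediation_from_pairs(*simulate_pairwise_datasets(n=200_000))
    print(estimates)
```

Because `mediation_from_pairs` only ever touches variances and covariances, the same calculation could in principle be run from summary statistics rather than individual-level records. In small samples the assembled moment matrix need not be positive definite, which is one reason a principled likelihood-based approach is preferable to this sketch.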

Main Results:

  • The developed methods provide asymptotically unbiased and normally distributed estimates of causal parameters.
  • Simulations demonstrate the performance of the methods in finite samples and quantify efficiency loss due to incomplete data.
  • An application to breast cancer risk data investigates whether terminal duct lobular units mediate the relationship between polygenic risk scores and cancer risk.
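
The idea of quantifying efficiency loss by simulation can be sketched as follows. This is a small Monte Carlo comparison, assuming a linear model with hypothetical true values, between an estimator that sees all three variables on every subject and a simple moment-based estimator that sees the same total number of subjects split across three two-variable sources. It does not reproduce the authors' simulation design or their estimator.

```python
import numpy as np

# Hypothetical true parameters of a linear mediation model.
A_TRUE, B_TRUE, DIRECT_TRUE = 0.5, 0.8, 0.3


def draw(rng, n):
    """Simulate exposure x, mediator m, and outcome y for n subjects."""
    x = rng.normal(size=n)
    m = A_TRUE * x + rng.normal(size=n)
    y = DIRECT_TRUE * x + B_TRUE * m + rng.normal(size=n)
    return x, m, y


def indirect_from_moments(var_x, var_m, cov_xm, cov_xy, cov_my):
    """Indirect effect a*b computed from second moments only."""
    a_hat = cov_xm / var_x
    gram = np.array([[var_x, cov_xm], [cov_xm, var_m]])
    _, b_hat = np.linalg.solve(gram, np.array([cov_xy, cov_my]))
    return a_hat * b_hat


def complete_estimate(rng, n):
    """All three variables observed on the same n subjects."""
    x, m, y = draw(rng, n)
    c = np.cov(np.vstack([x, m, y]))
    return indirect_from_moments(c[0, 0], c[1, 1], c[0, 1], c[0, 2], c[1, 2])


def fragmented_estimate(rng, n):
    """Same total n, split into three sources with two variables each."""
    k = n // 3
    x1, m1, _ = draw(rng, k)
    x2, _, y2 = draw(rng, k)
    _, m3, y3 = draw(rng, k)
    return indirect_from_moments(
        np.var(np.concatenate([x1, x2]), ddof=1),
        np.var(np.concatenate([m1, m3]), ddof=1),
        np.cov(x1, m1)[0, 1],
        np.cov(x2, y2)[0, 1],
        np.cov(m3, y3)[0, 1],
    )


def compare_efficiency(n=3000, reps=200, seed=1):
    """Monte Carlo bias of each estimator and their variance ratio."""
    rng = np.random.default_rng(seed)
    full = np.array([complete_estimate(rng, n) for _ in range(reps)])
    frag = np.array([fragmented_estimate(rng, n) for _ in range(reps)])
    truth = A_TRUE * B_TRUE
    return {
        "bias_complete": full.mean() - truth,
        "bias_fragmented": frag.mean() - truth,
        "variance_ratio": frag.var(ddof=1) / full.var(ddof=1),
    }


if __name__ == "__main__":
    print(compare_efficiency())
```

A `variance_ratio` above 1 indicates that the fragmented design is less efficient than the complete-data design in this toy setting. The size of the loss depends on the assumed parameters and on how the subjects are split across sources.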

Conclusions:

  • Semiparametric methods offer a viable solution for mediation analysis when complete data is unavailable.
  • The proposed approach enhances the utility of fragmented datasets and summary statistics for causal inference.
  • The study successfully applies these methods to a relevant biomedical question in breast cancer etiology.