Valid and efficient inference for nonparametric variable importance in two-phase studies.

Guorong Dai¹, Raymond J Carroll², Jinbo Chen³

¹Department of Statistics and Data Science, School of Management, Fudan University, Shanghai 200433, China.

July 31, 2025

Summary

This summary is machine-generated.

Determining the value of costly covariates (Z) in prediction is crucial. This study introduces a nonparametric variable importance measure to assess Z's predictive contribution, even with incomplete data.

Keywords:

nonparametric inference, statistical efficiency, two-phase sampling, variable importance

Area of Science:

Statistics
Machine Learning
Data Science

Background:

Nonparametric regression often involves easily obtainable covariates (X) and costly covariates (Z).
Deciding whether to include expensive covariates (Z) in predictive models requires assessing their importance against data collection costs.

Purpose of the Study:

To develop a nonparametric variable importance measure for costly covariates (Z).
To infer the importance of Z in predicting Y, considering the presence of easily obtainable covariates (X).
To address the challenge of missing Z data in two-phase studies.
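
To make the idea of "importance of Z given X" concrete, the following is a minimal sketch, not the measure defined in the paper (the summary does not state its formula). It assumes a simple loss-difference notion of importance: the held-out mean squared error of a nonparametric regression on X alone, minus that of a regression on (X, Z). The simulated data, the Nadaraya-Watson smoother, the bandwidth, and all function names are illustrative assumptions, and every subject is assumed to have Z observed.

```python
import math
import random


def kernel_predict(train_x, train_y, query, bandwidth):
    """Nadaraya-Watson estimate at `query` using a Gaussian kernel."""
    num = 0.0
    den = 0.0
    for features, target in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(features, query))
        weight = math.exp(-0.5 * sq_dist / bandwidth ** 2)
        num += weight * target
        den += weight
    # Fall back to the training mean if every weight underflows to zero.
    return num / den if den > 0 else sum(train_y) / len(train_y)


def heldout_mse(train_x, train_y, test_x, test_y, bandwidth):
    """Mean squared prediction error on data not used for fitting."""
    errors = [(kernel_predict(train_x, train_y, q, bandwidth) - y) ** 2
              for q, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def simulate(n, rng):
    """Hypothetical data: Y depends on cheap X and on costly Z."""
    rows = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)
        z = rng.uniform(-1.0, 1.0)
        y = math.sin(x) + z + rng.gauss(0.0, 0.1)
        rows.append((x, z, y))
    return rows


def importance_of_z(seed=0, n_train=300, n_test=200, bandwidth=0.3):
    """Return (loss reduction from adding Z, MSE with X, MSE with X and Z)."""
    rng = random.Random(seed)
    train = simulate(n_train, rng)
    test = simulate(n_test, rng)
    y_train = [y for _, _, y in train]
    y_test = [y for _, _, y in test]
    mse_x = heldout_mse([(x,) for x, _, _ in train], y_train,
                        [(x,) for x, _, _ in test], y_test, bandwidth)
    mse_xz = heldout_mse([(x, z) for x, z, _ in train], y_train,
                         [(x, z) for x, z, _ in test], y_test, bandwidth)
    return mse_x - mse_xz, mse_x, mse_xz


if __name__ == "__main__":
    gain, mse_x, mse_xz = importance_of_z()
    print(f"MSE with X only: {mse_x:.3f}")
    print(f"MSE with X and Z: {mse_xz:.3f}")
    print(f"Estimated loss reduction from Z: {gain:.3f}")
```

A positive loss reduction suggests that Z carries predictive information beyond X. Whether that gain justifies the cost of collecting Z is the decision the study is concerned with.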

Main Methods:

Proposed a nonparametric variable importance measure for Z, aggregating maximum potential contributions.
Developed a novel inference approach for two-phase data with missing Z values.
Utilized functions of (Y, X) to impute contributions to predictive loss for individuals with missing Z.
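
The imputation idea in the last point can be illustrated with a standard augmented inverse-probability-weighting (AIPW) construction. This is a swapped-in textbook technique offered as a sketch, not the authors' estimator. It assumes each subject has a loss contribution D that can be computed only when Z is observed, a fully observed summary W standing in for a function of (Y, X), and a known phase-two selection probability pi applied completely at random. The data-generating model and all names are hypothetical.

```python
import random


def fit_line(ws, ds):
    """Ordinary least squares fit of d on w; returns (intercept, slope)."""
    n = len(ws)
    mean_w = sum(ws) / n
    mean_d = sum(ds) / n
    sxx = sum((w - mean_w) ** 2 for w in ws)
    sxy = sum((w - mean_w) * (d - mean_d) for w, d in zip(ws, ds))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_d - slope * mean_w, slope


def two_phase_means(seed=0, n=5000, pi=0.2):
    """Return (IPW estimate, AIPW estimate) of the mean loss contribution."""
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)                  # function of (Y, X), seen for all
        d = 1.0 + 0.8 * w + rng.gauss(0.0, 0.5)  # contribution that requires Z
        r = 1 if rng.random() < pi else 0        # 1 if Z was measured
        records.append((w, d, r))

    # Learn how W predicts D using phase-two subjects only.
    phase_two = [(w, d) for w, d, r in records if r == 1]
    intercept, slope = fit_line(*zip(*phase_two))

    ipw_total = 0.0
    aipw_total = 0.0
    for w, d, r in records:
        imputed = intercept + slope * w
        # D is only touched when r == 1, mirroring what is actually observed.
        ipw_total += d / pi if r else 0.0
        aipw_total += imputed + ((d - imputed) / pi if r else 0.0)
    return ipw_total / n, aipw_total / n


if __name__ == "__main__":
    ipw, aipw = two_phase_means()
    print(f"IPW estimate:  {ipw:.3f}")
    print(f"AIPW estimate: {aipw:.3f}")
```

In this simulation the true mean contribution is 1.0. The augmented estimator uses the imputed value for every subject and corrects it with weighted residuals from phase-two subjects, which typically reduces variance relative to weighting alone when W predicts D well.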

Main Results:

The proposed approach provides unified and efficient inference for Z's importance, regardless of its actual contribution.
Demonstrated superior performance through simulations and real-world data analysis.
Established novel theoretical results in semi-supervised inference and two-phase nonparametric estimation.

Conclusions:

The developed variable importance measure effectively assesses the utility of costly covariates in prediction.
The novel inference method is robust to missing data, offering practical advantages in two-phase studies.
This research contributes to efficient model building when dealing with variable data collection costs.