Valid and efficient inference for nonparametric variable importance in two-phase studies.

Guorong Dai (1), Raymond J Carroll (2), Jinbo Chen (3)

  • (1) Department of Statistics and Data Science, School of Management, Fudan University, Shanghai 200433, China.

Biometrics | July 31, 2025
Summary
This summary is machine-generated.

Deciding whether costly covariates (Z) are worth collecting requires knowing how much they improve prediction. This study introduces a nonparametric variable importance measure that assesses Z's predictive contribution, even when Z is missing for part of the sample.

Keywords:
nonparametric inference, statistical efficiency, two-phase sampling, variable importance

Area of Science:

  • Statistics
  • Machine Learning
  • Data Science

Background:

  • Nonparametric regression often involves easily obtainable covariates (X) and costly covariates (Z).
  • Deciding whether to include expensive covariates (Z) in predictive models requires assessing their importance against data collection costs.
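
One simple way to make "importance" concrete is the drop in held-out prediction error when Z is added to X. The sketch below illustrates that general idea only; it is not the measure defined in the paper. The k-nearest-neighbor smoother, the squared-error loss, the 50/50 split, the function names, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np

def knn_predict(train_features, train_y, test_features, k=10):
    """Nonparametric regression: average y over the k nearest training points."""
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    diffs = test_features[:, None, :] - train_features[None, :, :]
    dist2 = np.sum(diffs ** 2, axis=2)
    nearest = np.argsort(dist2, axis=1)[:, :k]
    return train_y[nearest].mean(axis=1)

def importance_of_z(x, z, y, k=10, seed=0):
    """Held-out MSE using X alone minus held-out MSE using (X, Z)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    perm = rng.permutation(n)
    train, test = perm[: n // 2], perm[n // 2:]
    xz = np.column_stack([x, z])
    pred_x = knn_predict(x[train], y[train], x[test], k)
    pred_xz = knn_predict(xz[train], y[train], xz[test], k)
    loss_x = np.mean((y[test] - pred_x) ** 2)
    loss_xz = np.mean((y[test] - pred_xz) ** 2)
    return loss_x - loss_xz

# Simulated example (hypothetical data): Y depends strongly on Z.
rng = np.random.default_rng(1)
n = 600
x = rng.normal(size=(n, 1))
z = rng.normal(size=(n, 1))
y = x[:, 0] + 2.0 * z[:, 0] + 0.1 * rng.normal(size=n)
gain = importance_of_z(x, z, y)
print(f"Reduction in held-out MSE from adding Z: {gain:.2f}")
```

A clearly positive gain suggests Z carries predictive information beyond X; whether that gain justifies the cost of collecting Z is the judgment the Background describes.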

Purpose of the Study:

  • To develop a nonparametric variable importance measure for costly covariates (Z).
  • To infer the importance of Z in predicting Y, considering the presence of easily obtainable covariates (X).
  • To address the challenge of missing Z data in two-phase studies.

Main Methods:

  • Proposed a nonparametric variable importance measure for Z, aggregating maximum potential contributions.
  • Developed a novel inference approach for two-phase data with missing Z values.
  • Utilized functions of (Y, X) to impute contributions to predictive loss for individuals with missing Z.
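
The imputation idea in the last bullet can be sketched generically as follows. This is a minimal illustration, not the authors' estimator: it assumes Z is missing completely at random, takes the per-individual loss contributions for those with observed Z as given, and uses a hypothetical quadratic working model in (Y, X) to impute contributions for everyone. All names and the toy data are assumptions.

```python
import numpy as np

def imputed_mean_contribution(d_labeled, yx_labeled, yx_all):
    """Average loss contribution over the full sample, imputing from (Y, X).

    d_labeled  : contributions computed where Z was observed, shape (m,)
    yx_labeled : columns (Y, X) for those same individuals, shape (m, p)
    yx_all     : columns (Y, X) for the full phase-one sample, shape (n, p)
    """
    def design(yx):
        # Hypothetical working model: intercept, linear, and squared terms.
        return np.column_stack([np.ones(len(yx)), yx, yx ** 2])

    # Fit the working model on individuals with observed Z.
    coef, *_ = np.linalg.lstsq(design(yx_labeled), d_labeled, rcond=None)
    imputed_all = design(yx_all) @ coef
    imputed_labeled = design(yx_labeled) @ coef
    # Imputed average over everyone, plus the mean residual on the labeled
    # subset as a correction term for working-model error.
    return imputed_all.mean() + (d_labeled - imputed_labeled).mean()

# Toy data (hypothetical): contributions depend on the first (Y, X) column.
rng = np.random.default_rng(0)
n, m = 2000, 300
yx_all = rng.normal(size=(n, 2))
d_all = 1.0 + yx_all[:, 0] ** 2 + 0.1 * rng.normal(size=n)
labeled = rng.choice(n, size=m, replace=False)
estimate = imputed_mean_contribution(d_all[labeled], yx_all[labeled], yx_all)
print(f"Imputed estimate: {estimate:.3f}; full-data mean: {d_all.mean():.3f}")
```

In this toy setting the estimate uses only 300 observed contributions yet borrows information from all 2,000 (Y, X) records, which is the intuition behind the efficiency gain claimed for two-phase data.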

Main Results:

  • The proposed approach provides unified and efficient inference for Z's importance, regardless of its actual contribution.
  • Demonstrated superior performance through simulations and real-world data analysis.
  • Established novel theoretical results in semi-supervised inference and two-phase nonparametric estimation.

Conclusions:

  • The developed variable importance measure effectively assesses the utility of costly covariates in prediction.
  • The novel inference method is robust to missing data, offering practical advantages in two-phase studies.
  • This research contributes to efficient model building when covariates differ in data collection cost.