Nonparametric variable importance assessment using machine learning techniques.

Brian D Williamson¹, Peter B Gilbert¹,², Marco Carone¹,²

¹Department of Biostatistics, University of Washington, Seattle, Washington, USA.

October 12, 2020

Summary

This summary is machine-generated.

This study introduces a new, versatile variable importance measure applicable to any regression technique. This method allows for consistent interpretation and facilitates the use of machine learning for robust feature importance estimation.

Keywords:

machine learning; nonparametric R²; statistical inference; targeted learning; variable importance

Area of Science:

Statistics
Machine Learning
Biostatistics

Background:

Quantifying feature importance is crucial in regression analysis.
Existing variable importance measures are often tied to specific regression techniques, limiting flexibility and comparability.
Without a universally applicable importance measure, analysts may end up using suboptimal regression models.

Purpose of the Study:

To develop a technique-agnostic variable importance measure for regression.
To enable the use of machine learning for flexible estimation of feature importance.
To provide a consistent interpretation of variable importance across different analytical approaches.

Main Methods:

Generalization of the analysis of variance (ANOVA) variable importance measure.
Application of machine learning techniques for estimating feature importance.
Construction of an efficient estimator and a valid confidence interval for the proposed measure.
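
The summary does not state the formula behind the measure. As a hedged sketch only, one way to write a model-free generalization of the ANOVA measure, consistent with the keyword "nonparametric R²", is shown below. The notation is assumed here for illustration and does not appear in the summary: Y is the response, X the feature vector, s the index set of the feature or group being assessed, and X_{-s} the remaining features.

```latex
\mu(x) = E[\,Y \mid X = x\,], \qquad
\mu_{s}(x) = E[\,Y \mid X_{-s} = x_{-s}\,],

\psi_{s}
  = \frac{E\big[\{\mu(X) - \mu_{s}(X)\}^{2}\big]}{\operatorname{Var}(Y)}
  = \left[1 - \frac{E\{Y - \mu(X)\}^{2}}{\operatorname{Var}(Y)}\right]
  - \left[1 - \frac{E\{Y - \mu_{s}(X)\}^{2}}{\operatorname{Var}(Y)}\right].
```

Under this reading, the measure is the difference between the population R² attainable with all features and the R² attainable without the features in s. The second equality holds because E[Y − μ(X) | X] = 0, so the cross term in E{Y − μ_s(X)}² − E{Y − μ(X)}² vanishes. Because both conditional means are defined through the data-generating distribution rather than through a fitted model, any sufficiently flexible regression method can in principle be used to estimate them.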

Main Results:

The proposed measure is independent of the chosen regression technique.
It allows for individual assessment of feature or group importance.
Simulations demonstrate good practical operating characteristics of the proposed method.
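
The properties above can be illustrated with a small simulation. The sketch below is not the authors' estimator or code. It assumes that importance is read as the drop in R² when a feature or group is withheld (an assumption suggested by the keyword "nonparametric R²"), and it uses a simple train/test split with a k-nearest-neighbour regressor as a stand-in for a machine learning method. The toy data-generating model, the function names, and the tuning values are all hypothetical, and no confidence interval is computed.

```python
import random
import statistics


def knn_predict(train_x, train_y, query_x, k=10):
    """k-nearest-neighbour regression: average y over the k closest training rows."""
    preds = []
    for q in query_x:
        nearest = sorted(
            (sum((a - b) ** 2 for a, b in zip(q, x)), y)
            for x, y in zip(train_x, train_y)
        )[:k]
        preds.append(sum(y for _, y in nearest) / k)
    return preds


def drop(rows, group):
    """Remove the feature columns whose indices are in `group`."""
    return [[v for j, v in enumerate(r) if j not in group] for r in rows]


def mse(y, pred):
    """Mean squared prediction error."""
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)


def importance(train, test, group, predict=knn_predict):
    """Estimated drop in R^2 on the test set when features in `group` are withheld.

    `predict` can be any regression routine with the same signature, which is
    the sense in which the target quantity does not depend on the technique.
    """
    (xtr, ytr), (xte, yte) = train, test
    full = predict(xtr, ytr, xte)
    reduced = predict(drop(xtr, group), ytr, drop(xte, group))
    return (mse(yte, reduced) - mse(yte, full)) / statistics.pvariance(yte)


def simulate(n, rng):
    """Hypothetical toy data: Y = 2*X1 + X2 + noise, with X3 unrelated to Y."""
    x = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
    y = [2 * r[0] + r[1] + rng.gauss(0, 0.5) for r in x]
    return x, y


rng = random.Random(0)
train, test = simulate(300, rng), simulate(300, rng)

# Individual features and one group, indexed by column position.
groups = [("X1", {0}), ("X2", {1}), ("X3", {2}), ("X1+X2", {0, 1})]
results = {name: importance(train, test, group) for name, group in groups}
for name, value in results.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the features are independent, so the population values implied by the assumed definition are roughly 0.70 for X1, 0.17 for X2, 0 for X3, and 0.87 for the pair. The printed estimates should land near those values, with some finite-sample error from the nearest-neighbour fit. Passing a different regressor through `predict` changes the quality of the estimate but not the quantity being estimated.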

Conclusions:

The developed variable importance measure offers a flexible and consistently interpretable approach.
This method enhances the utility of machine learning in regression for feature importance analysis.
The approach is validated through simulations and applied to cardiovascular disease risk factor data.