Residual permutation tests for feature importance in machine learning.

Po-Hsien Huang (National Chengchi University, Taipei City, Taiwan)

The British Journal of Mathematical and Statistical Psychology | August 30, 2025
Summary
This summary is machine-generated.

This study introduces residual permutation tests (RPTs) for hypothesis testing in machine learning (ML). In simulations across a range of ML algorithms, the RPT-X variant kept empirical Type I error rates below the nominal level and showed appropriate power for assessing feature significance.

Keywords: feature importance, machine learning, permutation test

Area of Science:

  • Psychology
  • Computer Science
  • Statistics

Background:

  • Traditional psychological research heavily utilizes linear models for hypothesis testing.
  • Machine learning (ML) offers advanced methods for exploring complex, non-linear variable relationships.
  • Current feature importance tools in ML lack robust statistical inference capabilities.

Purpose of the Study:

  • To develop statistically sound methods for hypothesis testing within machine learning frameworks.
  • To introduce residual permutation tests (RPTs) as a tool for assessing feature significance in ML models.
  • To address the gap in inferential statistics for interpreting 'black-box' ML algorithms.

Main Methods:

  • Introduced two variants of residual permutation tests: RPT on Y (RPT-Y) and RPT on X (RPT-X).
  • RPT-Y permutes the residuals of the label (the outcome Y) after conditioning on the other features.
  • RPT-X permutes the residuals of the target feature (the feature under test) after conditioning on the other features.
  • Conducted a comprehensive simulation study across diverse ML algorithms.
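
The RPT-X idea listed above can be illustrated with a minimal sketch. This is one plausible reading of "permute target feature residuals conditioned on other features", not the paper's exact procedure. Several parts are assumptions made for illustration: the conditioning model is a simple linear regression on a single other feature `z`, the learner is a toy two-stage linear fit (`stagewise_loss`) standing in for an arbitrary ML model, and the function names `fit_line`, `residuals`, `rpt_x`, and `demo_p_value` are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for y ~ a + b * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b

def residuals(x, y):
    """Residuals of y after removing its least-squares fit on x."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

def stagewise_loss(x, z, y):
    """Toy stand-in for an ML learner (assumption): fit y on z, then fit
    what is left on x. Returns the in-sample mean squared error."""
    r = residuals(x, residuals(z, y))
    return sum(ri ** 2 for ri in r) / len(r)

def rpt_x(x, z, y, loss=stagewise_loss, n_perm=199, seed=0):
    """Sketch of a residual permutation test on X.
    Null hypothesis: x adds no predictive value beyond z."""
    rng = random.Random(seed)
    a, b = fit_line(z, x)                       # conditioning step: x ~ z
    fitted = [a + b * zi for zi in z]
    resid = [xi - fi for xi, fi in zip(x, fitted)]
    observed = loss(x, z, y)
    hits = 0
    for _ in range(n_perm):
        shuffled = resid[:]
        rng.shuffle(shuffled)                   # permute the residuals of x
        x_star = [fi + ri for fi, ri in zip(fitted, shuffled)]
        if loss(x_star, z, y) <= observed:      # permuted x does at least as well
            hits += 1
    return (1 + hits) / (1 + n_perm)            # permutation p-value

def demo_p_value(effect, n=80, seed=1):
    """Simulated data where x is correlated with z and y depends on x via `effect`."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.5 * zi + rng.gauss(0, 1) for zi in z]
    y = [zi + effect * xi + rng.gauss(0, 1) for zi, xi in zip(z, x)]
    return rpt_x(x, z, y)

if __name__ == "__main__":
    print("p-value with a strong effect:", demo_p_value(2.0))
    print("p-value with no effect:", demo_p_value(0.0))
```

Permuting the residuals rather than the raw feature keeps the relationship between the target feature and the other features intact in every permuted dataset, which is the motivation for conditioning. An RPT-Y sketch would follow the same pattern with the label residualised on the other features instead.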

Main Results:

  • RPT-X demonstrated stable empirical Type I error rates below the nominal level.
  • RPT-X showed appropriate statistical power in both regression and classification tasks.
  • The study validated RPT-X performance across a wide range of ML algorithms.
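
To make "empirical Type I error rate" concrete, the sketch below estimates it by simulation: generate many datasets in which the null hypothesis is true, run a test on each, and record the fraction of rejections at the nominal level. For brevity it uses a plain permutation test of association between two variables as a stand-in, not RPT-X itself. The function names, sample sizes, and replication counts are assumptions chosen for illustration and are not taken from the study.

```python
import random

def abs_cov(x, y):
    """Absolute (unnormalised) covariance, used as the test statistic."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return abs(sum((a - mx) * (b - my) for a, b in zip(x, y)))

def perm_pvalue(x, y, rng, n_perm=99):
    """Permutation p-value for 'x and y are unrelated'."""
    observed = abs_cov(x, y)
    y_perm = y[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs_cov(x, y_perm) >= observed:
            hits += 1
    return (1 + hits) / (1 + n_perm)

def empirical_type1(alpha=0.05, n_rep=200, n=30, seed=0):
    """Fraction of null datasets on which the test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_rep):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]  # independent of x, so the null holds
        if perm_pvalue(x, y, rng) <= alpha:
            rejections += 1
    return rejections / n_rep

if __name__ == "__main__":
    print("empirical Type I error rate:", empirical_type1())
```

A test is behaving as intended when this fraction stays at or below the nominal level `alpha`, up to simulation noise. Statistical power is estimated the same way, except that the simulated data contain a real effect.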

Conclusions:

  • Residual permutation tests, particularly RPT-X, provide a valid approach for statistical inference in ML.
  • RPT-X is a valuable tool for hypothesis testing, enhancing the interpretability of ML models.
  • The findings support the broader adoption of RPT-X in psychological research and other ML applications.