

Nonparametric IPSS: fast, flexible feature selection with false discovery control.

Omar Melikechi¹, David B Dunson², Jeffrey W Miller¹

¹Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, United States.

Bioinformatics (Oxford, England)

May 13, 2025

Summary

This summary is machine-generated.

This study introduces a nonparametric version of Integrated Path Stability Selection (IPSS), a feature selection method for high-dimensional data that offers finite-sample false discovery control while identifying more true positives. Two implementations, IPSSGB (gradient boosting) and IPSSRF (random forests), control the false discovery rate in nonlinear simulations and, on cancer data, yield improved predictions with fewer features.


Area of Science:

Machine Learning
Statistical Methods
Bioinformatics

Background:

Feature selection is crucial in machine learning and statistics.
Existing methods often rely on parametric models, lack theoretical false discovery control, or identify few true positives.

Purpose of the Study:

Introduce a general, nonparametric feature selection method with finite-sample false discovery control.
Enhance the identification of true positives while maintaining statistical rigor.
Provide efficient and accurate tools for high-dimensional data analysis.

Main Methods:

Utilize Integrated Path Stability Selection (IPSS) applied to arbitrary feature importance scores.
Develop specific implementations: IPSS Gradient Boosting (IPSSGB) and IPSS Random Forests (IPSSRF).
Estimate q-values, which are better suited to high-dimensional settings than P-values.
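
The general idea of scoring features on repeated subsamples and keeping those that are selected consistently can be sketched as follows. This is a minimal illustration of plain subsampling-based stability selection, not the IPSS algorithm from the article, whose details are not given in this summary. The importance score (absolute Pearson correlation), the function names, the `top_k` cutoff, the 0.8 frequency threshold, and the toy data are all assumptions made for illustration; the article's implementations use gradient boosting and random forest importance scores instead.

```python
import random
from statistics import mean


def abs_correlation(x, y):
    """Hypothetical importance score: absolute Pearson correlation of x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def selection_frequencies(X, y, top_k, n_subsamples=50, seed=0):
    """Fraction of half-sized subsamples in which each feature ranks in the top_k."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # subsample half the rows without replacement
        y_sub = [y[i] for i in idx]
        scores = [abs_correlation([X[i][j] for i in idx], y_sub) for j in range(p)]
        top = sorted(range(p), key=lambda j: scores[j], reverse=True)[:top_k]
        for j in top:
            counts[j] += 1
    return [c / n_subsamples for c in counts]


# Toy data (assumed): 20 features, only features 0 and 1 affect the response.
data_rng = random.Random(1)
n, p = 200, 20
X = [[data_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
y = [3 * row[0] + 2 * row[1] + data_rng.gauss(0, 0.5) for row in X]

freqs = selection_frequencies(X, y, top_k=2)
selected = [j for j, f in enumerate(freqs) if f >= 0.8]  # threshold is an assumption
print(selected)
```

Any other importance score can be swapped in for `abs_correlation`, which is the sense in which the approach applies to arbitrary feature importance scores. Unlike this sketch, the article's method comes with finite-sample false discovery control.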

Main Results:

IPSSGB and IPSSRF demonstrate accurate false discovery rate control in nonlinear simulations.
Both methods significantly outperform existing approaches in detecting true positives.
Both methods run in under 20 seconds on datasets with 500 samples and 5000 features.
Applied to cancer data, IPSSGB and IPSSRF yield improved predictions with fewer features.
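
The two quantities behind these results can be made concrete for a single simulated trial, where the truly relevant features are known. The sketch below computes the false discovery proportion (the share of selected features that are irrelevant) and the true positive rate (the share of relevant features that are selected); the false discovery rate is the expectation of the former over repeated trials. The function name and the example feature indices are assumptions for illustration and are not taken from the article.

```python
def fdp_and_tpr(selected, true_features):
    """Return (false discovery proportion, true positive rate) for one trial."""
    selected, true_features = set(selected), set(true_features)
    false_pos = len(selected - true_features)
    true_pos = len(selected & true_features)
    fdp = false_pos / max(len(selected), 1)  # defined as 0 when nothing is selected
    tpr = true_pos / len(true_features) if true_features else 0.0
    return fdp, tpr


# Hypothetical trial: 4 features selected, 5 features truly relevant.
fdp, tpr = fdp_and_tpr(selected=[0, 1, 7, 12], true_features=[0, 1, 2, 3, 4])
print(fdp, tpr)  # 0.5 0.4
```

A method controls the false discovery rate at a target level when the average of `fdp` across many such trials stays at or below that level.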

Conclusions:

IPSS offers a powerful and flexible framework for feature selection in high-dimensional data.
The developed IPSSGB and IPSSRF methods provide accurate and efficient solutions for biological data analysis.
These methods advance the field by improving both statistical control and discovery power in feature selection.