Updated: May 14, 2025

Nonparametric IPSS: fast, flexible feature selection with false discovery control.

Omar Melikechi (1), David B Dunson (2), Jeffrey W Miller (1)

  • (1) Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA 02115, United States.

Bioinformatics (Oxford, England) | May 13, 2025
Summary
This summary is machine-generated.

This study introduces a nonparametric form of Integrated Path Stability Selection (IPSS), a feature selection method for high-dimensional data that controls false discoveries while identifying more true positives. Two implementations, IPSSGB and IPSSRF, outperform existing approaches in simulations and in cancer-related gene discovery.

Area of Science:

  • Machine Learning
  • Statistical Methods
  • Bioinformatics

Background:

  • Feature selection is crucial in machine learning and statistics.
  • Existing methods often rely on parametric models, lack theoretical false discovery control, or identify few true positives.

Purpose of the Study:

  • Introduce a general, nonparametric feature selection method with finite-sample false discovery control.
  • Enhance the identification of true positives while maintaining statistical rigor.
  • Provide efficient and accurate tools for high-dimensional data analysis.

Main Methods:

  • Apply Integrated Path Stability Selection (IPSS) to arbitrary feature importance scores.
  • Develop two specific implementations: IPSS Gradient Boosting (IPSSGB) and IPSS Random Forests (IPSSRF).
  • Estimate q-values, which are better suited to high-dimensional settings than P-values.
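
The idea of applying stability selection to a feature importance score can be illustrated with a minimal sketch. This is not the IPSS algorithm and makes no claim about the authors' implementation: it shows only the classical ingredient of repeatedly scoring features on random half-samples and recording how often each feature ranks near the top. The absolute correlation score, the `top_k` cutoff, the 0.8 frequency threshold, and all function names are assumptions chosen for illustration; IPSSGB and IPSSRF use importance scores from gradient boosting and random forests instead.

```python
import random
from statistics import mean


def abs_correlation(xs, ys):
    """Absolute Pearson correlation, a stand-in for a feature importance score."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def selection_frequencies(X, y, top_k, n_subsamples=50, seed=0):
    """Fraction of random half-samples in which each feature ranks in the top_k."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        rows = rng.sample(range(n), n // 2)  # half-sample without replacement
        y_sub = [y[i] for i in rows]
        scores = [abs_correlation([X[i][j] for i in rows], y_sub) for j in range(p)]
        ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
        for j in ranked[:top_k]:
            counts[j] += 1
    return [c / n_subsamples for c in counts]


# Toy data (hypothetical): 100 samples, 20 features, only features 0 and 1 matter.
data_rng = random.Random(1)
n_samples, n_features = 100, 20
X = [[data_rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [row[0] - row[1] + 0.1 * data_rng.gauss(0, 1) for row in X]

freqs = selection_frequencies(X, y, top_k=2)
stable = [j for j, f in enumerate(freqs) if f >= 0.8]  # illustrative threshold
print(stable)
```

Features that are selected in most half-samples are treated as stable. How the threshold relates to a false discovery guarantee is exactly what methods such as IPSS formalize, and this sketch does not attempt it.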

Main Results:

  • IPSSGB and IPSSRF accurately control the false discovery rate in nonlinear simulations.
  • Both methods detect substantially more true positives than existing approaches.
  • Both are efficient, running in under 20 seconds on datasets with 500 samples and 5000 features.
  • Applied to cancer data, IPSSGB and IPSSRF yield improved predictions with fewer features.
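
In a simulation where the truly relevant features are known, the two quantities behind these results can be computed directly. The sketch below is a generic illustration, not the authors' evaluation code: the function names and the example index sets are hypothetical. It computes the false discovery proportion (the share of selected features that are not truly relevant) and the true positive rate (the share of truly relevant features that were selected). Averaging the false discovery proportion over many simulated datasets estimates the false discovery rate.

```python
def false_discovery_proportion(selected, true_features):
    """Share of selected features that are not truly relevant (0 if none selected)."""
    selected, true_features = set(selected), set(true_features)
    if not selected:
        return 0.0
    return len(selected - true_features) / len(selected)


def true_positive_rate(selected, true_features):
    """Share of truly relevant features that were selected (0 if none are relevant)."""
    selected, true_features = set(selected), set(true_features)
    if not true_features:
        return 0.0
    return len(selected & true_features) / len(true_features)


# Hypothetical simulation outcome: features 0-4 are truly relevant,
# and a method selected features 0, 1, 2, and 7.
true_features = [0, 1, 2, 3, 4]
selected = [0, 1, 2, 7]

fdp = false_discovery_proportion(selected, true_features)  # 1 false out of 4 selected
tpr = true_positive_rate(selected, true_features)          # 3 found out of 5 relevant
print(fdp, tpr)
```

A method controls the false discovery rate at a target level when this average stays at or below the target, and among methods that do so, a higher true positive rate is better.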

Conclusions:

  • IPSS offers a powerful and flexible framework for feature selection in high-dimensional data.
  • The developed IPSSGB and IPSSRF methods provide accurate and efficient solutions for biological data analysis.
  • Together, these methods improve both false discovery control and the detection of true positives in feature selection.