



A robust variable screening procedure for ultra-high dimensional data.

Abhik Ghosh1, Magne Thoresen2

  • 1Interdisciplinary Statistical Research Unit, Indian Statistical Institute, Kolkata, India.

Statistical Methods in Medical Research | May 31, 2021
Summary
This summary is machine-generated.

This study introduces Density Power Divergence-Sure Independence Screening (DPD-SIS), a robust variable screening method designed to cope with outliers in ultra-high dimensional data. In the reported simulations, DPD-SIS outperforms existing methods, especially in small samples with contaminated data.

Keywords:
NP dimensionality; Variable selection; gene selection; independence screening; influence function; minimum density power divergence estimator


Area of Science:

  • Statistics
  • Computational Biology
  • Genomics

Background:

  • Variable selection is crucial in ultra-high dimensional regression.
  • Existing methods like Sure Independence Screening (SIS) are sensitive to outliers, potentially causing inaccurate results, especially in small datasets.
  • Robust pre-screening methods are needed to overcome these limitations.
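
The outlier sensitivity noted above can be illustrated with a minimal sketch of classical SIS, assuming the usual formulation in which predictors are ranked by the absolute marginal Pearson correlation with the response. The function names (`pearson`, `sis_screen`) and the toy data are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math

def pearson(u, v):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def sis_screen(columns, y, keep):
    """Keep the `keep` predictors with the largest |marginal correlation| with y."""
    scores = [abs(pearson(col, y)) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return order[:keep], scores

# Hypothetical toy data: y follows x0 closely; x1 is only weakly related to y.
x0 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x1 = [5, 2, 8, 3, 9, 1, 7, 4, 6, 10]
y = [1.2, 1.9, 3.1, 3.8, 5.0, 6.2, 6.9, 8.1, 8.8, 10.0]
clean_top, clean_scores = sis_screen([x0, x1], y, keep=1)

# Contaminate a single observation (the last row) in both x1 and y.
x1_bad = x1[:-1] + [40]
y_bad = y[:-1] + [-60.0]
bad_top, bad_scores = sis_screen([x0, x1_bad], y_bad, keep=1)

print("clean data keeps predictor", clean_top, [round(s, 2) for s in clean_scores])
print("contaminated data keeps predictor", bad_top, [round(s, 2) for s in bad_scores])
```

On the clean data the screen keeps `x0`; after one row is contaminated it keeps `x1` instead, because a single high-leverage point dominates the Pearson correlation in a sample of ten.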

Purpose of the Study:

  • To develop a novel robust variable screening procedure for ultra-high dimensional data.
  • To enhance the reliability of variable selection in the presence of outliers.
  • To introduce Density Power Divergence-Sure Independence Screening (DPD-SIS) and its iterative extension.
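
For orientation, the summary does not state the divergence itself. The standard definition of the density power divergence between a data density g and a model density f_θ, with tuning parameter α > 0 (Basu et al., 1998), and the resulting minimum DPD estimator are reproduced below. How exactly the paper embeds this estimator in the screening step is not given in this summary, so the block should be read as background rather than as the authors' procedure.

```latex
% Density power divergence between g and f_\theta, for \alpha > 0
d_\alpha(g, f_\theta)
  = \int \Big\{ f_\theta^{1+\alpha}(x)
      - \Big(1 + \tfrac{1}{\alpha}\Big)\, g(x)\, f_\theta^{\alpha}(x)
      + \tfrac{1}{\alpha}\, g^{1+\alpha}(x) \Big\}\, dx .

% Minimum DPD estimator from a sample X_1, \dots, X_n
% (the term in g^{1+\alpha} does not depend on \theta and is dropped)
\hat{\theta}_\alpha
  = \arg\min_{\theta} \Big[ \int f_\theta^{1+\alpha}(x)\, dx
      - \Big(1 + \tfrac{1}{\alpha}\Big) \frac{1}{n} \sum_{i=1}^{n} f_\theta^{\alpha}(X_i) \Big].
```

As α tends to 0 the divergence approaches the Kullback-Leibler divergence, so the estimator approaches the maximum likelihood estimator; larger α trades efficiency for robustness.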

Main Methods:

  • Development of a new robust screening procedure based on Density Power Divergence (DPD) estimation.
  • Introduction of DPD-SIS and iterative DPD-SIS.
  • Extensive simulation studies to evaluate method performance.
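
A rough sketch of how a DPD-based screen could look is given below. It assumes, without confirmation from this summary, that each predictor is scored by the magnitude of its marginal slope in a simple linear model with normal errors, estimated by minimum DPD. The fixed-point iteration, the Theil-Sen starting value, the choice α = 0.5, the function names (`dpd_slope`, `dpd_screen`) and the toy data are all assumptions made for illustration, and the predictors are assumed to be on comparable scales.

```python
import math

def median(values):
    s = sorted(values)
    m = len(s) // 2
    return s[m] if len(s) % 2 else 0.5 * (s[m - 1] + s[m])

def dpd_slope(x, y, alpha=0.5, iters=100):
    """Slope of y ~ a + b*x under normal errors, from a fixed-point
    iteration on the minimum-DPD estimating equations (illustrative only)."""
    n = len(x)
    # Robust start (an assumption of this sketch): Theil-Sen slope,
    # median intercept, MAD-based scale.
    pair = [(y[j] - y[i]) / (x[j] - x[i])
            for i in range(n) for j in range(i + 1, n) if x[j] != x[i]]
    if not pair:
        return 0.0  # constant predictor carries no slope information
    b = median(pair)
    a = median([yi - b * xi for xi, yi in zip(x, y)])
    mad = median([abs(yi - a - b * xi) for xi, yi in zip(x, y)])
    s2 = max((1.4826 * mad) ** 2, 1e-12)
    c = n * alpha / (1.0 + alpha) ** 1.5  # constant in the scale equation
    for _ in range(iters):
        r = [yi - a - b * xi for xi, yi in zip(x, y)]
        # Gaussian-kernel weights: gross outliers get weight close to 0.
        w = [math.exp(-alpha * ri * ri / (2.0 * s2)) for ri in r]
        sw = sum(w)
        if sw - c <= 0.0:
            break  # no positive scale solution from here; keep last values
        xw = sum(wi * xi for wi, xi in zip(w, x)) / sw
        yw = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xw) ** 2 for wi, xi in zip(w, x))
        if sxx <= 0.0:
            break
        # Weighted least squares solves sum_i w_i * (1, x_i) * r_i = 0.
        b = sum(wi * (xi - xw) * (yi - yw) for wi, xi, yi in zip(w, x, y)) / sxx
        a = yw - b * xw
        # Scale update from sum_i w_i * (1 - r_i^2 / s2) = c.
        r = [yi - a - b * xi for xi, yi in zip(x, y)]
        s2 = max(sum(wi * ri * ri for wi, ri in zip(w, r)) / (sw - c), 1e-12)
    return b

def dpd_screen(columns, y, keep, alpha=0.5):
    """Keep the `keep` predictors with the largest |robust marginal slope|."""
    slopes = [dpd_slope(col, y, alpha) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: abs(slopes[j]), reverse=True)
    return order[:keep], slopes

# Hypothetical toy data: y follows x0; the last row is a gross outlier in x1 and y.
x0 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
x1 = [5, 2, 8, 3, 9, 1, 7, 4, 6, 40]
y = [1.2, 1.9, 3.1, 3.8, 5.0, 6.2, 6.9, 8.1, 8.8, -60.0]
dpd_top, dpd_slopes = dpd_screen([x0, x1], y, keep=1)
print("kept predictor:", dpd_top, "slopes:", [round(s, 2) for s in dpd_slopes])
```

Because the weights shrink rapidly with the squared residual, the contaminated row contributes almost nothing to either marginal fit, and the screen keeps `x0`. A real implementation would also need to standardize predictors robustly and choose α deliberately; neither is addressed in this sketch.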

Main Results:

  • DPD-SIS and iterative DPD-SIS show superior performance compared to original SIS and other robust methods when outliers are present.
  • The proposed methods are particularly effective in small samples with data contamination.
  • The methods are also demonstrated on a real data application concerning the regulation of lipid metabolism.

Conclusions:

  • DPD-SIS offers a robust and reliable alternative for variable selection in ultra-high dimensional regression, especially when data contains outliers.
  • The proposed methods mitigate the drawbacks of traditional SIS, leading to more accurate inference.
  • This approach has significant implications for statistical modeling in fields prone to noisy data.