Robust group variable screening based on maximum Lq-likelihood estimation.

Yang Li¹,²,³, Rong Li²,³, Yichen Qin⁴

¹Center for Applied Statistics, Renmin University of China, Beijing, China.

Statistics in Medicine | October 18, 2021

Summary

This summary is machine-generated.

This study introduces a robust group screening method for ultra-high-dimensional data. The novel approach effectively identifies important predictors within groups, even with contaminated data.

Keywords:

data contamination, dimensionality reduction, grouped variables, robustness

Area of Science:

Statistics
Data Science
Machine Learning

Background:

Variable screening is crucial in ultra-high-dimensional data analysis.
Existing methods often focus on individual predictors, neglecting group structures.
There's a need for robust screening methods that incorporate predictor relationships.

Purpose of the Study:

To develop a group screening procedure for ultra-high-dimensional data.
To enhance robustness against data contamination and heavy-tailed distributions.
To leverage the benefits of maximum Lq-likelihood estimation for variable selection.
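
For readers unfamiliar with the estimator named above, the following is the standard definition of the Lq-likelihood as it is usually given in the literature. The notation here (f, θ, q, x_i) is an assumption for illustration, since this summary does not reproduce the paper's own symbols.

```latex
% Lq function: a power-type deformation of the logarithm
L_q(u) =
\begin{cases}
  \dfrac{u^{1-q} - 1}{1 - q}, & q \neq 1, \\[6pt]
  \log u, & q = 1 .
\end{cases}

% Maximum Lq-likelihood estimator for a model density f(x; \theta)
\hat{\theta}_q = \arg\max_{\theta} \sum_{i=1}^{n} L_q\bigl(f(x_i; \theta)\bigr)

% Its estimating equation is a weighted score equation
\sum_{i=1}^{n} f(x_i; \theta)^{1-q} \, \nabla_{\theta} \log f(x_i; \theta) = 0
```

With q = 1 this reduces to ordinary maximum likelihood. With q slightly below 1, observations that the model assigns low density receive smaller weights f(x_i; θ)^{1-q}, which is the usual explanation for the robustness to contamination that the study aims for.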

Main Methods:

A novel group screening procedure based on maximum Lq-likelihood estimation.
Incorporation of predictor group structure information.
Rigorous establishment of the sure screening property.
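
The idea of screening predictors group by group can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes a linear model with Gaussian errors, fits each group marginally by a reweighted least-squares iteration with weights f^(1-q), and ranks groups by the gain in Lq-likelihood over an intercept-only fit. The function names (`lq`, `mlqe_fit`, `screen_groups`), the ranking utility, and the default q = 0.9 are all hypothetical choices made for this example.

```python
import numpy as np


def lq(u, q):
    """Lq function: (u**(1-q) - 1) / (1-q), which equals log(u) at q = 1."""
    u = np.asarray(u, dtype=float)
    if q == 1.0:
        return np.log(u)
    return (u ** (1.0 - q) - 1.0) / (1.0 - q)


def normal_pdf(r, sigma):
    """Density of N(0, sigma^2) evaluated at the residuals r."""
    return np.exp(-0.5 * (r / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def mlqe_fit(Xg, y, q=0.9, n_iter=50):
    """Fit y on the columns of Xg (plus an intercept) under Gaussian errors.

    Assumed scheme: solve the weighted score equations by fixed-point
    iteration, with weights w_i = f_i**(1-q). Returns the attained
    Lq-likelihood, i.e. the sum of lq(f_i, q) at the final iterate.
    """
    n = len(y)
    Z = np.hstack([np.ones((n, 1)), Xg])
    w = np.ones(n)  # first pass is ordinary least squares
    for _ in range(n_iter):
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(Z * sw[:, None], y * sw, rcond=None)
        r = y - Z @ beta
        sigma = np.sqrt(np.sum(w * r ** 2) / np.sum(w))
        w = normal_pdf(r, sigma) ** (1.0 - q)
    return float(np.sum(lq(normal_pdf(r, sigma), q)))


def screen_groups(X, y, groups, q=0.9, keep=2):
    """Rank groups by Lq-likelihood gain over the intercept-only model.

    `groups` holds one label per column of X. Returns the labels of the
    `keep` highest-ranked groups and the gain for every group.
    """
    groups = np.asarray(groups)
    labels = np.unique(groups)
    null_value = mlqe_fit(np.empty((len(y), 0)), y, q)
    gains = np.array(
        [mlqe_fit(X[:, groups == g], y, q) - null_value for g in labels]
    )
    order = np.argsort(-gains)
    return labels[order[:keep]], gains


if __name__ == "__main__":
    # Simulated data (an assumption, not from the study): 6 groups of 3
    # predictors, groups 0 and 3 active, 5% of responses shifted upward.
    rng = np.random.default_rng(1)
    n, n_groups, size = 300, 6, 3
    X = rng.standard_normal((n, n_groups * size))
    groups = np.repeat(np.arange(n_groups), size)
    beta = np.where(np.isin(groups, [0, 3]), 2.0, 0.0)
    y = X @ beta + rng.standard_normal(n)
    y[:15] += 20.0
    selected, gains = screen_groups(X, y, groups)
    print("selected groups:", selected)
    print("gains:", np.round(gains, 2))
```

Scoring whole groups rather than single columns is what lets the group structure enter the ranking. The number of groups to keep is left as a user choice here and is not tied to the sure screening theory mentioned above.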

Main Results:

The proposed method demonstrates robustness against data contamination.
Simulations show competitive performance compared to existing techniques.
The method effectively handles heavy-tailed distributions and mixed data observations.

Conclusions:

The developed group screening method offers a robust alternative for ultra-high-dimensional data analysis.
It effectively utilizes group structure for improved variable selection.
The method shows promise in real-world data applications, particularly with contaminated datasets.