Related Concept Videos

Quantifying and Rejecting Outliers: The Grubbs Test

Sometimes, a data set can have a recorded numerical observation that greatly deviates from the rest of the data. Assuming that the data is normally distributed, a statistical method called the Grubbs test can be used to determine whether the observation is truly an outlier. To perform a two-tailed Grubbs test, first, calculate the absolute difference between the outlier and the mean. Then, calculate the ratio between this difference and the standard deviation of the sample. This...

Detection of Gross Error: The Q Test

When one or more data points appear far from the rest of the data, there is a need to determine whether they are outliers and whether they should be eliminated from the data set to ensure an accurate representation of the measured value. In many cases, outliers arise from gross errors (or human errors) and do not accurately reflect the underlying phenomenon. In some cases, however, these apparent outliers reflect true phenomenological differences. In these cases, we can use statistical methods...

Distributions to Estimate Population Parameter

The accurate values of population parameters such as population proportion, population mean, and population standard deviation (or variance) are usually unknown. These are fixed values that can only be estimated from the data collected from the samples. The estimates of each of these parameters are sample proportion, the sample mean, and sample standard deviation (or variance). To obtain the values of these sample statistics, data are required that have particular distribution and central...

Truncation in Survival Analysis

Truncation in survival analysis refers to the exclusion of individuals or events from the dataset based on specific criteria related to the time of the event. This exclusion can happen in two primary forms: left truncation and right truncation.
Left truncation occurs when individuals who experienced the event of interest before a certain time are not included in the study. This is often due to a "delayed entry" into the study where only those who survive until a certain entry point are...

Choosing Between z and t Distribution

The z and the Student t distribution estimate the population mean using the sample mean and standard deviation. However, to decide which distribution to use for a calculation, one needs to determine the sample size, the nature of the distribution, and whether the population standard deviation is known. If the population standard deviation is known and the population is normally distributed, or if the sample size is greater than 30, the z distribution is preferred. The Student t distribution is...

Outliers and Influential Points

An outlier is an observation that does not fit the rest of the data. It is sometimes called an extreme value. When you graph an outlier, it will appear not to fit the pattern of the graph. Some outliers are due to mistakes (for example, writing down 50 instead of 500), while others may indicate that something unusual is happening. Outliers lie far from the least-squares line in the vertical direction. They have large "errors," where the "error" or residual is the...

You might also read

Related Articles

Articles linked to this work by shared authors, journal, and citation graph.

Same author

Spatially Correlated Analysis of Infectious Disease Outcomes Based on Bayesian Functional Hierarchical Models.

Statistics in medicine·2026

Same author

Construction of confidence intervals for risk difference with paired correlated data using saddlepoint approximation.

Journal of biopharmaceutical statistics·2026

Same author

Partially Linear Additive Quantile Regression: Theory and Applications to Breast Cancer Patients' Survival.

Statistics in medicine·2026

Same author

The adaptive functional piecewise ordered weighted averaging method and its application to pollutant concentration analysis.

PloS one·2026

Same author

Joint modeling of composite quantile regression for multiple ordinal longitudinal data with its applications to a dementia dataset.

Statistical methods in medical research·2026

Same author

Functional varying-coefficient Cox model and its application.

Statistical methods in medical research·2026

Same journal

Asymptotic online FWER control for dependent test statistics.

Statistical methods in medical research·2026

Same journal

Regression analysis of misclassified current status data with potentially unknown test accuracy.

Statistical methods in medical research·2026

Same journal

Bayesian multivariate linear mixed-effects models with varied association structures.

Statistical methods in medical research·2026

Same journal

Inference about the ratio of age-standardized rates between two overlapping populations.

Statistical methods in medical research·2026

Same journal

A robust neural network with random effects for subject-specific prediction of clustered count data.

Statistical methods in medical research·2026

Same journal

A comparison of methods for designing hybrid type 2 cluster-randomized trials with continuous effectiveness and implementation endpoints.

Statistical methods in medical research·2026

Related Experiment Video

Updated: Dec 13, 2025

Selecting Multiple Biomarker Subsets with Similarly Effective Binary Classification Performances

Published on: October 11, 2018

Variable selection for ultra-high dimensional quantile regression with missing data and measurement error.

Yongxin Bai¹, Maozai Tian¹,²,³, Man-Lai Tang⁴,⁵

¹Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, China.

Statistical Methods in Medical Research

August 5, 2020

Summary

This summary is machine-generated.

This study introduces a novel method for variable selection in ultra-high dimensional quantile regression, addressing missing data and measurement errors. The approach ensures accurate model estimation and variable identification, even with complex data challenges.

Keywords:

Atan penalty; HBIC criterion; quantile regression; measurement error; missing data

More Related Videos

Establishing a Competing Risk Regression Nomogram Model for Survival Data

Published on: October 23, 2020

A Machine Learning Approach to Design an Efficient Selective Screening of Mild Cognitive Impairment

Published on: January 11, 2020

Area of Science:

Statistics
Biostatistics
Machine Learning

Background:

Ultra-high dimensional data presents challenges for traditional statistical models.
Missing data and measurement errors in covariates can bias regression results.
Quantile regression is valuable for understanding conditional distributions beyond the mean.
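
To illustrate the last point: quantile regression replaces squared error with the check loss ρ_τ(u) = u(τ − 1{u < 0}), whose minimizer is the τ-th quantile rather than the mean. The sketch below is not taken from the paper. The toy data, the grid search, and the function names are illustrative assumptions, chosen only to show that a single extreme value moves the mean but not the fitted median.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def best_constant(y, tau, grid):
    """Return the grid value c that minimizes the mean check loss of y - c."""
    losses = [check_loss(y - c, tau).mean() for c in grid]
    return grid[int(np.argmin(losses))]

# Toy data (illustrative only): one large value pulls the mean upward.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
grid = np.arange(0.0, 101.0)  # candidate constants 0, 1, ..., 100

median_fit = best_constant(y, 0.5, grid)  # tau = 0.5 targets the median
upper_fit = best_constant(y, 0.9, grid)   # tau = 0.9 targets the upper tail

print("mean:", y.mean(), "tau=0.5 fit:", median_fit, "tau=0.9 fit:", upper_fit)
```

Varying τ in this way is what lets quantile regression describe different parts of a conditional distribution instead of only its center.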

Purpose of the Study:

To develop a robust variable selection method for ultra-high dimensional quantile regression.
To address and correct for bias introduced by missing data and measurement errors.
To achieve simultaneous variable selection and parameter estimation.

Main Methods:

Orthogonal quantile regression to correct measurement error bias.
Inverse probability weighting to handle missing data.
Nonconvex Atan penalized estimation for variable selection and estimation.
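
The three ingredients above can be combined into a single penalized objective. The following is a minimal sketch under stated assumptions, not the authors' implementation: the Atan penalty is written in the commonly used form λ(γ + 2/π)·arctan(|β|/γ), the orthogonal correction is assumed to rescale the residual by sqrt(1 + ‖β‖²), and the observation probabilities are assumed to be known or estimated elsewhere. All function and parameter names are hypothetical.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def atan_penalty(beta, lam, gamma):
    """Assumed Atan penalty form, applied elementwise to the coefficients."""
    beta = np.asarray(beta, dtype=float)
    return lam * (gamma + 2.0 / np.pi) * np.arctan(np.abs(beta) / gamma)

def ipw_weights(observed, prob_observed):
    """Inverse probability weights delta_i / pi_i (zero for incomplete cases)."""
    observed = np.asarray(observed, dtype=float)
    return observed / np.asarray(prob_observed, dtype=float)

def penalized_objective(beta, X, y, tau, weights, lam, gamma):
    """Weighted orthogonal check loss plus Atan penalty (illustrative form)."""
    beta = np.asarray(beta, dtype=float)
    # Assumed orthogonal residual: vertical residual scaled by sqrt(1 + ||beta||^2).
    resid = (y - X @ beta) / np.sqrt(1.0 + beta @ beta)
    # Rows with zero weight contribute nothing, even if they hold NaN placeholders.
    loss = np.where(weights > 0, weights * check_loss(resid, tau), 0.0)
    return loss.mean() + atan_penalty(beta, lam, gamma).sum()

# Hypothetical toy inputs: the second row is treated as an incomplete case.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([1.0, np.nan, 3.0])
w = ipw_weights([1, 0, 1], [0.5, 0.5, 1.0])
value = penalized_objective(np.array([1.0]), X, y, tau=0.5, weights=w, lam=0.0, gamma=0.005)
```

In this form the penalty is bounded above by λ(1 + γπ/2), so large coefficients are penalized almost equally. Because the objective is nonconvex, it would need a dedicated optimizer in practice; none is shown here.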

Main Results:

The proposed method achieves oracle properties under relaxed conditions.
Demonstrated effectiveness through Monte Carlo simulations.
Successful application to a real-world breast cancer dataset.

Conclusions:

The developed procedure offers a reliable solution for variable selection in complex high-dimensional settings.
The method effectively handles both missing data and measurement errors.
Provides a valuable tool for analyzing large-scale biological and medical datasets.