Variable selection for ultra-high dimensional quantile regression with missing data and measurement error.

Yongxin Bai¹, Maozai Tian¹,²,³, Man-Lai Tang⁴,⁵

  • ¹Center for Applied Statistics, School of Statistics, Renmin University of China, Beijing, China.

Statistical Methods in Medical Research | August 5, 2020
Summary
This summary is machine-generated.

This study introduces a novel method for variable selection in ultra-high dimensional quantile regression, addressing missing data and measurement errors. The approach ensures accurate model estimation and variable identification, even with complex data challenges.

Keywords: Atan penalty; HBIC criterion; quantile regression; measurement error; missing data

Area of Science:

  • Statistics
  • Biostatistics
  • Machine Learning

Background:

  • Ultra-high dimensional data presents challenges for traditional statistical models.
  • Missing data and measurement errors in covariates can bias regression results.
  • Quantile regression is valuable for understanding conditional distributions beyond the mean.
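
To make the last point concrete, here is a minimal sketch (not code from the study) of the check loss that underlies quantile regression. Minimizing this loss over a constant recovers a sample quantile rather than the mean, which is why quantile regression can describe other parts of the conditional distribution. The function names `check_loss` and `sample_quantile_by_loss` are illustrative assumptions, not names used by the authors.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def sample_quantile_by_loss(y, tau):
    """Return the observed value that minimizes the total check loss.

    Brute-force search over the data points; fine for illustration,
    not intended for large samples.
    """
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, tau).sum() for c in y]
    return y[int(np.argmin(losses))]

# With tau = 0.5 the minimizer is the median, which is not pulled
# toward the extreme value 100 the way the mean would be.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median_estimate = sample_quantile_by_loss(y, tau=0.5)
```

Changing `tau` (for example to 0.25 or 0.9) targets a different quantile with the same loss function.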

Purpose of the Study:

  • To develop a robust variable selection method for ultra-high dimensional quantile regression.
  • To address and correct for bias introduced by missing data and measurement errors.
  • To achieve simultaneous variable selection and parameter estimation.

Main Methods:

  • Orthogonal quantile regression to correct measurement error bias.
  • Inverse probability weighting to handle missing data.
  • Nonconvex Atan penalized estimation for variable selection and estimation.
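
Two of the ingredients listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the page does not give the penalty formula, so the code uses the form commonly cited for the Atan penalty, p(|b|) = lam * (gamma + 2/pi) * arctan(|b| / gamma), and a plain inverse-probability-weighted check loss in which each complete case is weighted by the reciprocal of its (assumed known) probability of being observed. The measurement-error correction via orthogonal quantile regression is not reproduced here. All function and parameter names are hypothetical.

```python
import numpy as np

def atan_penalty(beta, lam, gamma):
    """Atan penalty in its commonly cited form (assumed, not from the page).

    Equals 0 at beta = 0 and levels off near lam * (1 + gamma * pi / 2)
    for large |beta|, so large coefficients are not shrunk much further.
    """
    beta = np.asarray(beta, dtype=float)
    return lam * (gamma + 2.0 / np.pi) * np.arctan(np.abs(beta) / gamma)

def ipw_check_loss(y, X, beta, tau, observed, prob_observed):
    """Inverse-probability-weighted check loss over complete cases.

    observed:       1 if the row is fully observed, 0 otherwise.
    prob_observed:  probability that the row is fully observed
                    (assumed known here; in practice it is estimated).
    """
    mask = np.asarray(observed, dtype=bool)
    y = np.asarray(y, dtype=float)[mask]
    X = np.asarray(X, dtype=float)[mask]
    weights = 1.0 / np.asarray(prob_observed, dtype=float)[mask]
    resid = y - X @ np.asarray(beta, dtype=float)
    return float(np.sum(weights * resid * (tau - (resid < 0))))

def penalized_objective(y, X, beta, tau, observed, prob_observed, lam, gamma):
    """Weighted check loss plus the Atan penalty summed over coefficients."""
    loss = ipw_check_loss(y, X, beta, tau, observed, prob_observed)
    return loss + float(np.sum(atan_penalty(beta, lam, gamma)))
```

Dropping incomplete rows and reweighting the remaining ones is what lets the complete cases stand in for the full sample; the penalty term is what drives small coefficients to zero during selection.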

Main Results:

  • The proposed method achieves oracle properties under relaxed conditions.
  • Monte Carlo simulations demonstrate the method's effectiveness.
  • The method was successfully applied to a real-world breast cancer dataset.

Conclusions:

  • The developed procedure offers a reliable solution for variable selection in complex high-dimensional settings.
  • The method effectively handles both missing data and measurement errors.
  • The method provides a valuable tool for analyzing large-scale biological and medical datasets.