Variance estimation using refitted cross-validation in ultrahigh dimensional regression.

Jianqing Fan¹, Shaojun Guo, Ning Hao

  • ¹Princeton University, USA.

Journal of the Royal Statistical Society. Series B, Statistical Methodology | February 8, 2012
Summary
This summary is machine-generated.

Accurate variance estimation is crucial in ultrahigh dimensional linear regression. A new refitted cross-validation method reduces the underestimation of the noise variance caused by spurious correlations, and asymptotically it performs comparably to the oracle estimator.

Area of Science:

  • Statistics
  • Machine Learning
  • Econometrics

Background:

  • Variance estimation is essential in statistical modeling.
  • Traditional methods fail in ultrahigh dimensional linear regression, where the number of predictors far exceeds the number of samples.
  • With so many predictors, some irrelevant ones are highly correlated with the noise purely by chance; selecting them causes significant underestimation of the noise level.
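
The underestimation described above can be illustrated with a small simulation. The sketch below is not taken from the paper: the sample size, number of predictors, model size `s`, and the helper names `top_correlated` and `residual_variance` are all assumptions chosen for illustration. The response is pure noise, yet picking the predictors most correlated with it and refitting on the same data yields a residual variance well below the true value.

```python
import numpy as np

def top_correlated(X, y, s):
    """Indices of the s columns of X with the largest absolute sample correlation with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    scores = np.abs(Xc.T @ yc) / np.linalg.norm(Xc, axis=0)
    return np.argsort(scores)[-s:]

def residual_variance(X, y, idx):
    """OLS residual variance of y on the columns idx of X (no intercept, an assumption)."""
    Xs = X[:, idx]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return resid @ resid / (len(y) - len(idx))

rng = np.random.default_rng(0)
n, p, s, sigma2 = 50, 2000, 5, 1.0  # illustrative values, not from the source

estimates = []
for _ in range(20):
    X = rng.standard_normal((n, p))
    y = np.sqrt(sigma2) * rng.standard_normal(n)  # pure noise: no predictor is relevant
    idx = top_correlated(X, y, s)                 # selection on the full data
    estimates.append(residual_variance(X, y, idx))  # refit on the SAME data

naive_mean = float(np.mean(estimates))
print(f"true variance: {sigma2}, mean naive estimate: {naive_mean:.2f}")
```

Because the same observations are used both to choose the predictors and to compute the residuals, the chosen predictors absorb part of the noise, which is the bias the study sets out to remove.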

Purpose of the Study:

  • To develop a robust variance estimation method for ultrahigh dimensional linear regression.
  • To address the challenge of noise underestimation due to spurious correlations.
  • To improve the performance of existing estimation techniques.

Main Methods:

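  • A two-stage refitted procedure based on data splitting (refitted cross-validation): variables are selected on one half of the data, and the variance is estimated by refitting the selected model on the other half.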
  • Asymptotic analysis to evaluate the proposed method's performance.
  • Comparison with naive two-stage and plug-in one-stage estimators (LASSO, SCAD).
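
A minimal sketch of the data-splitting idea follows, under stated assumptions. Marginal correlation screening stands in here for the variable selector (the summary names LASSO and SCAD; the swap keeps the example NumPy-only). The function names `screen`, `refit_variance`, and `rcv_variance`, the fixed model size `s`, the no-intercept model, and the simulated data are hypothetical choices for illustration, not the authors' implementation.

```python
import numpy as np

def screen(X, y, s):
    """Stand-in selector: the s columns most correlated (in absolute value) with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    scores = np.abs(Xc.T @ yc) / np.linalg.norm(Xc, axis=0)
    return np.argsort(scores)[-s:]

def refit_variance(X, y, idx):
    """OLS residual variance of y on the columns idx of X (no intercept)."""
    Xs = X[:, idx]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return resid @ resid / (len(y) - len(idx))

def rcv_variance(X, y, s, rng):
    """Refitted cross-validation: select on one half, refit on the other, then average."""
    n = len(y)
    perm = rng.permutation(n)
    a, b = perm[: n // 2], perm[n // 2:]
    idx_a = screen(X[a], y[a], s)                # model chosen on half a
    idx_b = screen(X[b], y[b], s)                # model chosen on half b
    var_b = refit_variance(X[b], y[b], idx_a)    # half a's model refitted on half b
    var_a = refit_variance(X[a], y[a], idx_b)    # half b's model refitted on half a
    return 0.5 * (var_a + var_b)

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, p, s = 400, 1000, 5
beta = np.zeros(p)
beta[:3] = 3.0                                   # three relevant predictors
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)            # true noise variance is 1

rcv_estimate = rcv_variance(X, y, s, rng)
print(f"RCV variance estimate: {rcv_estimate:.2f}")
```

The design point is that any irrelevant predictor picked in the first stage was chosen for its chance correlation with the noise in one half only, so it has no such advantage when the model is refitted on the independent second half.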

Main Results:

  • The proposed refitted cross-validation method effectively attenuates the influence of irrelevant variables with high spurious correlations.
  • Asymptotic results demonstrate the procedure performs comparably to the oracle estimator.
  • Simulation studies validate the theoretical findings.

Conclusions:

  • The refitted cross-validation method offers a significant improvement for variance estimation in ultrahigh dimensional settings.
  • This approach enhances the performance of existing one-stage and two-stage estimators.
  • The method provides a reliable solution to the noise underestimation problem.