Related Concept Videos

Residuals and Least-Squares Property

A residual is the vertical distance between the actual value of y and the estimated value of y. In other words, it measures the vertical distance between the actual data point and the predicted point on the line.
If the observed data point lies above the line, the residual is positive, and the line underestimates the actual data value for y. If the observed data point lies below the line, the residual is negative, and the line overestimates the actual data value for y.
The process of fitting the best-fit...

Multiple Regression

Multiple regression assesses a linear relationship between one response or dependent variable and two or more independent variables. It has many practical applications.
Farmers can use multiple regression to determine the crop yield based on more than one factor, such as water availability, fertilizer, soil properties, etc. Here, the crop yield is the response or dependent variable as it depends on the other independent variables. The analysis requires the construction of a scatter plot...

Regression Analysis

Regression analysis is a statistical tool that describes a mathematical relationship between a dependent variable and one or more independent variables.
In regression analysis, a regression equation is determined from the line of best fit, the line that most closely fits the data points plotted in a graph. This line is also called the regression line, and its algebraic equation is called the regression equation. It is represented as:

Correlation and Regression

In statistics, correlation describes the degree of association between two variables. In the subfield of linear regression, correlation is mathematically expressed by the correlation coefficient, which describes the strength and direction of the relationship between two variables. The coefficient is symbolically represented by 'r' and ranges from -1 to +1. A positive value indicates a positive correlation where the two variables move in the same direction. A negative value suggests a...

Regression Toward the Mean

Regression toward the mean (“RTM”) is a phenomenon in which extremely high or low values (for example, an individual’s blood pressure at a particular moment) appear closer to a group’s average upon remeasuring. Although this statistical peculiarity is the result of random error and chance, it has been problematic across various medical, scientific, financial and psychological applications. In particular, RTM, if not taken into account, can interfere when...

Goodness-of-Fit Test

The goodness-of-fit test is a type of hypothesis test which determines whether the data "fits" a particular distribution. For example, one may suspect that some anonymous data may fit a binomial distribution. A chi-square test (meaning the distribution for the hypothesis test is chi-square) can be used to determine if there is a fit. The null and alternative hypotheses may be written in sentences or stated as equations or inequalities. The test statistic for a goodness-of-fit test is given as...

Related Experiment Video

Updated: Jun 13, 2025

Development of an Individual-Tree Basal Area Increment Model using a Linear Mixed-Effects Approach

Published on: July 3, 2020

Are Latent Factor Regression and Sparse Regression Adequate?

Jianqing Fan¹, Zhipeng Lou², Mengxin Yu³

¹Frederick L. Moore '18 Professor of Finance, Professor of Statistics, and Professor of Operations Research and Financial Engineering at Princeton University.

Journal of the American Statistical Association

September 13, 2024

Summary

This summary is machine-generated.

We introduce the Factor Augmented Regression Model (FARM), unifying dimension reduction and sparse regression. Our model and tests demonstrate robustness and effectiveness for high-dimensional data analysis.

Keywords:

Factor model High-dimensional Inference Hypothesis Robustness Sparse linear regression

More Related Videos

Applying an eMASS Customization Program as a Research Tool to Evaluate Consumer Benefits

Published on: September 27, 2019

Using Cholesky Decomposition to Explore Individual Differences in Longitudinal Relations between Reading Skills

Published on: September 17, 2019

Area of Science:

Statistics
Econometrics
Machine Learning

Background:

Existing supervised learning models often assume either latent factor regression or sparse linear regression without validation.
A gap exists in high-dimensional inference for testing the adequacy of these underlying models.

Purpose of the Study:

To propose a novel Factor Augmented (sparse linear) Regression Model (FARM) that integrates dimension reduction and sparse regression.
To develop theoretical guarantees for FARM estimation under various noise conditions.
To introduce methods for testing the sufficiency of latent factor and sparse linear regression models.
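
For orientation, a factor-augmented sparse regression of this kind is commonly written as below. This is a sketch in assumed notation, not a formula quoted from this summary: x is the observed covariate vector, f the latent factors, B the loading matrix, u the idiosyncratic component, and γ, β, ε the factor coefficients, sparse coefficients, and noise.

```latex
\begin{aligned}
\mathbf{x} &= \mathbf{B}\mathbf{f} + \mathbf{u}, \\
Y &= \mathbf{f}^{\top}\boldsymbol{\gamma} + \mathbf{u}^{\top}\boldsymbol{\beta} + \varepsilon,
\qquad \boldsymbol{\beta} \text{ sparse}.
\end{aligned}
% Special cases under this notation:
%   \beta = 0            gives latent factor regression,  Y = f^T \gamma + \varepsilon
%   \gamma = B^T \beta   gives sparse linear regression,  Y = x^T \beta + \varepsilon
```

Under this notation, both models whose adequacy is being tested are restrictions of the larger model, which is what makes a test of sufficiency possible: setting β = 0 leaves only the factor part, while γ = Bᵀβ collapses the two terms into xᵀβ.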

Main Methods:

Factor Augmented (sparse linear) Regression Model (FARM) formulation.
Theoretical analysis for model estimation with sub-Gaussian and heavy-tailed noises.
Factor-Adjusted deBiased Test (FabTest) and a two-stage ANOVA-type test for model adequacy.
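
To make the estimation idea concrete, here is a minimal NumPy sketch of one plausible two-step fit: estimate the factors by principal components, then run an L1-penalized regression on the estimated idiosyncratic components. It is an illustration under stated assumptions, not the authors' implementation. The function names (`simulate_farm`, `estimate_factors`, `lasso_ista`, `fit_farm`), the simulated data, the known number of factors `k`, and the penalty level `lam` are all hypothetical choices, and a plain proximal-gradient (ISTA) lasso stands in for whatever solver and tuning the paper uses.

```python
import numpy as np


def simulate_farm(n=300, p=100, k=2, seed=0):
    """Simulate x = B f + u and y = f'gamma + u'beta + noise (illustrative values)."""
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(p, k))       # factor loadings
    F = rng.normal(size=(n, k))       # latent factors
    U = rng.normal(size=(n, p))       # idiosyncratic components
    X = F @ B.T + U
    gamma = np.ones(k)
    beta = np.zeros(p)
    beta[:3] = [2.0, -2.0, 1.5]       # sparse signal on the first three coordinates
    y = F @ gamma + U @ beta + 0.5 * rng.normal(size=n)
    return X, y, beta


def estimate_factors(X, k):
    """PCA estimate: F_hat spans the top-k left singular vectors, U_hat is the remainder."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    left, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F_hat = np.sqrt(n) * left[:, :k]  # normalized so that F_hat' F_hat / n = I
    B_hat = Xc.T @ F_hat / n
    U_hat = Xc - F_hat @ B_hat.T      # orthogonal to F_hat by construction
    return F_hat, U_hat


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_ista(A, r, lam, n_iter=500):
    """Minimize (1/2n)||r - A b||^2 + lam * ||b||_1 by proximal gradient descent."""
    n, p = A.shape
    step = n / (np.linalg.norm(A, 2) ** 2)  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = -A.T @ (r - A @ b) / n
        b = soft_threshold(b - step * grad, step * lam)
    return b


def fit_farm(X, y, k, lam):
    """Unpenalized factor coefficients plus lasso on the idiosyncratic part."""
    F_hat, U_hat = estimate_factors(X, k)
    yc = y - y.mean()
    # F_hat and U_hat are orthogonal, so the joint least-squares problem decouples.
    gamma_hat = F_hat.T @ yc / len(yc)  # identified only up to a rotation of the factors
    beta_hat = lasso_ista(U_hat, yc - F_hat @ gamma_hat, lam)
    return gamma_hat, beta_hat


if __name__ == "__main__":
    X, y, beta_true = simulate_farm()
    gamma_hat, beta_hat = fit_farm(X, y, k=2, lam=0.2)
    print("largest |beta_hat| indices:", np.argsort(-np.abs(beta_hat))[:3])
```

Regressing on the estimated idiosyncratic components rather than on the raw covariates is the point of the design: the shared factors induce strong correlation among the columns of X, which is the setting in which a plain lasso tends to struggle.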

Main Results:

Theoretical guarantees established for FARM estimation under diverse noise distributions.
The proposed tests effectively assess the sufficiency of latent factor and sparse linear regression models.
Numerical experiments confirm FARM's robustness and effectiveness compared to existing models.

Conclusions:

FARM offers a unified framework for dimension reduction and sparse regression.
The developed tests provide crucial tools for model selection in high-dimensional settings.
In the empirical evaluations, the proposed methods are reported to be more robust and effective than the existing models they are compared against.