Regression Diagnostic under Model Misspecification.

Li-Chu Chien¹, Tsung-Shan Tsou²

¹Division of Biostatistics and Bioinformatics, National Health Research Institutes, Taiwan.

Journal of Applied Statistics

May 31, 2024

Summary

This summary is machine-generated.

We introduce two new diagnostics for detecting influential data points in linear regression. They flag observations whose presence substantially alters the likelihood function, which makes regression analysis more reliable.

Keywords:

Cook's distance, DFBETAS, DFFITS, influential diagnostic, robust likelihood, robust normal regression

Area of Science:

Statistics
Econometrics
Data Science

Background:

Traditional regression diagnostics assess data point influence by deletion effects.
Existing methods focus on parameter estimates or predicted values.
Limitations exist in current diagnostics for identifying subtle influential observations.
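
For orientation, the traditional deletion diagnostics mentioned above can be sketched in a few lines. The example below is a minimal illustration of Cook's distance and DFFITS for a one-predictor least-squares fit, using the standard textbook formulas; it is not the method proposed in the paper. The function name `deletion_diagnostics` and the toy data are hypothetical and chosen only for illustration.

```python
import math

def deletion_diagnostics(x, y):
    """Cook's distance and DFFITS for the simple linear regression y = a + b*x."""
    n, p = len(x), 2  # p = number of fitted coefficients (intercept and slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Leverage of each point (diagonal of the hat matrix for one predictor)
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    sse = sum(e ** 2 for e in residuals)
    s2 = sse / (n - p)  # residual variance from the full fit

    cooks, dffits = [], []
    for e, h in zip(residuals, leverage):
        # Cook's distance: scaled shift in all fitted values when point i is deleted
        cooks.append(e ** 2 * h / (p * s2 * (1 - h) ** 2))
        # Residual variance with point i deleted, obtained without refitting
        s2_del = (sse - e ** 2 / (1 - h)) / (n - p - 1)
        t = e / math.sqrt(s2_del * (1 - h))  # externally studentized residual
        # DFFITS: scaled shift in the fitted value at point i itself
        dffits.append(t * math.sqrt(h / (1 - h)))
    return cooks, dffits

# Hypothetical toy data: the last point sits far from the trend of the others
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 10.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.1, 4.0]
cooks, dffits = deletion_diagnostics(x, y)
most_influential = max(range(len(x)), key=lambda i: cooks[i])
print("Most influential index by Cook's distance:", most_influential)
```

Both quantities measure what changes when a point is removed: Cook's distance tracks the shift in all fitted values, DFFITS the shift in the point's own fitted value. This is the deletion-based view that the study sets out to complement.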

Purpose of the Study:

To propose novel diagnostic measures for influential observations in linear regression.
To offer alternative methods beyond deletion diagnostics.
To enhance the robustness of regression parameter estimation.

Main Methods:

Developed two new diagnostic statistics for influential observations.
Focused on the impact of data point inclusion on the likelihood function.
Utilized asymptotic properties for broad applicability.

Main Results:

The proposed methods identify influential observations based on likelihood function changes.
These diagnostics are asymptotically valid for underlying distributions with finite second moments.
The approach offers a new perspective on detecting influential data points in regression.

Conclusions:

The novel diagnostics provide a valuable tool for assessing influential observations in linear regression.
These methods complement existing diagnostics by focusing on likelihood.
They enhance the reliability and interpretability of regression models.