Clustering of trend data using joinpoint regression models.

Hyune-Ju Kim¹, Jun Luo, Jeankyung Kim

¹Department of Mathematics, Syracuse University, Syracuse, NY, 13244, U.S.A.

Statistics in Medicine

June 5, 2014

Summary

This summary is machine-generated.

This study introduces a clustering method for two-dimensional data with piecewise linear mean functions, grouping series that share common characteristics such as slopes. The approach fits segmented line regression models by restricted least squares and uses permutation tests and the Bayes Information Criterion (BIC) to choose the number of segments and the number of clusters.

Keywords:

Bayes information criterion; clustering; joinpoint regression; minimum distance worth detecting; permutation test

Area of Science:

Statistics
Data Mining
Computational Statistics

Background:

Clustering two-dimensional data with piecewise linear mean functions presents challenges in identifying common characteristics.
Existing methods may not effectively handle segmented regression models with shared features across clusters.

Purpose of the Study:

To develop and validate a robust method for clustering two-dimensional data with piecewise linear mean functions.
To identify clusters exhibiting common characteristics, such as identical slopes, using segmented line regression models.
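As a rough illustration of what "identical slopes" means for a cluster, the sketch below fits several series by least squares under the restriction that they share one slope while keeping separate intercepts. It is a minimal example with a single linear segment, not the authors' algorithm; the function name `fit_common_slope` and the toy data are assumptions made for illustration.

```python
import numpy as np

def fit_common_slope(x, ys):
    """Least-squares fit of several series observed at the same x values,
    restricted to share one slope (hypothetical helper, one segment only)."""
    m, n = len(ys), len(x)
    # Design matrix: one intercept column per series, plus one shared slope column.
    X = np.zeros((m * n, m + 1))
    for i in range(m):
        X[i * n:(i + 1) * n, i] = 1.0
        X[i * n:(i + 1) * n, m] = x
    y = np.concatenate(ys)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ coef) ** 2))
    return coef[:m], coef[m], rss

# Toy data (made up): two noiseless series with the same slope, different intercepts.
x = np.arange(10.0)
y1 = 2.0 + 0.3 * x
y2 = -1.0 + 0.3 * x
intercepts, slope, rss = fit_common_slope(x, [y1, y2])
```

Comparing the residual sum of squares of this restricted fit with that of separate unrestricted fits gives one way to judge whether the series plausibly belong together.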

Main Methods:

Employs a restricted least squares method to fit segmented line regression models with common features.
Estimates the maximum number of segments using permutation tests and the Bayes Information Criterion (BIC).
Determines the optimal number of clusters using the Bayes Information Criterion (BIC).
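The segment-selection step can be sketched for a single series as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: it compares only zero versus one joinpoint, searches joinpoint locations over a grid, and uses one common form of BIC, n·ln(RSS/n) + p·ln(n), which the summary does not specify. The names `fit_joinpoint`, `bic`, and `select_joinpoint` are hypothetical, and the data are synthetic.

```python
import numpy as np

def fit_joinpoint(x, y, taus):
    """Least-squares fit of a continuous piecewise-linear mean with fixed joinpoints."""
    # Each hinge term max(x - tau, 0) lets the slope change at tau while keeping continuity.
    cols = [np.ones_like(x), x] + [np.maximum(x - t, 0.0) for t in taus]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return beta, rss

def bic(rss, n, n_params):
    """Assumed BIC form: n*ln(RSS/n) + p*ln(n)."""
    return n * np.log(rss / n) + n_params * np.log(n)

def select_joinpoint(x, y, candidates):
    """Return (best BIC, joinpoint); the joinpoint is None if a straight line wins."""
    n = len(x)
    _, rss0 = fit_joinpoint(x, y, [])
    best = (bic(rss0, n, 2), None)           # intercept + slope
    for t in candidates:
        _, rss1 = fit_joinpoint(x, y, [t])
        b = bic(rss1, n, 4)                  # + slope change + joinpoint location
        if b < best[0]:
            best = (b, t)
    return best

# Synthetic series: slope 0.5 up to x = 10, then slope -0.5, with a small perturbation.
x = np.arange(20.0)
y = 1.0 + 0.5 * x - np.maximum(x - 10.0, 0.0) + 0.01 * np.sin(7.0 * x)
best_bic, best_tau = select_joinpoint(x, y, candidates=x[2:-2])
```

Permutation tests, which the method also uses, are not shown here; the same `fit_joinpoint` helper could serve as the fitting step inside such a test.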

Main Results:

Proposes a measure of the minimum distance worth detecting, intended to make the clustering algorithm more effective.
Simulation results demonstrate the properties and effectiveness of the proposed clustering methods.
Proves the consistency of cluster grouping estimation for a given number of clusters.

Conclusions:

The proposed method effectively clusters two-dimensional data with piecewise linear mean functions.
The approach is applicable to both ordered and unordered independent variables, with a focus on cancer trend analysis.
The method provides a reliable framework for identifying clusters with shared statistical properties.