Linear regression

Case contributed by Dr Candace Makeda Moore

Presentation

Different data sets that have the same linear regression.

Diagram

Three different data sets and their corresponding linear regression function.

Case Discussion

This illustration shows three data sets, each in a different color, that all yield the same linear regression function, shown as a red line. Data sets with very different characteristics can therefore map to a single linear regression.
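The point can be reproduced numerically. The following is a minimal sketch with synthetic data, not the data behind the illustration: the helper name `with_fixed_fit`, the three patterns, and the target line y = 3 + 0.5x are all assumptions chosen for illustration. It strips the straight-line component out of each pattern and then adds the target line back, so every data set has the same least-squares fit despite looking very different.

```python
import numpy as np

def with_fixed_fit(x, pattern, intercept, slope):
    """Hypothetical helper: keep the shape of `pattern` but force the
    least-squares line of the result on x to be intercept + slope * x."""
    design = np.column_stack([np.ones_like(x), x])
    # Fit a straight line to the pattern and keep only what the line misses.
    coef, *_ = np.linalg.lstsq(design, pattern, rcond=None)
    residual = pattern - design @ coef
    # The residual is orthogonal to the constant and to x, so adding it
    # does not change the fitted intercept or slope.
    return intercept + slope * x + residual

rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 40)

# Three illustrative patterns (assumed, not taken from the case image).
patterns = {
    "noisy line": rng.normal(0.0, 0.5, x.size),
    "flattening curve": 4.0 * np.sqrt(x),
    "single outlier": np.where(np.arange(x.size) == 30, 8.0, 0.0),
}

fits = {}
for name, pattern in patterns.items():
    y = with_fixed_fit(x, pattern, intercept=3.0, slope=0.5)
    slope, intercept = np.polyfit(x, y, 1)  # returns highest degree first
    fits[name] = (slope, intercept)
    print(f"{name}: y = {intercept:.3f} + {slope:.3f} x")
```

Plotting each `y` against `x` would show three visibly different clouds of points sharing one fitted line, which is why the fitted coefficients alone say little about whether a linear model suits the data.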

Strictly speaking, many statisticians would consider the linear regression model inappropriate for data that do not meet certain criteria, including an approximately linear relationship between two continuous variables. Here, only the blue data would be truly appropriate for a linear regression model.

The blue data are best represented by a true first-degree polynomial and therefore match the linear regression line. The yellow data are better represented by a curve that rises more slowly than a straight line, such as a power function with an exponent between zero and one. The pink data show almost no variance.

The idea that radically different data sets can share many statistical characteristics, including the same linear regression, was elegantly illustrated by the statistician Francis John Anscombe in a quartet of data sets and graphs now called Anscombe's quartet.
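Anscombe's quartet is small enough to check directly. The sketch below uses the published quartet values with NumPy; the variable names are illustrative choices. It fits a least-squares line to each of the four data sets and also reports the correlation coefficient.

```python
import numpy as np

# Anscombe's quartet: sets I-III share the same x values, set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {}
for name, (x, y) in quartet.items():
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # least-squares straight line
    r = np.corrcoef(x, y)[0, 1]             # Pearson correlation
    results[name] = (slope, intercept, r)
    print(f"Set {name}: y = {intercept:.2f} + {slope:.2f} x, r = {r:.2f}")
```

All four sets give a line close to y = 3.00 + 0.50x, even though only the first resembles a noisy linear relationship when plotted.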


Case information

rID: 68900
Published: 19th Jun 2019
Last edited: 14th Aug 2019
System: Chest
Inclusion in quiz mode: Included
