Linear regression

Case contributed by Dr Candace Makeda Moore


Different data sets that have the same linear regression.


Three different data sets and their corresponding linear regression function.

Case Discussion

This illustration shows three different data sets in different colors that all give the same linear regression function. The linear regression function is shown as a red line. The illustration shows that in spite of very different characteristics, data sets can map to the same single linear regression.
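One way to see why this can happen is to build such data sets deliberately. The sketch below is a minimal illustration, not the method used to draw the figure: the line y = 1 + 2x, the three residual patterns, and the helper names `fit_line` and `detrend` are all assumptions made up for this example. Each residual pattern has its own linear trend removed, so adding it to the line leaves the least-squares fit unchanged.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def detrend(xs, rs):
    """Remove the best-fit line from rs, leaving residuals with zero trend."""
    slope, intercept = fit_line(xs, rs)
    return [r - (intercept + slope * x) for x, r in zip(xs, rs)]


# Hypothetical target line and x values, chosen only for illustration.
SLOPE, INTERCEPT = 2.0, 1.0
xs = list(range(10))

# Three very different residual shapes (all invented for this sketch).
patterns = {
    "tight": [0.1 * (-1) ** i for i in xs],           # hugs the line
    "curved": [-((x - 4.5) ** 2) for x in xs],        # systematic curvature
    "scattered": [3, -2, 4, -5, 1, -3, 5, -1, 2, -4],  # large scatter
}

datasets = {
    name: [INTERCEPT + SLOPE * x + r for x, r in zip(xs, detrend(xs, rs))]
    for name, rs in patterns.items()
}

fits = {name: fit_line(xs, ys) for name, ys in datasets.items()}
for name, (slope, intercept) in fits.items():
    print(f"{name}: y = {intercept:.3f} + {slope:.3f}x")
```

All three fits should agree with the chosen line up to floating-point rounding, even though the underlying points look nothing alike when plotted.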

Strictly speaking, many statisticians would consider the linear regression model inappropriate for data that do not meet certain criteria, including an approximately linear relationship between two continuous variables. Here, only the blue data would be truly appropriate for a linear regression model.

The blue data are best represented as a true first-degree polynomial and therefore match the linear regression line. The yellow data are best represented by a power function with an exponent between zero and one. The pink data show almost no variance.

The idea that radically different data sets can have many of the same statistical characteristics, including the same linear regression, was elegantly illustrated by the statistician Francis John Anscombe in a quartet of data sets and graphs now called Anscombe’s quartet.
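The quartet can be checked numerically. The sketch below uses the commonly reproduced values of Anscombe's four data sets (they are not taken from this case) and a small helper, `least_squares`, whose name is an assumption for this example. It fits each data set by ordinary least squares and rounds the coefficients to two decimal places.

```python
from statistics import mean

# Commonly reproduced values of Anscombe's quartet.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                  6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                   6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
                5.25, 12.50, 5.56, 7.91, 6.89]),
}


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


results = {}
for label, (xs, ys) in quartet.items():
    slope, intercept = least_squares(xs, ys)
    results[label] = (round(slope, 2), round(intercept, 2))
    print(f"Data set {label}: y = {intercept:.2f} + {slope:.2f}x")
```

To two decimal places, every data set gives the same slope and intercept, which is why plotting the points remains the practical way to tell them apart.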

