Learning curve (machine learning)

Last revised by Andrew Murphy on 18 Apr 2021

A learning curve is a plot of a machine learning model's performance (usually measured as loss or accuracy) over the course of training (usually measured in epochs).

Learning curves are a widely used diagnostic tool in machine learning: they give an overview of the learning and generalization behavior of a model and help diagnose potential problems.

They should be evaluated both on the training dataset, to give an idea of how well the model is “learning”, and the validation dataset, to give an idea of how well the model is “generalizing.”
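As a minimal sketch of how the two curves can be recorded, the following example fits a straight line by gradient descent and logs the loss on both datasets after every epoch. The synthetic data, the 80/20 split, the learning rate, and the names `mse` and `train_with_history` are all assumptions made for illustration; they are not taken from any particular library.

```python
import random

def mse(w, b, data):
    """Mean squared error of the line y = w*x + b on a list of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

def train_with_history(train, val, epochs=50, lr=0.1):
    """Fit y = w*x + b by full-batch gradient descent, logging both losses per epoch."""
    w, b = 0.0, 0.0
    history = {"train_loss": [], "val_loss": []}
    n = len(train)
    for _ in range(epochs):
        # Gradients of the training MSE with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in train) / n
        w -= lr * grad_w
        b -= lr * grad_b
        # One point per epoch on each curve
        history["train_loss"].append(mse(w, b, train))
        history["val_loss"].append(mse(w, b, val))
    return history

# Hypothetical synthetic data: y = 2x + 1 plus Gaussian noise
rng = random.Random(0)
points = [(i / 100, 2.0 * (i / 100) + 1.0 + rng.gauss(0, 0.1)) for i in range(100)]
rng.shuffle(points)
train_set, val_set = points[:80], points[80:]

history = train_with_history(train_set, val_set)
for epoch in (0, 9, 49):
    print(epoch + 1, history["train_loss"][epoch], history["val_loss"][epoch])
```

Plotting `history["train_loss"]` and `history["val_loss"]` against the epoch number gives the two learning curves described above.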

The shape and dynamics of a learning curve can be used to characterize the learning behavior of a machine learning model and can in turn suggest the type of configuration changes that may be made to improve learning and/or performance. There are three common patterns that you are likely to observe in learning curves:

  • underfitting: poor performance on both the training set and the validation set (the validation score may match or even exceed the training score); the network is struggling to learn the patterns in the data
  • good fit: the performance on the validation set is close to that on the training set, though usually slightly lower
  • overfitting: high performance on the training set, significantly lower performance on the validation set; the network memorizes the training samples instead of learning the patterns, i.e. it struggles to generalize
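These patterns can be turned into a rough rule of thumb, sketched below for final accuracy values. The function name `diagnose_fit` and the thresholds `min_acc` and `max_gap` are hypothetical and chosen only for illustration, and the sketch assumes that low training accuracy is the sign of underfitting; in practice the whole curve should be inspected, not just its end point.

```python
def diagnose_fit(train_acc, val_acc, min_acc=0.8, max_gap=0.05):
    """Label a pair of final accuracies.

    min_acc and max_gap are illustrative thresholds (assumptions),
    not standard values; suitable values depend on the task.
    """
    if train_acc < min_acc:
        # The model has not even learned the training data well
        return "underfitting"
    if train_acc - val_acc > max_gap:
        # Large gap between training and validation performance
        return "overfitting"
    return "good fit"

print(diagnose_fit(0.62, 0.64))
print(diagnose_fit(0.93, 0.91))
print(diagnose_fit(0.99, 0.78))
```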

Keep in mind that some degree of overfitting is common even in effective models; the aim is to minimize it rather than to eliminate it entirely.

Many different techniques can be used to correct under- or overfitting tendencies in a machine learning model. More information about these can be found in the respective articles on underfitting and overfitting.
