Feature scaling

Last revised by Henry Knipe on 20 May 2019

Feature scaling is a preprocessing technique used to standardize the range of values in data features so that all features are on a similar scale. It is used when the range of a feature differs widely from that of other features or contains extreme values, because many algorithms perform poorly on raw, unscaled data. Feature scaling can also allow algorithms such as gradient descent to converge much faster.

Feature scaling typically involves dividing every value of a feature by the range of that feature (maximum value - minimum value), so that the scaled values span a range of exactly 1.
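As an illustration, the division by the range described above could be sketched in Python as follows. The function name scale_by_range and the sample heights are hypothetical and chosen only for this example; the sketch assumes a single numeric feature whose values are not all identical.

```python
def scale_by_range(values):
    """Divide every value by the feature's range (maximum - minimum)."""
    value_range = max(values) - min(values)
    if value_range == 0:
        # A constant feature has zero range, so the division is undefined.
        raise ValueError("cannot scale a feature whose values are all equal")
    return [v / value_range for v in values]


# Hypothetical sample data: heights in centimetres.
heights_cm = [150.0, 160.0, 170.0, 200.0]
scaled_heights = scale_by_range(heights_cm)

# The scaled values now span a range of 1 (up to floating-point rounding).
scaled_range = max(scaled_heights) - min(scaled_heights)
print(scaled_heights, scaled_range)
```

Note that dividing by the range alone fixes the spread of the values but not their location: the scaled values here still lie well away from 0.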

Mean normalization

Feature scaling is often combined with mean normalization, which subtracts the mean from every value in a data set, resulting in a new mean of 0.

Combining feature scaling and mean normalization gives a new feature z:

  • z = (x - μ) / s

where x is the original feature value, μ is the mean of that feature, and s is its range (maximum value - minimum value).
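The formula above could be applied to a small data set as in the following Python sketch. The function name mean_normalize and the sample ages are hypothetical, and s is taken to be the range as defined in the text (some sources use the standard deviation instead, which is not shown here).

```python
def mean_normalize(values):
    """Return z = (x - mu) / s for each x, with s the range of the values."""
    mu = sum(values) / len(values)   # mean of the feature
    s = max(values) - min(values)    # range of the feature
    if s == 0:
        # A constant feature has zero range, so z is undefined.
        raise ValueError("cannot normalize a feature whose values are all equal")
    return [(x - mu) / s for x in values]


# Hypothetical sample data: ages in years (mean 40, range 50).
ages = [20.0, 30.0, 40.0, 70.0]
z = mean_normalize(ages)

# After the transform the mean is approximately 0 and the range is 1.
z_mean = sum(z) / len(z)
z_range = max(z) - min(z)
print(z, z_mean, z_range)
```

For this sample the smallest value maps to -0.4 and the largest to 0.6, so the transformed feature is centred on 0 and spans a range of 1.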
