Principal component analysis

Last revised by Candace Makeda Moore on 6 Mar 2020

Principal component analysis is a mathematical transformation that can be understood in two parts:

  1. the transformation maps multivariable data (N_old dimensions) into a new coordinate system (N_new dimensions, where N_new is at most N_old) with minimal loss of information.
  2. data projected onto the first dimension of the new coordinate system, also known as the first principal component, has the greatest variance. Data projected onto the second dimension has the second greatest variance, and so on for each subsequent component.
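
The two properties above can be illustrated with a minimal sketch, assuming NumPy is available and that PCA is computed from the eigenvectors of the sample covariance matrix (one common approach, not the only one). The function name `pca`, the toy data, and all variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

def pca(data, n_new):
    """Project data (n_samples x n_old) onto its first n_new principal components."""
    # Centre each variable so the covariance describes spread around the mean.
    centered = data - data.mean(axis=0)
    # Sample covariance matrix of the variables (n_old x n_old).
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Reorder so the direction of greatest variance comes first.
    order = np.argsort(eigvals)[::-1]
    components = eigvecs[:, order[:n_new]]
    return centered @ components, eigvals[order[:n_new]]

# Hypothetical toy data: 500 samples of 3 correlated variables.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3)) * np.array([5.0, 2.0, 0.5])
data = latent @ rng.normal(size=(3, 3))

# Map N_old = 3 dimensions to N_new = 2 dimensions.
projected, variances = pca(data, n_new=2)

# Variance of the data along each new axis; the first is the largest.
projected_var = projected.var(axis=0, ddof=1)
print(projected.shape, projected_var)
```

In this sketch the variance measured along each new axis equals the corresponding eigenvalue of the covariance matrix, which is why sorting the eigenvalues in decreasing order yields the ordering of the principal components.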

PCA is useful as a feature extraction method because it can reduce complex multivariable data to fewer dimensions (e.g. 100 dimensions to 10 dimensions) with little loss of the information that characterizes the data, provided that the retained components capture most of the variance.
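
As a rough illustration of such a reduction, the sketch below builds synthetic data with 100 variables that are driven by 10 underlying factors plus a small amount of noise, then keeps the first 10 principal components. It assumes NumPy and uses the singular value decomposition of the centred data; the data, the noise level, and all names are assumptions made for the example, and real data will rarely compress this cleanly.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_old, n_new = 200, 100, 10

# Synthetic data: 100 observed variables generated from 10 hidden factors.
latent = rng.normal(size=(n_samples, n_new))
loadings = rng.normal(size=(n_new, n_old))
data = latent @ loadings + 0.01 * rng.normal(size=(n_samples, n_old))

# Centre the data, then take the SVD; rows of vt are the principal directions.
centered = data - data.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Keep the first 10 components as the extracted features.
reduced = centered @ vt[:n_new].T

# Fraction of the total variance retained by those 10 components.
retained = (s[:n_new] ** 2).sum() / (s ** 2).sum()
print(reduced.shape, retained)
```

The retained fraction gives a simple check on how much information the reduction discards; a value close to 1 indicates that the dropped components carried little variance.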
