Principal Component Analysis (PCA) is a latent-space visualization technique for linear pattern extraction and dimensionality reduction. It transforms high-dimensional data into a lower-dimensional space while preserving as much variance as possible.

For a two-dimensional example mapping onto a line, you can think of PCA as finding the best line to project the data onto, such that the distance from the points to the line is minimized. That "line" is the largest principal component of the distribution.

In the context of linear algebra, the first principal component is the eigenvector corresponding to the largest eigenvalue of the data's covariance matrix. This eigenvector points in the direction of maximum variance in the data.
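As a minimal sketch of this idea (assuming NumPy; the dataset and variable names here are illustrative, not from the original), the first principal component can be found by eigendecomposing the covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-D data with one dominant direction of variance
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])

# Covariance matrix of the variables (columns of X)
cov = np.cov(X, rowvar=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# The first principal component is the eigenvector paired with the largest eigenvalue
first_pc = eigenvectors[:, -1]
print("first principal component:", first_pc)
```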


Computation

The first step in Principal Component Analysis is to compute the covariance matrix of the data.

This is done by first centering your data: compute the mean of every variable, then subtract that mean from each data point.
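In symbols, a minimal sketch (the notation is assumed here: $x_{ij}$ is the value of variable $j$ for data point $i$, and there are $n$ data points):

$$\tilde{x}_{ij} = x_{ij} - \mu_j, \qquad \mu_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}$$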

Then calculate the covariance matrix of the centered data:
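One common way to write this (assuming the centered data points are stacked as the rows of an $n \times d$ matrix $\tilde{X}$) is

$$C = \frac{1}{n-1}\,\tilde{X}^{\top}\tilde{X}$$

A corresponding NumPy sketch of both steps (the function name is illustrative):

```python
import numpy as np

def covariance_matrix(X):
    """Center the columns of X, then form the sample covariance matrix."""
    X_centered = X - X.mean(axis=0)    # subtract each variable's mean
    n = X.shape[0]                     # number of data points
    return X_centered.T @ X_centered / (n - 1)
```

As a sanity check, the result should match np.cov(X, rowvar=False).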