They both fit Gaussians, just different ones! OLS fits a 1D Gaussian to the set of errors in the y coordinates only, whereas TLS (PCA) fits a 2D Gaussian to the set of all (x,y) pairs.
Yes, and if I remember correctly, you get the Gaussian because it's the minimum entropy (least additional assumptions about the shape) continuous distribution given a certain variance.
Both of these do, in a way. They just differ in which gaussian distribution they're fitting to.
And how I suppose. PCA is effectively moment matching, least squares is max likelihood. These correspond to the two ways of minimizing the Kullback Leibler divergence to or from a gaussian distribution.