In this article, I show that the normal equations define the orthogonal projection of a vector onto a linear subspace.
Setup
Let S be a vector space and 𝒳⊆S a linear subspace. Let X be a matrix whose column vectors form a basis for 𝒳, i.e. 𝒳=span(X), the span of the columns of X. Each vector in 𝒳 can be expressed as a linear combination of the column vectors of X:
∀→v∈𝒳, ∃→w: →v=X→w
The orthogonal projection
Given a vector →y∈S, the orthogonal projection of →y onto 𝒳 is the vector →y∥ in 𝒳 such that the residual →y⊥=→y−→y∥ is orthogonal to every vector in 𝒳:
- (→y∥)∈𝒳,
- →y=(→y∥)+(→y⊥),
- ∀→w, (→y⊥)⊥(X→w), where →w ranges over all coefficient vectors (one entry per column of X).
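To make the three conditions concrete, here is a small numerical sketch. The example is my own choice, not part of the derivation: I assume S is R^3 with the usual dot product, take the xy-plane as the subspace, and guess the projection by dropping the z-component. Sampling random coefficient vectors only illustrates the third condition, it does not prove it.

```python
import numpy as np

# Hypothetical example: S = R^3 and the subspace is the xy-plane.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])          # columns form a basis of the subspace

y = np.array([1.0, 2.0, 3.0])
y_par = np.array([1.0, 2.0, 0.0])   # candidate projection: drop the z-component
y_perp = y - y_par                  # residual

# Condition 1: y_par lies in the subspace, i.e. y_par = X w for some w.
w = np.array([1.0, 2.0])
in_subspace = np.allclose(X @ w, y_par)

# Condition 2: y = y_par + y_perp (true by construction of y_perp).
decomposes = np.allclose(y_par + y_perp, y)

# Condition 3: y_perp is orthogonal to X w for many sampled coefficient vectors w.
rng = np.random.default_rng(0)
samples = rng.normal(size=(1000, 2))   # each row is one coefficient vector w
dots = (samples @ X.T) @ y_perp        # dot product of each X w with y_perp
is_orthogonal = np.allclose(dots, 0.0)

print(in_subspace, decomposes, is_orthogonal)
```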
Solving the equations
A vector →v is orthogonal to every vector in 𝒳 if and only if X⊤→v=0 (the entries of X⊤→v are the dot products of →v with the columns of X, which form a basis for 𝒳). Hence we are looking to solve the following equation:
X⊤(→y⊥)=0 ⟺ X⊤(→y−→y∥)=0
Since (→y∥)∈𝒳, we can find a coefficient vector →w such that:
→y∥=X→w
So the equation to be solved is:
X⊤(→y−X→w)=0
which is the matrix form of the normal equations, or equivalently X⊤X→w=X⊤→y.
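As a quick sanity check, the equation can be solved numerically. Expanding X⊤(→y−X→w)=0 gives X⊤X→w=X⊤→y, which is a square linear system in →w. The sketch below assumes S is R^3 with the usual dot product; the matrix, the vector, and the helper name `project_onto_column_space` are made up for illustration.

```python
import numpy as np

def project_onto_column_space(X, y):
    """Solve the normal equations X^T X w = X^T y and return (w, X w)."""
    w = np.linalg.solve(X.T @ X, X.T @ y)
    return w, X @ w

# Hypothetical data: a plane in R^3 spanned by two non-orthogonal columns.
X = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [1.0, 0.0]])
y = np.array([1.0, 2.0, 3.0])

w, y_par = project_onto_column_space(X, y)
y_perp = y - y_par

# The residual satisfies X^T y_perp = 0 up to rounding error.
residual_check = np.allclose(X.T @ y_perp, 0.0)

# Cross-check the coefficients against NumPy's least-squares solver.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
agrees = np.allclose(w, w_lstsq)

print(w, y_par, residual_check, agrees)
```

Solving the normal equations directly is fine for a small, well-conditioned example like this one. For ill-conditioned matrices, a least-squares routine such as `np.linalg.lstsq` is usually the safer choice.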