The normal equations arise in several branches of mathematics, from statistics to geometry. In this article, we discuss how they emerge and how to solve them.

## Emergence of the normal equations

- The normal equations define the orthogonal projection of a vector onto a linear subspace.
- Equivalently, they define the point of a linear subspace that is closest to a given vector (see: the Moore-Penrose solution).
- They arise when solving the linear least-squares regression problem.
- Equivalently, they arise when deriving the maximum likelihood estimator for a linear regression with Gaussian errors.
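
The link between the last two points can be sketched in one line. The notation is an assumption made for this illustration: $A \in \mathbb{R}^{m \times n}$ is the design matrix, $x$ the coefficient vector, and $b = Ax + \varepsilon$ the observations, with $\varepsilon \sim \mathcal{N}(0, \sigma^2 I_m)$. Under that model the log-likelihood is:

$$
\log p(b \mid x) = -\frac{m}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\,\|b - Ax\|_2^2
$$

Only the second term depends on $x$, so maximizing the likelihood over $x$ amounts to minimizing $\|b - Ax\|_2^2$, which is the least-squares problem.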

## The normal equations

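Let $A \in \mathbb{R}^{m \times n}$ be a matrix and $b \in \mathbb{R}^m$ a vector. The normal equations are written in matrix form as follows:

$$A^\top A x = A^\top b$$

As stated in the previous section, a unique solution $x$ to these equations is at the same time:

- the Moore-Penrose solution to $Ax = b$;
- and the coordinate vector of the orthogonal projection of $b$ onto the column space of $A$ (the projection itself is $Ax$).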
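
A short derivation may help show why these two characterizations coincide. With $A$, $x$ and $b$ denoting the matrix, the unknown and the vector (notation assumed here), setting the gradient of the squared distance to zero gives:

$$
\begin{aligned}
f(x) &= \|Ax - b\|_2^2 = x^\top A^\top A x - 2\, b^\top A x + b^\top b, \\
\nabla f(x) &= 2 A^\top A x - 2 A^\top b = 0
\quad\Longleftrightarrow\quad A^\top A x = A^\top b .
\end{aligned}
$$

Since $f$ is convex, any such stationary point is a global minimizer. The same condition reads $A^\top (b - Ax) = 0$: the residual $b - Ax$ is orthogonal to every column of $A$, which is what makes $Ax$ the orthogonal projection of $b$ onto the column space of $A$.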

## Solving the normal equations

When the matrix $A^\top A$ has rank $n$, the number of columns of $A$, it is invertible and the normal equations admit a unique solution, expressed using the Moore-Penrose inverse $A^+$ of $A$:

$$x = (A^\top A)^{-1} A^\top b = A^+ b$$
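
As an illustration of the full-rank case, here is a minimal NumPy sketch. The names `A`, `b` and `solve_normal_equations`, as well as the sample data, are assumptions made for this example. It solves the system $A^\top A x = A^\top b$ directly and compares the result with the one obtained from the Moore-Penrose inverse.

```python
import numpy as np


def solve_normal_equations(A, b):
    """Solve A^T A x = A^T b, assuming A has full column rank."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    gram = A.T @ A  # n x n matrix, invertible when rank(A) = n
    rhs = A.T @ b
    # Solve the linear system rather than forming the inverse explicitly.
    return np.linalg.solve(gram, rhs)


# Hypothetical data: fit b ~ x[0] + x[1] * t at t = 0, 1, 2, 3.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0, 2.0, 4.0])

x = solve_normal_equations(A, b)
x_pinv = np.linalg.pinv(A) @ b  # same solution via the Moore-Penrose inverse
residual = b - A @ x            # orthogonal to the columns of A
```

Forming $A^\top A$ squares the condition number of $A$, so for ill-conditioned problems a routine such as `np.linalg.lstsq`, which works on $A$ directly, is usually a safer choice.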

When $A^\top A$ has rank $r < n$, it is singular, so the normal equations $A^\top A x = A^\top b$ are underdetermined and infinitely many solutions exist. As discussed in the article about the Moore-Penrose inverse, we can use an optimization algorithm such as gradient descent or stochastic gradient descent to find one numerically, or remove some columns from $A$ to reduce $n$ until $A$ has full column rank.
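
Both options can be sketched with NumPy on a small rank-deficient example. The names `A`, `b` and `gradient_descent_least_squares`, the sample data, the fixed step size and the iteration count are all assumptions made for this illustration, not a tuned implementation. Gradient descent on $\tfrac{1}{2}\|Ax - b\|_2^2$ started from $x = 0$ stays in the row space of $A$, so it converges to the minimum-norm solution, which is the one the Moore-Penrose inverse returns.

```python
import numpy as np


def gradient_descent_least_squares(A, b, n_iter=500):
    """Minimize 0.5 * ||A x - b||^2 by gradient descent, starting from x = 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Step size 1 / L, where L is the largest eigenvalue of A^T A
    # (the squared largest singular value of A).
    step = 1.0 / np.linalg.norm(A, ord=2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        # The gradient vanishes exactly when the normal equations hold.
        grad = A.T @ (A @ x - b)
        x = x - step * grad
    return x


# Hypothetical data: the third column is the sum of the first two,
# so A has rank 2 < n = 3.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [1.0, -1.0, 0.0]])
b = np.array([1.0, 2.0, 3.0, 4.0])

# Option 1: find one solution numerically.
x_gd = gradient_descent_least_squares(A, b)
# Reference: minimum-norm solution (explicit cutoff for tiny singular values).
x_min_norm = np.linalg.pinv(A, rcond=1e-10) @ b

# Option 2: drop the redundant column and solve the full-rank system.
A_reduced = A[:, :2]
x_reduced = np.linalg.solve(A_reduced.T @ A_reduced, A_reduced.T @ b)
```

The two options return different coefficient vectors of different lengths, but `A @ x_gd` and `A_reduced @ x_reduced` are the same vector, because the column space, and hence the orthogonal projection of `b` onto it, is unchanged by removing a redundant column.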