The MSE loss

The mean squared error (MSE) loss quantifies the error between a target variable $y$ and an estimate $\hat{y}$ of its value.

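This loss function is defined as the mean of the squared differences between the corresponding components of $y$ and $\hat{y}$. Let $n$ be the length of the vector $y$. Then:

$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$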
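As a minimal sketch of this definition, the following Python function computes the mean of the squared componentwise differences between a target vector and its estimate. The function name `mse` and the sample vectors are illustrative assumptions, not part of the original article.

```python
def mse(y, y_hat):
    """Mean squared error between a target vector and its estimate."""
    if len(y) != len(y_hat):
        raise ValueError("y and y_hat must have the same length")
    n = len(y)
    # Mean of the squared componentwise differences.
    return sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat)) / n


# Hypothetical data: three targets and three estimates.
y = [1.0, 2.0, 3.0]
y_hat = [1.0, 2.5, 2.0]

# Squared differences are 0, 0.25 and 1, so the loss is 1.25 / 3.
loss = mse(y, y_hat)
```

Only the second and third components contribute here, since the first estimate matches its target exactly.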

The sum in the definition above is equal to the squared Euclidean norm of the difference $y - \hat{y}$, so we can rewrite the definition as:

$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \|y - \hat{y}\|_2^2$$

The difference vector $y - \hat{y}$ is often called the residual and denoted $r$. It is the error vector between $y$ and the predicted value $\hat{y}$:

$$r = y - \hat{y}$$

Using this vocabulary, the mean squared error loss is the squared norm of the residual, with a correction factor $\frac{1}{n}$ to account for the dimension:

$$\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \|r\|_2^2$$
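The residual form can be sketched in a few lines of Python. The helper names `residual`, `squared_norm` and `mse_from_residual`, the variable name `r`, and the sample data are all assumptions made for illustration.

```python
def residual(y, y_hat):
    """Componentwise difference between the target and its estimate."""
    return [yi - yhi for yi, yhi in zip(y, y_hat)]


def squared_norm(v):
    """Squared Euclidean norm of a vector."""
    return sum(vi ** 2 for vi in v)


def mse_from_residual(r):
    """Squared norm of the residual, divided by its dimension."""
    return squared_norm(r) / len(r)


# Hypothetical data with four components.
y = [3.0, -1.0, 2.0, 0.0]
y_hat = [2.0, 1.0, 2.0, 2.0]

r = residual(y, y_hat)        # [1.0, -2.0, 0.0, -2.0]
loss = mse_from_residual(r)   # (1 + 4 + 0 + 4) / 4 = 2.25
```

Dividing by the dimension is what makes losses comparable between vectors of different lengths: the squared norm alone grows as components are added, even when the error per component stays the same.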


This can be visualized on the graph below.

TODO: make the graph.

The machine-learning notation

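In our linear regression articles, we usually write the MSE loss like this:

$$\mathcal{L}_S(\theta) = \frac{1}{n} \|y - \hat{y}\|_2^2$$

where $S$ is the trainset and $\theta$ is some vector of parameters.

In this situation, the vector $y$ is the output vector corresponding to our trainset, and $n$ is its length. The estimator $\hat{y}$ is the predicted value:

$$\hat{y} = X\theta$$

where $X$ is the design matrix.

So we have:

$$\mathcal{L}_S(\theta) = \frac{1}{n} \|y - X\theta\|_2^2$$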
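To make the machine-learning form concrete, here is a small pure-Python sketch that builds the prediction from a design matrix and a parameter vector and then evaluates the loss. The names `predict`, `mse_loss`, `X`, `y` and `theta`, the toy trainset, and the choice of a constant first column for the intercept are all assumptions made for illustration.

```python
def predict(X, theta):
    """Matrix-vector product of X and theta: one prediction per row of X."""
    return [sum(x_ij * t_j for x_ij, t_j in zip(row, theta)) for row in X]


def mse_loss(X, y, theta):
    """MSE between the outputs y and the linear predictions built from X and theta."""
    y_hat = predict(X, theta)
    n = len(y)
    return sum((yi - yhi) ** 2 for yi, yhi in zip(y, y_hat)) / n


# Hypothetical trainset: 3 samples, a constant column for the intercept
# (an assumed convention) and a single feature.
X = [[1.0, 0.0],
     [1.0, 1.0],
     [1.0, 2.0]]
y = [1.0, 3.0, 5.0]

# theta = [1, 2] predicts 1, 3, 5 and fits the data exactly.
perfect = mse_loss(X, y, [1.0, 2.0])   # 0.0

# theta = [0, 2] predicts 0, 2, 4, so every residual component is 1.
worse = mse_loss(X, y, [0.0, 2.0])     # 1.0
```

Seen this way, the loss is a function of the parameters alone once the trainset is fixed, which is the point of view taken when fitting a linear regression.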