The mean squared error loss quantifies the error between a target variable and an estimate for its value.
This loss function is defined as the mean of the squares of the individual losses between each components of and . Let be the length of the vector .
The sum in the definition above is equal to the squared euclidean norm, so we can rewrite the definition as:
The difference vector is often named the residual and noted . It is the error vector between and the predicted value :
Using this vocabulary, the mean squared error loss is the squared norm of the residual, with a correction factor to account for the dimension:
.
This can be visualized on the graph below.
TODO: make the graph.
The machine-learning notation
In our linear regression articles, we usually note the MSE loss like this:
Where is the trainset and is some vector of parameters.
In this situation, the vector is the output vector corresponding to our trainset. The estimator is the predicted value:
where is the design matrix.
So we have:
.