
The MSE loss

Nov 04, 2018

The mean squared error loss quantifies the error between a target variable $y$ and an estimate $\hat{y}$ of its value.

This loss function is defined as the mean of the squared differences between the components of $y$ and $\hat{y}$. Let $n$ be the length of the vector $y$:

$$\mathcal{L}_{\mathrm{MSE}}(y, \hat{y}) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

The sum in the definition above is equal to the squared Euclidean norm of the difference vector, so we can rewrite the definition as:

$$\mathcal{L}_{\mathrm{MSE}}(y, \hat{y}) = \frac{1}{n} \|y - \hat{y}\|_2^2$$

The difference vector is often called the residual and denoted $\epsilon$. It is the error vector between $y$ and the predicted value $\hat{y}$:

$$\epsilon = y - \hat{y}$$

Using this vocabulary, the mean squared error loss is the squared norm of the residual, with a correction factor to account for the dimension:

$$\mathcal{L}_{\mathrm{MSE}}(y, \hat{y}) = \frac{1}{n} \|\epsilon\|_2^2.$$
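The equivalence between the component-wise sum and the squared-norm form can be checked numerically. Here is a minimal sketch with NumPy; the vectors `y` and `y_hat` are hypothetical example values, not from the article:

```python
import numpy as np

# Hypothetical target and estimate vectors.
y = np.array([3.0, -0.5, 2.0, 7.0])
y_hat = np.array([2.5, 0.0, 2.0, 8.0])
n = len(y)

# Form 1: mean of the squared component-wise differences.
mse_sum = np.sum((y - y_hat) ** 2) / n

# Form 2: squared Euclidean norm of the residual, divided by n.
residual = y - y_hat
mse_norm = np.linalg.norm(residual) ** 2 / n

print(mse_sum)   # 0.375
print(mse_norm)  # 0.375 -- both forms agree
```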

This can be visualized on the graph below.

TODO: make the graph.

The machine-learning notation

In our linear regression articles, we usually write the MSE loss as:

$$\mathcal{L}_{\mathrm{MSE}}(S_{\mathrm{train}}, w)$$

where $S_{\mathrm{train}}$ is the training set and $w$ is some vector of parameters.

In this situation, the vector $y$ is the output vector of our training set. The estimate $\hat{y}$ is the predicted value:

$$\hat{y} = Xw$$

where $X$ is the design matrix.

So we have:

$$\mathcal{L}_{\mathrm{MSE}}(S_{\mathrm{train}}, w) = \mathcal{L}_{\mathrm{MSE}}(y, Xw).$$
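This machine-learning form can be sketched in NumPy as well: build the predictions from the design matrix and the parameters, then apply the same MSE formula. The design matrix, targets, and weights below are hypothetical illustrative values:

```python
import numpy as np

# Hypothetical training set: 3 samples, 2 features
# (first column of ones for the intercept).
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])   # design matrix
y = np.array([1.0, 2.0, 3.0])  # training targets
w = np.array([0.5, 1.0])       # some parameter vector

y_hat = X @ w                  # predictions: y_hat = Xw
mse = np.mean((y - y_hat) ** 2)
print(mse)  # 0.25
```

The loss depends on the training set only through $y$ and $X$, which is why the two notations describe the same quantity.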