
What is ridge regression?

Nov 07, 2018

Ridge regression is a least-squares regression that uses L2-regularization.

The regularized loss function on some dataset S is thus:

$$L_{\text{ridge}}(S, w) = L_{\text{MSE}}(S, w) + \lambda \|w\|_2^2$$

where λ is a hyper-parameter that controls the importance of the regularization.
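As a concrete illustration, here is a minimal NumPy sketch of this loss; the function and argument names are our own choices for the example, not part of the article or any library.

```python
import numpy as np

def ridge_loss(X, y, w, lam):
    """L_ridge(S, w) = L_MSE(S, w) + lam * ||w||_2^2 on the dataset (X, y)."""
    residuals = y - X @ w            # prediction errors of the linear model
    mse = np.mean(residuals ** 2)    # L_MSE(S, w)
    penalty = lam * np.sum(w ** 2)   # lambda times the squared L2 norm of w
    return mse + penalty
```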

For a complete discussion about the effect of L2-regularization on the parameters of the model, check out our dedicated article: L2-regularization.

Common mistake: it is important to normalize the features before using regularization. If the features are on very different scales, the penalty constrains their parameters unevenly, and the regularization behaves incoherently.
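One common way to normalize is to standardize each feature to zero mean and unit variance, using statistics computed on the training set only (scikit-learn's StandardScaler implements the same idea). A minimal sketch, with a hypothetical helper written for illustration:

```python
import numpy as np

def standardize(X_train, X_test):
    """Scale each feature using mean and std computed on the training set only."""
    mu = X_train.mean(axis=0)
    sigma = X_train.std(axis=0)
    sigma[sigma == 0.0] = 1.0        # guard against constant features
    return (X_train - mu) / sigma, (X_test - mu) / sigma
```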

Analytical solution

Let $S_{\text{train}}$ be the training set, and denote by $X$ and $y$ the corresponding design matrix and output vector.

We can compute the value of the parameter vector that minimizes the regularized loss using differentiation.

$$\nabla_w L_{\text{MSE}}(S_{\text{train}}, w) = -\frac{2}{|S_{\text{train}}|} X^\top (y - Xw)$$

$$\nabla_w \left( \lambda \|w\|_2^2 \right) = 2\lambda w$$

Setting the gradient of the regularized loss to 0 and solving for $w$, we get:

$$\hat{w}_{S_{\text{train}}} = \left[ X^\top X + \lambda |S_{\text{train}}| \, I \right]^{-1} X^\top y$$
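This closed form translates directly to NumPy. A minimal sketch under the same conventions as above (the function name `ridge_fit` is our own; $X$ is assumed to have one row per training example):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam * n * I)^{-1} X^T y."""
    n, d = X.shape
    A = X.T @ X + lam * n * np.eye(d)   # regularized Gram matrix
    b = X.T @ y
    return np.linalg.solve(A, b)        # solve the linear system rather than forming the inverse

# Example usage with the hypothetical helpers defined above:
# w_hat = ridge_fit(X_train, y_train, lam=0.1)
# print(ridge_loss(X_train, y_train, w_hat, lam=0.1))
```

Solving the linear system with np.linalg.solve is preferable to computing the matrix inverse explicitly: it is cheaper and numerically more stable.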