What is regularization?

Nov 06, 2018

Regularization is a semi-automated method to manage overfitting. The core idea is to penalize model complexity directly in the loss function, rather than tuning it by hand.

Regularization improves a model's performance by decreasing its variance, at the cost of a small increase in bias.

General shape

The general form of a regularized loss function $L_{\text{reg}}$ is the following:

$$L_{\text{reg}}(S, w) = L(S, w) + \lambda R(w)$$

where $S$ is the dataset on which we compute the loss, $w$ is the vector of parameters of our model, and $R$ is the regularizer, a function that measures the complexity of $w$.
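To make this concrete, here is a minimal sketch in Python, assuming the data loss $L$ is the mean squared error of a linear model; the names (`regularized_loss`, `reg`) are illustrative, not a standard API:

```python
import numpy as np

def regularized_loss(X, y, w, lam, reg):
    """L_reg(S, w) = L(S, w) + lambda * R(w).

    Here S = (X, y), L is the mean squared error of a linear model,
    and `reg` is the regularizer R, e.g. lambda w: np.sum(w ** 2).
    """
    data_loss = np.mean((X @ w - y) ** 2)  # L(S, w)
    return data_loss + lam * reg(w)        # + lambda * R(w)

# Example: L2-regularized loss on tiny synthetic data.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -0.1])
print(regularized_loss(X, y, w, lam=0.1, reg=lambda w: np.sum(w ** 2)))
```

Any penalty can be plugged in through `reg`, which is the point of the general form: the data loss and the complexity measure are independent choices.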

The hyper-parameter

The hyper-parameter $\lambda \geq 0$ allows us to control the strength of the regularization. When $\lambda = 0$ the regularizer is canceled and we get the unregularized solution. When $\lambda \to \infty$, the penalty dominates the data loss, all the weights are driven to zero, and we get an intercept-only model.

TODO: graph the curve as $\lambda$ varies.
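In the meantime, here is a quick numerical sketch of this behavior using scikit-learn's `Ridge`, whose `alpha` parameter plays the role of $\lambda$; the data is synthetic and illustrative:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
true_w = np.array([3.0, -2.0, 1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

# As alpha (= lambda) grows, the learned weights shrink toward zero,
# leaving an intercept-only model in the limit.
for alpha in [0.001, 1.0, 100.0, 10000.0]:
    model = Ridge(alpha=alpha).fit(X, y)
    print(f"lambda={alpha:>8}: coef={np.round(model.coef_, 3)}")
```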

A word of caution

It is important to normalize the features before using regularization. The penalty acts on the magnitude of the weights, and the weight of a feature depends on the scale of that feature; without normalization, features on different scales are penalized inconsistently.
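With scikit-learn, for instance, the standard approach is to put the scaler and the model in a pipeline, so that normalization is fitted on the training data only; the data below is illustrative:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
# Two features on wildly different scales.
X = np.column_stack([rng.normal(size=50), 1000.0 * rng.normal(size=50)])
y = X[:, 0] + 0.001 * X[:, 1] + rng.normal(scale=0.1, size=50)

# Standardizing first ensures both weights are penalized on a common scale.
model = make_pipeline(StandardScaler(), Ridge(alpha=1.0)).fit(X, y)
print(model.named_steps["ridge"].coef_)
```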

Common regularizers

The L2 norm

$$R_2(w) = \lVert w \rVert_2^2 = \sum_j w_j^2$$

An ordinary least squares regression with L2 regularization is called ridge regression.
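Ridge regression also admits a closed-form solution, $w = (X^\top X + \lambda I)^{-1} X^\top y$. A minimal sketch, leaving out the intercept (which is usually not penalized):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: w = (X^T X + lam * I)^{-1} X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    # Solving the linear system is more stable than inverting A explicitly.
    return np.linalg.solve(A, X.T @ y)
```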

The L1 norm

$$R_1(w) = \lVert w \rVert_1 = \sum_j \lvert w_j \rvert$$

An ordinary least squares regression with L1 regularization is called lasso regression.
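Unlike the L2 penalty, the L1 penalty tends to drive some weights exactly to zero, so lasso doubles as a feature selector. A quick illustration with scikit-learn's `Lasso` on synthetic data:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
# Only the first two features actually matter.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=0.1).fit(X, y)
print(np.round(model.coef_, 3))  # the irrelevant coefficients are exactly 0.0
```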