In machine learning, the best parameters for a model are chosen so as to minimize the training objective. Strictly convex functions are particularly interesting because they have at most one global minimizer: if a minimum exists, it is unique.
Furthermore, for both strictly and non-strictly convex functions, every local minimum is a global minimum.
Visually, a convex function “curves up”, without any bends the other way.
What is convexity?
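A function $f$ is convex if and only if the segment joining any two points on its graph never falls below the graph. For all points $\vec{a}$, $\vec{b}$ in its domain and all $0 \le t \le 1$:
$$f(t\vec{a} + (1-t)\vec{b}) \le t f(\vec{a}) + (1-t) f(\vec{b})$$
The function is strictly convex when the inequality is strict for all $\vec{a} \ne \vec{b}$ and $0 < t < 1$.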
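The definition can be probed numerically for a function of one variable. The sketch below is only a sanity check under stated assumptions: the helper names `violates_convexity` and `looks_convex`, the sampling interval, and the tolerance are all illustrative choices. Random sampling can refute convexity by finding a violating triple (a, b, t), but it cannot prove convexity.

```python
import random


def violates_convexity(f, a, b, t, tol=1e-9):
    """Return True if f breaks the convexity inequality at (a, b, t)."""
    chord = t * f(a) + (1 - t) * f(b)   # point on the segment
    curve = f(t * a + (1 - t) * b)      # point on the graph
    return curve > chord + tol


def looks_convex(f, low, high, trials=10_000, seed=0):
    """Sample random (a, b, t) in [low, high] x [low, high] x [0, 1].

    Returns False as soon as one violation is found, True otherwise.
    A True result is evidence of convexity on the interval, not a proof.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.uniform(low, high)
        b = rng.uniform(low, high)
        t = rng.uniform(0.0, 1.0)
        if violates_convexity(f, a, b, t):
            return False
    return True


def square(x):
    return x * x


def cube(x):
    return x ** 3


print(looks_convex(square, -5.0, 5.0))  # x^2 is convex on the whole line
print(looks_convex(cube, -5.0, 5.0))    # x^3 is concave for x < 0
```

The cube fails because on the negative half-line its graph rises above its chords, for example between a = -4 and b = -1.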
Characterization of convex functions
- A sum of convex functions is also convex.
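- A differentiable function of one variable is convex on an interval iff it lies above all of its tangents: $f(x) \ge f(y) + f'(y)(x - y)$ for all $x$, $y$ in the interval.
- A differentiable function of several variables is convex on a convex set iff it lies above its linearization at every point: $f(\vec{v}) \ge f(\vec{w}) + \nabla f(\vec{w})^\top (\vec{v} - \vec{w})$ for all $\vec{v}$, $\vec{w}$ in the set.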
- A twice differentiable function of one variable is convex on an interval iff its second derivative is non-negative.
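
As an illustration of the one-variable criteria above, the following sketch checks the tangent inequality and the sign of the second derivative on a grid of sample points. The helper names, the grid, the step size `h`, and the tolerances are assumptions made for this example. The second derivative is estimated by central finite differences, so the result is approximate. The test function x² + eˣ is a sum of two convex functions, which also illustrates the first property.

```python
import math


def above_tangents(f, df, points, tol=1e-9):
    """Check f(x) >= f(y) + f'(y) * (x - y) for every pair of sample points."""
    return all(
        f(x) >= f(y) + df(y) * (x - y) - tol
        for x in points
        for y in points
    )


def second_derivative(f, x, h=1e-4):
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def nonnegative_curvature(f, points, tol=1e-6):
    """Check that the estimated second derivative is >= 0 at every sample point."""
    return all(second_derivative(f, x) >= -tol for x in points)


# Sample points on the interval [-3, 3].
grid = [i / 10.0 for i in range(-30, 31)]


def convex_sum(x):
    """x^2 + e^x: a sum of two convex functions."""
    return x * x + math.exp(x)


def convex_sum_derivative(x):
    return 2.0 * x + math.exp(x)


print(above_tangents(convex_sum, convex_sum_derivative, grid))
print(nonnegative_curvature(convex_sum, grid))

# sin is not convex on [-3, 3]: both criteria fail.
print(above_tangents(math.sin, math.cos, grid))
print(nonnegative_curvature(math.sin, grid))
```

Both checks only inspect finitely many points, so passing them suggests convexity on the interval without proving it.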