
A probability distribution to quantify measurement errors

Apr 03, 2018

In this article we will derive the normal distribution as the probability distribution that models measurement errors. We start with a dart game and follow Herschel’s derivation.

Suppose we are playing a dart game as shown in the figure below. What is the probability that a dart lands at a given position on the board?

Dart game

We can think of this problem as measurement errors. The goal is to measure a quantity (here: the position of the bull’s eye), but there is noise in the measurement device (here: your aim is not perfect) so we can only obtain values “polluted” by some variable, random noise.

The quality of the device (here: the quality of the player) can be measured by the spread of the measured values. For instance, the figure below illustrates the measurements made by two devices. The device on the left is more precise (that is, the player is more skilled) than the one on the right.

Dart board

Let us denote by

$$p(x,y)\,dx\,dy$$

the probability that the dart lands on an infinitesimal surface area $dx\,dy$ located at $(x,y)$.

We can express this probability in polar coordinates too:

$$q(r,\theta)\,r\,dr\,d\theta = p(x,y)\,dx\,dy$$

Whatever the quality of the measurement device (or the skills of the player), it is reasonable to expect that small errors are more frequent than big ones. So the probability density should decrease as the distance to the center increases. This means $q$ is a decreasing function of $r$.

Also, there is no reason to expect that measurements will land more often on the left of the bull’s eye than on the right. More generally, taking the symmetry into account, every location on a circle centered on the bull’s eye should be equally likely to be hit. This means that $q$ does not depend on the rotation angle $\theta$:

$$q(r,\theta) = q(r)$$

We can arbitrarily choose two orthogonal directions and fix Cartesian axes on the dart board. This is illustrated in the picture below.

Dart board

If we further assume that knowing $x$ tells us nothing about $y$ and vice versa (that is, the horizontal and vertical errors are independent), and that by symmetry both follow the same density $f$, then $p(x,y)$ must have the following form:

$$p(x,y)\,dx\,dy = f(x)\,dx\,f(y)\,dy$$

Since $r=\sqrt{x^2+y^2}$, we get:

$$q\left(\sqrt{x^2+y^2}\right) = f(x)\,f(y)$$

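Setting $y=0$, this becomes:

$$q(|x|) = f(x)\,f(0)$$

Applying this relation at the point $\sqrt{x^2+y^2}$ and combining it with the previous equation gives:

$$q\left(\sqrt{x^2+y^2}\right) = f\left(\sqrt{x^2+y^2}\right) f(0)$$

$$f(x)\,f(y) = f\left(\sqrt{x^2+y^2}\right) f(0)$$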
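One way to solve this last functional equation is to reduce it to Cauchy's additive equation. The sketch below makes two extra assumptions that are not stated explicitly in the article: $f$ is continuous and strictly positive. The helper function $g$ is introduced only for this sketch. It also uses the fact that $f$ is even, which follows from $q(|x|) = f(x)\,f(0)$.

```latex
\begin{aligned}
&\text{Define } g(t) = \ln\frac{f(\sqrt{t})}{f(0)} \quad \text{for } t \ge 0. \\
&\text{Dividing } f(x)\,f(y) = f\!\left(\sqrt{x^2+y^2}\right) f(0)
  \text{ by } f(0)^2 \text{ and taking logarithms gives} \\
&\qquad g(x^2) + g(y^2) = g(x^2 + y^2). \\
&\text{The continuous solutions of } g(s) + g(t) = g(s+t)
  \text{ are linear, } g(t) = -\alpha t, \text{ so} \\
&\qquad f(x) = f(0)\, e^{-\alpha x^2}. \\
&\text{Since } f \text{ must be integrable (and decreasing in } |x| \text{), } \alpha > 0. \\
&\text{Normalization: } \int_{-\infty}^{\infty} f(x)\,dx
  = f(0)\sqrt{\frac{\pi}{\alpha}} = 1
  \;\Longrightarrow\; f(0) = \sqrt{\frac{\alpha}{\pi}}.
\end{aligned}
```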

Solving this equation, we find that:

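$$f(x) = \sqrt{\frac{\alpha}{\pi}}\,e^{-\alpha x^2} \qquad (\alpha > 0)$$

with one undetermined parameter $\alpha>0$. If we compute the variance $\sigma^2$ of $f$, we find that $\sigma^2=\frac{1}{2\alpha}$. So we can rewrite the formula for $f$ using $\sigma^2$:

$$f(x) = \frac{1}{\sqrt{2\sigma^2\pi}}\,e^{-\frac{x^2}{2\sigma^2}}$$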

This formula is the Gaussian distribution, or normal distribution, denoted $N(0,\sigma^2)$. As explained above, it is the probability distribution of the errors made when measuring a target whose true value is 0. Here is a plot for two values of $\sigma^2$:

Gaussian distribution
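The properties claimed for this density can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the values $\sigma^2 = 0.5$ and $\sigma^2 = 2$ are arbitrary choices, not necessarily those used in the plot. It uses a simple midpoint Riemann sum to check that the density integrates to 1 and has variance $\sigma^2$, then checks the functional equation $f(x)\,f(y) = f(\sqrt{x^2+y^2})\,f(0)$ at one point.

```python
import math

def normal_pdf_zero_mean(x, sigma2):
    """Density of N(0, sigma2) evaluated at x."""
    return math.exp(-x * x / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)

def riemann_moments(sigma2, half_width=40.0, steps=80_000):
    """Approximate total mass and variance with a midpoint Riemann sum."""
    dx = 2.0 * half_width / steps
    mass = 0.0
    second_moment = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        p = normal_pdf_zero_mean(x, sigma2)
        mass += p * dx
        second_moment += x * x * p * dx
    return mass, second_moment

for sigma2 in (0.5, 2.0):
    mass, variance = riemann_moments(sigma2)
    alpha = 1.0 / (2.0 * sigma2)  # the parameter of the first form of f
    print(f"sigma^2={sigma2}, alpha={alpha}: mass={mass:.6f}, variance={variance:.6f}")

# Functional equation f(x) f(y) = f(sqrt(x^2 + y^2)) f(0) at an arbitrary point
x, y, sigma2 = 0.3, -1.2, 2.0
lhs = normal_pdf_zero_mean(x, sigma2) * normal_pdf_zero_mean(y, sigma2)
rhs = normal_pdf_zero_mean(math.hypot(x, y), sigma2) * normal_pdf_zero_mean(0.0, sigma2)
print(lhs, rhs)
```

A larger $\sigma^2$ gives a flatter, wider curve. The total area under the curve stays equal to 1.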

What if the target value $\mu$ is not 0? We can simply shift the axis and translate everything by $\mu$. The translated random variable then represents a measurement $\epsilon$ of 0 and follows the normal distribution derived above:

$$Y - \mu = \epsilon \sim N(0,\sigma^2)$$

Substituting $x-\mu$ for $x$ in the formula for $f$, we find the general formula for the normal distribution:

$$Y \sim N(\mu,\sigma^2) \qquad f_Y(x) = \frac{1}{\sqrt{2\sigma^2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
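As a quick sanity check of the general formula, the sketch below writes it out directly and compares it with `statistics.NormalDist` from Python's standard library (available since Python 3.8). The function name `normal_pdf` and the values $\mu = 3$, $\sigma^2 = 4$ are illustrative choices, not part of the derivation.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) at x, written out from the general formula."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * sigma2 * math.pi)

mu, sigma2 = 3.0, 4.0
# NormalDist takes the standard deviation, not the variance
reference = NormalDist(mu=mu, sigma=math.sqrt(sigma2))

for x in (-1.0, 3.0, 5.5):
    general = normal_pdf(x, mu, sigma2)
    # Translating by mu reduces the general case to the zero-mean case
    shifted = normal_pdf(x - mu, 0.0, sigma2)
    print(x, general, shifted, reference.pdf(x))
```

The three values printed for each `x` should agree up to floating-point rounding. This is the translation argument in numerical form.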

But what is the variance $\sigma^2$? It is a parameter that controls the quality of our measurement device. Recall this picture, where the variance on the left-hand side is much smaller than on the right-hand side.

Dart board
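The effect of the variance can be reproduced with a small simulation. This is a minimal sketch under the article's assumptions (independent, zero-mean normal errors on $x$ and $y$ with the same variance). The function names, the seeds and the two values of $\sigma$ are hypothetical choices made for illustration. Note that `random.gauss` takes the standard deviation $\sigma$, not the variance $\sigma^2$.

```python
import math
import random

def throw_darts(sigma, n, seed):
    """Simulate n throws aimed at (0, 0) with independent N(0, sigma^2) errors on x and y."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(n)]

def mean_miss_distance(throws):
    """Average distance between the darts and the bull's eye."""
    return sum(math.hypot(x, y) for x, y in throws) / len(throws)

skilled = throw_darts(sigma=1.0, n=10_000, seed=0)   # small variance: tight cluster
beginner = throw_darts(sigma=3.0, n=10_000, seed=1)  # large variance: wide scatter

print("skilled :", mean_miss_distance(skilled))
print("beginner:", mean_miss_distance(beginner))
```

The player with the smaller $\sigma$ lands closer to the bull's eye on average, which matches the left-hand board in the picture.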

Here is a video summary of this derivation of the normal distribution (known as Herschel’s derivation):

The normal distribution can be seen as a random number generator that generates measurements of a target value $\mu$ with an embedded measurement error $\epsilon$.
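This "generator" view can be sketched in a few lines. The function `measure` and the values used (a target of 20.0 and a standard deviation of 0.5) are hypothetical and chosen only for illustration. Each reading is the target value plus a zero-mean normal error.

```python
import random

def measure(mu, sigma, n, seed=None):
    """Return n simulated measurements Y = mu + eps, with eps drawn from N(0, sigma^2)."""
    rng = random.Random(seed)
    return [mu + rng.gauss(0.0, sigma) for _ in range(n)]

target = 20.0
readings = measure(mu=target, sigma=0.5, n=5, seed=42)
errors = [y - target for y in readings]  # the embedded measurement errors

print("readings:", readings)
print("errors  :", errors)
```

Fixing the seed makes the run reproducible. Without a seed, each call produces a fresh set of noisy readings scattered around the target.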

Given multiple measurements, can we retrieve the target value μ? Check out my next article to answer this question here