
Means, medians, and modes

They're more closely related than they look

This page assumes that you know some elementary calculus and statistics.

The "big three" measures of central tendency are the mean, the median, and the mode. On the face of it, they have nothing in common: they are based on quite different properties of the data. To get the mean, you need to do arithmetic on the data; to get the median, all you have to do is rank the data; and to get the mode, all you need to be able to do is distinguish between different data values.

Another way to put it is that calculation of the mean requires data that is on an interval or ratio scale, calculating the median only needs data on an ordinal scale, and calculating the mode only needs data on a nominal scale.

However, it turns out that these three measures have a strong family connection - they are parametric variants of a single measure. Consider some data x_i, and some quantity X. We can calculate a set of deviations x_i - X. Now we will find X such that the sum of the unsigned deviations |x_i - X|, each raised to a certain power p, is minimised.

If p is 2, then X gives us the mean.
If p is 1, then X gives us the median.
If p is 0, then X gives us the mode.
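
The three cases can be checked numerically before proving them. The sketch below is not part of the original page: the data set, the names `power_sum` and `minimiser`, and the grid of candidate values are illustrative assumptions. It also assumes the convention that a zero deviation raised to the power 0 counts as 0 (Python itself evaluates 0.0 ** 0 as 1.0), which is what the p = 0 case needs.

```python
def power_sum(data, X, p):
    """Sum of |x - X|**p over the data.

    Assumed convention: a zero deviation contributes 0 for every p,
    including p = 0 (Python would otherwise give 0.0 ** 0 == 1.0).
    """
    total = 0.0
    for x in data:
        d = abs(x - X)
        if d > 0:
            total += d ** p
    return total


def minimiser(data, p, candidates):
    """Return the candidate X with the smallest power sum (brute-force grid search)."""
    return min(candidates, key=lambda X: power_sum(data, X, p))


# Illustrative data: mean 3, median 2, mode 1.
data = [1, 1, 2, 4, 7]
# Candidate values 0.00, 0.01, ..., 10.00; the grid includes every data value.
grid = [i / 100 for i in range(1001)]

for p in (2, 1, 0):
    print(p, minimiser(data, p, grid))
```

For this data set the search picks 3.0 for p = 2, 2.0 for p = 1, and 1.0 for p = 0, matching the mean, median, and mode. A grid search is crude, but it makes no use of calculus, so it is an independent check on the proofs.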

This is one of the most beautiful maths results I have come across. For me, it is all the more gorgeous because I came upon it by myself (though it's not a new result). The proof is not difficult. We start with the mean:

The calculation showing that the mean minimises the sum of squared deviations
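
Written out, the argument might run as follows. This is a reconstruction under the definitions given earlier, not necessarily the original working; the symbols S_2 (the sum for p = 2), n (the number of data values), and x-bar (the mean) are notation introduced here.

```latex
\begin{align*}
S_2(X) &= \sum_{i=1}^{n} (x_i - X)^2 \\
\frac{dS_2}{dX} &= -2 \sum_{i=1}^{n} (x_i - X)
                 = -2\Bigl(\sum_{i=1}^{n} x_i - nX\Bigr) = 0 \\
\Rightarrow\quad X &= \frac{1}{n} \sum_{i=1}^{n} x_i = \bar{x} \\
\frac{d^2 S_2}{dX^2} &= 2n > 0
\end{align*}
```

The positive second derivative confirms that the stationary point is a minimum.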

Now for the median. Because a deviation raised to the power 1 can be positive or negative according to whether x_i > X or not, we have to split up the sum:

The calculation showing that the median minimises the sum of unsigned deviations
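
One way to write the split out is sketched below. It is a reconstruction, not necessarily the original working; S_1 and the counts n_< and n_> (the numbers of data values below and above X) are notation introduced here, and the derivative is taken at values of X that do not coincide with a data value.

```latex
\begin{align*}
S_1(X) &= \sum_{i=1}^{n} |x_i - X|
        = \sum_{x_i > X} (x_i - X) + \sum_{x_i < X} (X - x_i) \\
\frac{dS_1}{dX} &= -n_{>}(X) + n_{<}(X) \\
\frac{dS_1}{dX} = 0 \quad&\Longleftrightarrow\quad n_{<}(X) = n_{>}(X)
\end{align*}
```

So the sum stops decreasing and starts increasing where as many data values lie below X as above it, which is the definition of the median. With an odd number of data values the slope jumps from negative to positive exactly at the middle value; with an even number, every X between the two middle values gives the same minimum.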

Finally, the mode:

The calculation showing that the mode minimises the sum of deviations raised to the power zero
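
No calculus is needed for this case. The sketch below is a reconstruction that assumes the convention that a zero deviation raised to the power 0 is 0 (rather than 1); S_0 is notation introduced here.

```latex
\begin{align*}
|x_i - X|^0 &=
  \begin{cases}
    1 & \text{if } x_i \neq X \\
    0 & \text{if } x_i = X \quad \text{(assumed convention)}
  \end{cases} \\
S_0(X) &= \sum_{i=1}^{n} |x_i - X|^0
        = n - \#\{\, i : x_i = X \,\}
\end{align*}
```

The sum is smallest when the number of data values equal to X is largest, that is, when X is the most frequent value: the mode.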

So we see that the mean, median, and mode differ only in the power p to which we raise the unsigned deviations before finding the value of X that minimises their sum. I think that this unity of apparently unconnected things is very lovely.

And onwards...

This immediately brings up the prospect of an indefinite number of alternative measures of central tendency, differing only in the value of p. What would be the meaning of a measure for which p was 1.5, i.e. a hybrid of the mean and the median? I haven't given that much thought yet. But one possibility that I have investigated is: what happens with values of p greater than 2? The obvious one to try is p=3, but we immediately run into trouble because, as p is odd, the exponentiated deviations can be positive or negative, and it's not easy to split up the calculation into parts, as we did for the case p=1 (the median).
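
For anyone who wants to experiment with intermediate values such as p = 1.5, here is a minimal numerical sketch. It is not from the original page: the data set and the names `power_sum` and `minimise` are illustrative assumptions, and it uses a ternary search, which relies on the sum of unsigned deviations being convex in X when p is at least 1.

```python
def power_sum(data, X, p):
    """Sum of unsigned deviations |x - X|**p (intended for p >= 1)."""
    return sum(abs(x - X) ** p for x in data)


def minimise(data, p, tol=1e-7):
    """Ternary search for the X that minimises power_sum.

    Assumes p >= 1, so that the sum is convex in X and the minimiser
    lies between the smallest and largest data values.
    """
    lo, hi = min(data), max(data)
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if power_sum(data, m1, p) < power_sum(data, m2, p):
            hi = m2  # the minimum cannot lie to the right of m2
        else:
            lo = m1  # the minimum cannot lie to the left of m1
    return (lo + hi) / 2


# Illustrative data: median 3, mean 4.
data = [1, 2, 3, 4, 10]

for p in (1, 1.5, 2):
    print(p, minimise(data, p))
```

For this particular data set the p = 1.5 value falls between the median (3) and the mean (4). That is one example, not a general result.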

Nevertheless, I went ahead anyway using signed deviations raised to the power 3. Of course, finding the zero of the derivative of the sum no longer finds a local minimum of the sum of exponentiated deviations; if we choose a large enough negative value of X we can make the sum as large as we like, and if we choose a large enough positive value, we can make the sum as negative as we like. What we are finding is a stationary point on the function linking the sum to X. Still, it was worth a look, and the answer is intriguing.
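
The unbounded behaviour of the signed cubic sum is easy to see numerically. The sketch below is illustrative only: the data set, the trial values of X, and the name `signed_cubic_sum` are assumptions introduced here.

```python
def signed_cubic_sum(data, X):
    """Sum of signed deviations (x - X)**3 over the data."""
    return sum((x - X) ** 3 for x in data)


# Illustrative data and trial values of X, from very negative to very positive.
data = [1, 1, 2, 4, 7]
trial_X = [-100, -10, 0, 3, 10, 100]

values = [signed_cubic_sum(data, X) for X in trial_X]
for X, v in zip(trial_X, values):
    print(X, v)
```

The sum falls steadily as X increases, from large positive values to large negative ones, so there is no minimum to find on the real line.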

The calculation with p = 3, which gives X = μ ± iσ
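
The working can be reconstructed along these lines. This is a sketch, not necessarily the original calculation; S_3 is notation introduced here, μ is the mean of the data, and σ is assumed to be the population standard deviation (divisor n), which is what makes the result come out exactly.

```latex
\begin{align*}
S_3(X) &= \sum_{i=1}^{n} (x_i - X)^3 \\
\frac{dS_3}{dX} &= -3 \sum_{i=1}^{n} (x_i - X)^2 = 0 \\
\Rightarrow\quad 0 &= \sum_{i=1}^{n} x_i^2 - 2X \sum_{i=1}^{n} x_i + nX^2 \\
\Rightarrow\quad 0 &= X^2 - 2\mu X + \overline{x^2},
  \qquad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i,
  \quad \overline{x^2} = \frac{1}{n}\sum_{i=1}^{n} x_i^2 \\
\Rightarrow\quad X &= \mu \pm \sqrt{\mu^2 - \overline{x^2}}
                    = \mu \pm \sqrt{-\sigma^2}
                    = \mu \pm i\sigma,
  \qquad \sigma^2 = \overline{x^2} - \mu^2
\end{align*}
```

Because a sum of real squares is zero only when every deviation is zero, the stationary points are complex unless all the data values are identical.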

The two most famous symbols in statistics, combined in a very familiar way. It fair sends a shiver up your spine. This answer is so neat that it must mean something. But what? I can't work it out.