## Friday, April 20, 2012

### Finding the median value using MySQL sql

The median of a finite list of numbers is defined as the number in the middle of the list when the list is sorted, or the average (arithmetic mean) of the two numbers in the middle if the size of the list is even.

Since, as I described in my previous post, averages are meaningless when it comes to something like marks, I've decided to start using medians instead since it would be more useful to know in which half of the class you stand with your mark, and medians are much more intuitive than means.

Here is the SQL to find the median in MySQL (adapted from http://stackoverflow.com/a/7263925)

SET @rownum := -1;

SELECT
AVG(t.mark)
FROM
(
SELECT
@rownum := @rownum + 1 AS rownum,
results.mark AS mark
FROM
results
ORDER BY results.mark
) AS t
WHERE
t.rownum IN (
CEIL(@rownum/2),
FLOOR(@rownum/2)
)
;


Here's a brief explanation:

In MySQL we cannot use expressions in LIMIT, so instead we have to simulate a LIMIT by using a counter with user defined variables. Each row in the table will have a row number associated with it starting from 0 in the variable @rownum.

We then choose from these numbered rows the ceiling and floor of "(number of rows - 1)/2" which will be the same middle value if the number of rows is odd or the middle two values if the number of rows is even. The variable will still retain its value after all the rows are numbered so we can use it to know the number of rows.

Since the average of one number is the number itself, we can simply calculate the average of whatever is returned without checking. If the number of rows is odd then we shall find the average of the middle value which will remain the same and if the number of rows is even then we shall find the average of the middle two values will will be the correct median.

Don't forget that if you're using this in PHP you have to use two "mysql_query" calls, one for the first SET and another for the SELECT as you cannot use ";" in queries.