The mean, also referred to by statisticians as the average, is the most common statistic used to measure the center, or middle, of a numerical data set. The mean is the sum of all the numbers divided by the total number of numbers. The mean may not be a fair representation of the data, because the average is easily influenced by outliers (very large or very small values in the data set that are not typical).
Example: suppose there is a group of people and we want to calculate the average salary. One person in the group is a billionaire and has a huge salary. Here’s the average:
average = ($50K + $45K + $54K + $61K + $10,000K)/5 = $10,210K/5 = $2,042K
The average salary in the group is $2,042,000! Hardly representative! The billionaire has distorted the average. Without the billionaire, the other four salaries average $52,500, which is much closer to what most of the group earns.
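The effect of the outlier on the mean can be checked with a few lines of Python. This is only a minimal sketch: the salaries are the five figures from the example, expressed in thousands of dollars, and the names `salaries`, `mean`, `mean_all`, and `mean_without_outlier` are illustrative choices, not anything from the book.

```python
# Salaries from the example, in thousands of dollars.
salaries = [50, 45, 54, 61, 10_000]


def mean(values):
    """Return the sum of the values divided by how many there are."""
    return sum(values) / len(values)


# Mean of all five salaries, billionaire included.
mean_all = mean(salaries)

# Drop the single largest value (the outlier) and recompute.
mean_without_outlier = mean(sorted(salaries)[:-1])

print(f"mean with outlier:    ${mean_all:,.1f}K")
print(f"mean without outlier: ${mean_without_outlier:,.1f}K")
```

Removing one value out of five moves the mean from $2,042K to $52.5K, which shows how strongly a single extreme value can pull the average.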
The median is another way to measure the center of a numerical data set. A statistical median is much like the median of an interstate highway. On a highway, the median is the middle of the road, and an equal number of lanes lie on either side of it. In a numerical data set, the median is the point at which there are an equal number of data points whose values lie above and below the median value. Thus, the median is truly the middle of the data set.
Example: in the above example, the median salary is $54K because there are two values below it ($45K and $50K) and two values above it ($61K and $10,000K).
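The same salaries can be used to sketch how a median is found: sort the values and take the one in the middle. The function below is an illustrative sketch, not the book's procedure. In particular, the text only covers an odd number of values; averaging the two middle values when the count is even is a common convention assumed here, and it matches what Python's built-in `statistics.median` does.

```python
import statistics

# Salaries from the example, in thousands of dollars.
salaries = [50, 45, 54, 61, 10_000]


def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count (assumption, not stated in the text):
    # average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


median_all = median(salaries)
median_without_outlier = median(sorted(salaries)[:-1])

print(f"median with outlier:    ${median_all}K")
print(f"median without outlier: ${median_without_outlier}K")

# Cross-check against the standard library.
print(statistics.median(salaries) == median_all)
```

With the billionaire included the median is $54K, and without the billionaire it is $52K, so the outlier barely moves the median, in contrast to its effect on the mean.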
The next time you hear an average reported, look to see whether the median is also reported. The average and the median are two different representations of the middle of a data set and can often tell two very different stories about the data.
– Statistics for Dummies by Deborah Rumsey