Mean, Median, and Mode: Measuring Central Tendency and Skew in Bell Curves
The term central tendency refers to a single number used to describe the quantitative size and distribution of many numbers, or pieces of raw data. This may be a pool of data concerning a number of test scores, many different people's heights, or daily temperatures. This single number can describe the most probable, or "average," quantitative measure of such a large data pool by considering the data in three possible ways. Each of these three processes results in a different type of central tendency: mean, median, and mode.
Mean, Median, Mode
A central tendency described by the mean is the arithmetic average of the pieces of data. For example, if my weekly hours are recorded as
43, 41, 50, 37, 46, 43, 40, 49, 45, 42, 49, 43 for every week of the last fiscal quarter, then my mean central tendency is found by adding all the pieces of data together and then dividing by the total number of data pieces.
Using the formula mean = (x1 + x2 + x3 + ... + x12)/12, where each x value corresponds to a piece of raw data, in this case we get
(43+41+50+37+46+43+40+49+45+42+49+43)/12 = 528/12 = 44
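As a quick check of the arithmetic, the mean of the weekly hours can be computed in a few lines of Python. This is only a minimal sketch; the variable names (hours, total, count, mean_hours) are illustrative and not part of the original example.

```python
# Weekly hours for the last fiscal quarter, as listed in the text.
hours = [43, 41, 50, 37, 46, 43, 40, 49, 45, 42, 49, 43]

total = sum(hours)          # add every piece of data together
count = len(hours)          # total number of data pieces
mean_hours = total / count  # arithmetic average

print(total, count, mean_hours)  # prints: 528 12 44.0
```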
The median, the middle value of the ordered data, works well to reduce the skewing effects of outliers (abnormally high or low pieces of data).
To find the median, simply line up all the data from low to high. For our example:
37, 40, 41, 42, 43, 43, 43, 45, 46, 49, 49, 50
You then count toward the exact middle of the series. If we keep scratching one score from each end until only the middle remains, we get
43, 43
and then average these two numbers: (43+43)/2 = 43
Had there been an odd number of pieces of raw data, we would land on an exact middle number and the averaging process would be unnecessary.
Had the two middle numbers in an even-numbered set been different, for example 43 and 44, we would average them using the same arithmetic process and would have 43.5 as our median: (43+44)/2 = 43.5
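The odd and even cases described above can be sketched in Python as follows. The function name median and the two short extra lists are hypothetical, chosen only to illustrate each case; Python's standard library offers statistics.median for the same purpose.

```python
def median(values):
    """Return the middle value of the data (illustrative helper)."""
    ordered = sorted(values)   # line the data up from low to high
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: there is one exact middle number, no averaging needed.
        return ordered[middle]
    # Even count: average the two middle numbers.
    return (ordered[middle - 1] + ordered[middle]) / 2


hours = [43, 41, 50, 37, 46, 43, 40, 49, 45, 42, 49, 43]

print(median(hours))             # prints: 43.0
print(median([37, 43, 44, 50]))  # prints: 43.5 (middle pair is 43 and 44)
print(median([41, 43, 50]))      # prints: 43 (odd count)
```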
Mode is another measure of central tendency that helps to minimize a positive or negative skew on the data that may result from a few outliers at either end of the data pool. The mode is simply the piece of data that occurs with the greatest frequency.
In our set of data,
37, 40, 41, 42, 43, 43, 43, 45, 46, 49, 49, 50 we can simply eyeball the mode as 43, since it occurs three times and, aside from 49 occurring twice, all other pieces of discrete data occur with a frequency of one. In a larger pool of data you may have to do a tally or construct a frequency distribution table, marking one hash for each occurrence of a value.
Notice that the number of your hashes should equal the total number of pieces of data; otherwise you've missed something.
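For a larger pool of data, the tally can be automated. The sketch below is one possible approach, using Python's collections.Counter to build the frequency table for the weekly-hours data; the printed layout and the variable names are assumptions made for illustration.

```python
from collections import Counter

hours = [43, 41, 50, 37, 46, 43, 40, 49, 45, 42, 49, 43]

# Count how many times each value occurs.
counts = Counter(hours)

# Print one row per value: the value, one hash per occurrence, and the frequency.
for value in sorted(counts):
    print(f"{value:>3}  {'#' * counts[value]:<3}  {counts[value]}")

# Sanity check from the text: the hashes must add up to the number of data pieces.
assert sum(counts.values()) == len(hours)

# The mode is the value with the highest frequency.
mode_value, mode_count = counts.most_common(1)[0]
print("mode:", mode_value, "occurs", mode_count, "times")  # mode: 43 occurs 3 times
```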
Gaussian Curves, and Positive and Negative Skew
A normal distribution is often depicted as a Gaussian curve, otherwise known as a bell curve in reference to its shape. In a normally distributed Gaussian curve with no skew (see image at top), all measures of central tendency (mean, median, and mode) are exactly the same. This is how you would expect data measuring a construct like scores on an achievement test to fall randomly along a spectrum, with most scores falling near the central tendency.
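A small symmetric sample shows the three measures coinciding. The scores below are hypothetical, invented only to be perfectly symmetric around 3; real test scores would match only approximately.

```python
import statistics

# Hypothetical, perfectly symmetric sample (not from the text).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]

mean_value = statistics.mean(scores)
median_value = statistics.median(scores)
mode_value = statistics.mode(scores)

# With no skew, all three measures of central tendency agree.
print(mean_value == median_value == mode_value)  # prints: True
```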
In a less organic distribution, say salaries within a corporation, you can expect a distribution that is positively skewed, with three measures of central tendency that have three different values (see middle image). In this case using the mean is misleading, because the salaries of those on the corporate board of directors will inflate the mean and not give an accurate picture of what the typical employee is making. A corporation may report in a PR campaign that it pays its employees a mean salary of X amount. This may indeed be the average, but it hides the fact that the majority of employees earn less.
A negative skew is the same idea in reverse: a few unusually low values pull the mean below the median and the mode (see bottom image). The mode can mislead as well. If the modal length of all the fish you caught is three feet, this may mean only that you caught two fish that were three feet long, while neglecting to mention the other ten fish of 6 to 12 inches. The many small fish necessarily drag the mean down and make for a less impressive fishing story.
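Both examples can be made concrete with a short Python sketch. Every number below is invented purely for illustration (the text gives no actual salaries or fish lengths), so treat the results as a demonstration of the effect rather than real data.

```python
import statistics

# Hypothetical salaries: nine ordinary employees and one board member.
salaries = [38000, 40000, 40000, 42000, 45000,
            45000, 45000, 50000, 55000, 400000]

salary_mean = statistics.mean(salaries)      # equals 80000, inflated by one salary
salary_median = statistics.median(salaries)  # equals 45000, closer to the typical employee
below_mean = sum(s < salary_mean for s in salaries)  # 9 of the 10 earn less than the mean

print(salary_mean, salary_median, below_mean)

# Hypothetical fish lengths in inches: two three-foot fish and ten small ones.
fish = [36, 36, 6, 6.5, 7, 8, 8.5, 9, 10, 10.5, 11, 12]

fish_mode = statistics.mode(fish)      # equals 36, the only length that repeats
fish_median = statistics.median(fish)  # equals 9.5
fish_mean = statistics.mean(fish)      # equals 13.375, dragged down by the small fish

print(fish_mode, fish_median, fish_mean)
```

In the salary case the mean overstates the typical value, while in the fish case it is the mode that does so. Reporting the median alongside the other two measures makes either distortion easier to spot.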
These are a few ways that measures of central tendency can be used to mislead and to slant data in support of an agenda.