A look at the word “average”


Related to the symptomatic overuse of percentages in modern articles is the somewhat obscure use of the word “average.” The word is used in so many contexts that it seems to have blended, rather subtly, into the fabric of what one would call a well-informed, educated opinion. From my experience, whether from people’s speech or from articles, I have the impression that whenever some cogency is sought in an argument about people, which can be a very tricky business, a safe move is to use “average.” It is certainly safer than blatantly using the word “most” when the totality is not even known. Let us look at some examples: in this blog, it is mentioned, “It is time the average American had a voice and here it is”; on the Census Bureau website, one can read, “The ‘average American’ makes 11.7 moves in a lifetime (based upon current age structure and average rates of moving by age between 1990 and 1993).” Another one is “The average family of four can use 400 gallons of water every day, and, on average, approximately 70 percent of that water is used indoors.” Or “There is an average of 207,754 victims (age 12 or older) of sexual assault each year” (http://www.rainn.org/statistics). If one browses the net, a rather large number of similar examples can be found.

I’ve been thinking about this phenomenon for quite some time, but only recently have I had the time to think about it in more depth. This leads me to ask about the phenomenological extent of the word or, to put it simply: does one know anything at all from this word? If so, what can be known and what can’t? This last question immediately raises another: is there any discrepancy between what one can know from the word and what it is claimed to tell one?

The online Merriam-Webster mentions that an early meaning of the word was a “proportionally distributed charge for damage at sea,” with modified forms in other languages that carry a meaning closely related to the original. Several other meanings, which seem to be much the same, can be found in Webster.

Whatever meaning one chooses to assign to the word, its ties to the mathematical interpretation are evident; indeed, in mathematics the word can refer to three different things: the mean, the mode, or the median. This suggests that confusion, intentional or not, is likely to happen when the word is used. As a result, my task will be to explore the aforementioned questions in all three cases. I start with the mean: does one know anything at all from it?

Given numbers a, b, and c, the mean of these numbers is a third of their sum. I use “a third” because there are three numbers; for n numbers, it would be one n-th of their sum. Based on this definition, in order to use the mean as a tool, one must know each number and the total number of numbers. (Of course, when the continuous case is considered, additional mathematical tools are used, but with the same underlying idea of a “sum” divided by the number of quantities.) This implies that when those two parameters (each number and the total number of them) cannot be known precisely, the use of the mean for the average is meaningless or, to be somewhat harsh, totally wrong. If one supposes these two parameters are available (by that, I mean the numbers are obtained in some non-controversial way), what one might be able to know from the mean is how some item would be distributed if it were spread evenly. Therefore, the mean can be meaningless if one wants to know how this item is actually distributed, which can be very irregular, rather than the ideal, even case the mean describes. An easy example is this: if John got 95 out of 100 in his literature exam, which might show his strength in the subject, but got 30 in his math exam, the mean, which is 62.5, seems more presentable (at least compared with the math score) and hides the unpleasant fact that John failed his math test.
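To make the point about the mean concrete, here is a minimal Python sketch. John's two scores come from the example above; the second student, `even_student`, is a hypothetical I made up so that both share the same mean. It only illustrates that the mean by itself cannot distinguish an even distribution from an irregular one.

```python
from statistics import mean

# John's two scores from the example (literature, math), out of 100.
john = [95, 30]

# A hypothetical student with perfectly even scores, chosen to share John's mean.
even_student = [62.5, 62.5]

john_mean = mean(john)
even_mean = mean(even_student)

print(john_mean)  # 62.5
print(even_mean)  # 62.5

# The mean alone cannot tell the two students apart; the lowest score can.
print(min(john), min(even_student))  # 30 62.5
```

Both students have the same mean, yet only one of them failed an exam.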

What about the mode? Given a set of quantities, the mode of this set is the quantity that occurs most frequently (the continuous case keeps the same idea but with additional mathematical tools). For instance, if the collection is {a, a, b, c}, then the mode is a. Thus, to obtain the mode of some set, one needs to know the number of times each quantity occurs and be able to compare those numbers. Hiding the mode of a set behind the word “average” might cause confusion, since the mode can be very efficient at masking the real image of a collection of data on the basis of only a few items. As an example, for a collection of 10 scores where 8 are below 50 points out of 100 but mutually different and 2 are 95 points, the average (a.k.a. the mode) would be 95.
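The ten-score example can be checked with a short Python sketch. The eight low scores below are values I picked arbitrarily to satisfy the description (all below 50 and mutually different); only the two 95s come from the text.

```python
from collections import Counter

# Eight distinct scores below 50 (arbitrary, assumed values) plus two scores of 95.
scores = [12, 18, 23, 27, 31, 36, 42, 48, 95, 95]

# Count how often each score occurs and take the most frequent one.
counts = Counter(scores)
mode_value, mode_count = counts.most_common(1)[0]

# How many scores are actually below 50?
below_50 = sum(1 for s in scores if s < 50)

print(mode_value, mode_count)  # 95 2
print(below_50)                # 8
```

Two items out of ten decide the mode, even though eight of the ten scores are below 50.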

Finally, let’s turn to the median. Given a list of quantities (I’ll assume the list to be finite) arranged in increasing order, the median is the “middle” quantity if the total number of quantities is odd, or the mean of the two adjacent ones in the middle if the total number is even (what I’ve said above about the continuous case applies here too). For example, the median of 1, 2, 3 is 2, while the median of 1, 2, 3, 4 is the mean of 2 and 3, so 2.5. Again, the median says what its definition says it is, assuming, again, that those quantities are collected in some unambiguous way. As you may have expected by now, confusion, intentional or not, may arise when using the median since, like the mode, it may not expose the real picture of the data set: considering only the “middle” quantity does not necessarily say anything about acute variations in either half of the list.
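The definition of the median translates almost word for word into code. The sketch below is only an illustration: `median_of` is a helper name of my own, and the lists `calm` and `wild` are made-up data chosen to share a median while differing sharply in both halves.

```python
def median_of(values):
    """Return the median of a finite, non-empty list of numbers."""
    if not values:
        raise ValueError("the median of an empty list is undefined")
    ordered = sorted(values)          # arrange in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: the middle quantity
        return ordered[mid]
    # even count: the mean of the two adjacent middle quantities
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([1, 2, 3]))      # 2
print(median_of([1, 2, 3, 4]))   # 2.5

# Two hypothetical lists with the same median but very different halves.
calm = [49, 50, 50, 50, 51]
wild = [1, 2, 50, 98, 99]
print(median_of(calm), median_of(wild))  # 50 50
```

The last line shows the limitation discussed above: the median is 50 in both cases, although the second list ranges from 1 to 99.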

Going through this, one might realize that the potential for ambiguity is so high that it is rather risky to use the word “average” in an argument, since its cogency can easily be stripped away by asking which meaning is assigned to the word (mean, mode, or median). Beyond that, further questions may be asked as to how the quantities were obtained. I sometimes even suspect that more than one meaning is used within a single article, which may suggest some intention to deceive the readers.
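As a rough illustration of how much the choice of meaning matters, the following Python sketch computes all three “averages” for one and the same list of scores. The data are made up (eight distinct scores below 50 and two scores of 95); the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Made-up scores: eight distinct values below 50 and two scores of 95.
scores = [12, 18, 23, 27, 31, 36, 42, 48, 95, 95]

# Three different numbers that could all be reported as "the average".
averages = {
    "mean": mean(scores),
    "median": median(scores),
    "mode": mode(scores),
}

for name, value in averages.items():
    print(f"{name}: {value}")
# mean: 42.7
# median: 33.5
# mode: 95
```

Depending on which meaning is silently chosen, the “average” score here is 42.7, 33.5, or 95.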

Therefore, beyond the definition of each of its meanings, the amount of knowledge one may get from “average” seems to be limited. For example, when someone claims that some particular group of people, on average, earns some amount of money, what is that supposed to mean? Has the number of people in the group been obtained precisely (I don’t mean a “rough approximation” here)? And is the same true of their salaries? If one’s knowledge fails at those two steps, it seems one need not go any further, unless one chooses to disregard this limitation, as I am afraid is the case in many similar situations.

So, are there other notions, directly related to mathematics, which you think might be overused or misused, which nevertheless have been widely accepted as a legitimate way to transmit a supposed knowledge of some phenomenon?

This entry was posted in AMS, General, Math, Mathematics in Society.
