I received a comment from a reader last week (I have readers? Wow. I did very little "advertising"). Anyway, this person was a bit confused about modeling averages. I can understand that - we all learn how to compute averages in junior high or so, and unless we took an advanced math class, we were never exposed to why the algorithm works or whether there are alternatives to the way we learned (add 'em all up and divide by the number of things you added).
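
For anyone who wants to see the junior high recipe written down, here is a minimal sketch in Python. The function name `mean` and the `scores` list are made up for illustration; nothing here comes from the reader's question.

```python
def mean(values):
    """Arithmetic mean: add 'em all up and divide by how many there are."""
    if not values:
        # Dividing by zero items makes no sense, so refuse an empty list.
        raise ValueError("mean of an empty list is undefined")
    total = 0
    for v in values:
        total += v               # add 'em all up
    return total / len(values)   # divide by the number of things you added


scores = [70, 80, 90, 100]  # hypothetical data
average = mean(scores)
print(average)
```

The loop is spelled out on purpose so each step of the recipe is visible; in everyday Python you would more likely write `sum(values) / len(values)`.
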
I figured a little background reading might help. Nothing too deep - I want to keep this blog away from research math and more accessible to everyone.
Averages are taken for granted nowadays, but that has certainly not always been the case. In fact, in some ways, they were controversial when first introduced. And even "when were they first introduced?" is a tough question to answer. http://www.york.ac.uk/depts/maths/histstat/eisenhart.pdf is a good starting point for digging into that aspect of averages.
The controversial part is pretty easy to understand, and we even joke about it a bit today. "How many children on average does a family have?" is the standard question that leads to answers like 2.5. Obviously, there are no "half kids" running around anywhere, and we tend to laugh off these silly results. Coincidentally, the US believes the ideal number of kids is 2.9. The initial controversy was this: what value is an average if there is no actual, real-world instance of a result having that value? In other words, what use would it be to know that the average family has 2.5 children, when no family has 2.5 children?
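
To make the "2.5 children" puzzle concrete, here is a small sketch using Python's standard `statistics` module. The six family sizes are invented for illustration and are not real survey data.

```python
from statistics import mean

# Hypothetical number of children in six made-up families.
children_per_family = [1, 2, 2, 3, 3, 4]

average_children = mean(children_per_family)
print(average_children)

# True if at least one family actually has the "average" number of children.
any_family_matches = average_children in children_per_family
print(any_family_matches)
```

With these numbers the average comes out to 2.5, yet no family in the list has 2.5 children, which is exactly the situation the early critics objected to.
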
The controversy here was directed at the people who computed averages. They came up with a number - 2.5 in this example - that is impossible to have in the real world. And if you try to tell me your algorithm is a good algorithm, yet it gives me impossible results, then I have to doubt it is "good."
(We will come back to the standard phrase "All models are wrong. Some are useful." later.)
Floating point math is a more difficult concept to cover. I don't want to get into the details of why this is happening in this blog, since there is a huge amount of writing on it already. If you want details, I found a couple of good starting points.
A reasonable introduction to this is on Wikipedia:
A more hands-on, technical overview:
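
The post does not spell out which floating point surprise is meant, so the following is only an assumed illustration of how it can show up when averaging. Ten copies of 0.1 should average to 0.1, but 0.1 has no exact binary representation, so a plain running sum can pick up tiny rounding errors. `math.fsum` from the standard library keeps track of those errors and returns a correctly rounded sum.

```python
import math

values = [0.1] * 10  # assumed example data: ten copies of 0.1

# Naive approach: the built-in sum rounds after every single addition.
naive_mean = sum(values) / len(values)

# Compensated approach: math.fsum tracks the lost low-order bits.
careful_mean = math.fsum(values) / len(values)

print(repr(naive_mean))
print(repr(careful_mean))
print(naive_mean == careful_mean)
```

The two results differ, if at all, only far out in the decimal places, which is why comparisons such as `math.isclose(a, b)` are usually a safer habit than `a == b` when working with averages of floats.
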
Questions, comments, concerns and criticisms always welcome,
John