Thursday, August 11, 2016

Creating the model


Last time I gave an example of a model to compute an average.  The method I used to compute the average was probably the most familiar method there is - basically, add up all the number and divide by the count of the numbers that were added.

This works well on paper but can lead to problems on computers.  One error that can creep in is adding up to a number bigger than the computer can handle.  Humans with pencil and paper can always make numbers bigger (just add more paper) but a computer will have the maximum number it can handle.

32 bit computers are still popular and for this next bit let's assume the largest integer it can handle is 4,294,967,294. This is the number 2 raised to the 32nd power (2 because the machine is binary, and 32 from the 32 bit processor).  Four billion is a large number, but if we are trying to figure out average sales for a really large business, it won't work. When we try to add numbers that total to more than 4 billion, the computer will not know what to do - it can't count that high.

This isn't new: this is a very old problem with computers.  No matter how you do it, there is a number that will be bigger than the computer can handle.  Just fill up all the computer memory with whatever number you want (all 1s, since it is binary) and then add 1 more to it.  The computer doesn't have enough memory to hold that new number. 

So how can we deal with this potential problem?

One way is to change our algorithm.  Our original algorithm is this:
  1. Add all the numbers in the original list
  2. Divide by the count of the numbers in the original list
  3. The answer I get is the result I want

But we can crash at step 1.

One workaround is to look really hard at what the algorithm does.  Our example had 5 grades, so we add them up and divide the total by 5.  What if we divided each number by 5 to begin with, and then added up those results?

Example:
88/5=17.6
91/5=18.2
83/5=16.6
88/5=17.6 
85/5=17

And 17.6+18.2+16.6+17.6+17=87   This was the output of the original algorithm, so this new algorithm looks like it might be useful.

So the new algorithm would be:
  1. Count the number of items in the list
  2. Divide the first number by that count and add the result to a running total
  3. Divide the next number in the list by that count and add the result to the running total
  4. Repeat step 3 until all the items in the list have been processed

Seems reasonable and next up we will give this a try.

Questions, comments, concerns and criticisms always welcome,
John

No comments:

Post a Comment