Thursday, September 8, 2016

My simple epsilon guideline


The general rule I want to follow for validation is that for big numbers, a big epsilon is OK.  For small numbers, a small epsilon is desirable.  In other words, if we are looking at interstellar distances, an error of a few kilometers is probably acceptable, but for microscopic measurements, a few micrometers may be more appropriate.

So my rule - open to any interpretation - is "8 bits past the most precise point in the test data."  Let's look at a case where we want a small epsilon - for example, we are dealing with precise decimal values.

Suppose we have these data points for our test case:
1.1
2.7
8.003

The most precise data point is that last one - 8.003.  The epsilon factor will be based off that.

8 bits of precision means 1 / (2^8), or 1/256, which is approximately 0.0039.  Let's call that .004.  Now shift that past the last digit of 8.003, which is in the thousandths place: 0.004 x 0.001 = 0.000004.  This means anything that is within .000004 of the exact answer will be considered correct.
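Here is a minimal Python sketch of that calculation. The function names (decimal_places, epsilon_for) are made up for illustration, and it assumes the test data is available as decimal strings so the number of decimal places can be counted reliably. It also uses the rounded .004 from above rather than the exact 1/256.

```python
from decimal import Decimal

def decimal_places(text):
    """Count the digits after the decimal point in a numeric string."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)

def epsilon_for(data):
    """Shift 0.004 (1/256, rounded) past the most precise decimal place in data."""
    places = max(decimal_places(value) for value in data)
    # scaleb(-places) multiplies by 10**-places without any rounding.
    return Decimal("0.004").scaleb(-places)

eps = epsilon_for(["1.1", "2.7", "8.003"])
print(eps)  # 0.000004
```

If the data only had one decimal place (say 1.1 and 2.7), the same function would give 0.0004.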

So if we need to average those three numbers:
1.1 + 2.7 + 8.003 = 11.803
11.803 / 3 = 3.934333333…

The exact average is a repeating decimal, so it cannot be written down (or stored in a floating-point value) exactly.  I still need to verify we get close to the answer, so my routine to validate the result will look for the average to be in this range:
3.934333 - 0.000004 = 3.934329
3.934333 + 0.000004 = 3.934337

So if the average we compute is between 3.934329 and 3.934337, it will be considered correct.
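The range check above could be sketched in Python like this. The helper name within_epsilon is hypothetical, and treating the endpoints as passing is an assumption, since the rule does not say whether the boundaries themselves count.

```python
def within_epsilon(computed, expected, epsilon):
    """Return True if computed is no more than epsilon away from expected."""
    return abs(computed - expected) <= epsilon

data = [1.1, 2.7, 8.003]
computed = sum(data) / len(data)

# Expected value and epsilon are the ones worked out in this post.
print(within_epsilon(computed, 3.934333, 0.000004))  # True
print(within_epsilon(3.9344, 3.934333, 0.000004))    # False
```

Python's standard library offers a similar comparison in math.isclose(computed, 3.934333, rel_tol=0, abs_tol=0.000004), which may be a reasonable choice instead of a hand-written helper.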

More on this, and how it can be implemented to enforce even greater accuracy, will come up later.

Questions, comments, concerns and criticisms always welcome,
John





