Monday, August 29, 2016

What we can do, part 2


I've (hopefully) laid out the challenge the test team has with validating models (algorithms) given the constraints of using computers to compute results.  There seem to be some large hurdles to jump and indeed, there are.

On one hand, we could just point to the documentation (IEEE-754) and say "Well, you get the best computers can do."  That only goes so far, though.  The test team still needs to validate the algorithm returns a value that is approximately correct, so we have to define approximately. 

Step 1 is to define a range of values that we expect from an algorithm, rather than a single value.  This is a large step away from "typical" software validation.

For instance, if I send one piece of email with a subject "Lunar position" then I expect to receive one piece of email with the subject "Lunar position".  This is very straightforward and the basis for most test automation.  I expect (exactly) 1 piece of email, the subject needs to be (exactly) "Lunar position" and not "LUNAR POSITION" and the sender name need to be (exactly) the account from which the test email was sent.

What we do on our validations is set an error factor which we call epsilon:  .  This error factor is added to the exact value a pure math algorithm would produce and subtracted from that number as well to give a range of values we expect.  To go back to the average example, we expect a value of exactly 50.  If we set the epsilon to .0001, then we will allow the computer to pass the test if it gives us a number between 50 - .0001 and 50 + .0001.

50  -  =  49.9999 
50 +  = 50.0001
The range of values we would allow to pass would be between 49.9999 and 50.0001. Anything outside of this range would fail.

If the output from a test is 49.99997, we pass the test.  If 50.03 is the result, we would fail the test.

This is pretty simple, with the key challenge being setting a reasonable epsilon.  I'll cover that in the future.

Questions, comments, concerns and criticisms always welcome,
John

No comments:

Post a Comment