Last week I left off
with a test that mimics the user action of re-ordering the criteria used to
create clusters. The clusters themselves
should not change when this happens, and the test verifies that they do not.
I got that failure fixed, and the test
passed 10 times when I ran it locally.
Why 10 times? I have learned that any test which
manipulates the UI can be flaky.
Although my test avoids the UI as much as it can, it still has
elements drawn on screen and might hit intermittent delays while the OS draws
something, or some random window might pop up and steal focus, and so on. So I run my test many times in an attempt to
root out sources of instability like these.
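The repeat-run idea can be sketched as a small loop. This is only an illustration in Python, not the framework I actually use: `run_repeatedly` and `make_scenario_failing_on` are hypothetical names, and the stand-in scenario just fails on chosen iterations instead of driving Tableau.

```python
from typing import Callable, List, Set


def run_repeatedly(scenario: Callable[[], None], runs: int) -> List[int]:
    """Run `scenario` `runs` times and return the 1-based numbers of failed runs."""
    failed = []
    for run_number in range(1, runs + 1):
        try:
            scenario()
        except Exception:
            # Any exception counts as a failed run; keep going to see the pattern.
            failed.append(run_number)
    return failed


def make_scenario_failing_on(bad_calls: Set[int]) -> Callable[[], None]:
    """Hypothetical stand-in for an end-to-end test: raises on the given call numbers."""
    state = {"calls": 0}

    def scenario() -> None:
        state["calls"] += 1
        if state["calls"] in bad_calls:
            raise AssertionError("simulated intermittent failure")

    return scenario


if __name__ == "__main__":
    failures = run_repeatedly(make_scenario_failing_on({7}), runs=10)
    print(f"{len(failures)} of 10 runs failed: {failures}")
```

Collecting the failing run numbers, rather than stopping at the first failure, makes it easier to tell a one-off hiccup from something that fails every few runs.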
I would love to do
more than 10 runs, but the challenge becomes the time involved in running one
of these end-to-end scenarios. There is
a lot of work for the computer to do to run this test. The test framework has to be started (I'm
assuming everything is installed already, but that is not always the case),
Tableau has to be started, a workbook loaded, etc… Then once done, cleanup needs to run, the OS
needs to verify Tableau has actually exited, all logs need to be checked for failures,
and so on. It's not unusual for tests
like this to take several minutes, and for the sake of argument, let's call it 10
minutes.
Running my test 10
times on my local machine means 100 minutes of running - just over an hour and
a half. That is a lot of time. Running 100 times would mean almost 17 hours
of running. This is actually doable -
just kick off the 100x run before leaving to go home and it should be done the
next morning.
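The arithmetic above is easy to reproduce. Here is a minimal Python sketch, assuming the round figure of 10 minutes per run used in this post (it is an estimate for the sake of argument, not a measurement); the function names are made up for illustration.

```python
MINUTES_PER_RUN = 10  # the post's round number, not a measured value


def total_runtime_minutes(runs: int, minutes_per_run: int = MINUTES_PER_RUN) -> int:
    """Total wall-clock minutes for running the test `runs` times back to back."""
    return runs * minutes_per_run


def describe(runs: int) -> str:
    """Format the total run time in minutes, hours and days."""
    minutes = total_runtime_minutes(runs)
    hours = minutes / 60
    days = hours / 24
    return f"{runs:>5} runs: {minutes:>6} min = {hours:6.1f} h = {days:5.2f} days"


if __name__ == "__main__":
    for runs in (10, 100, 1000):
        print(describe(runs))
```

For 100 runs this works out to 1,000 minutes, or about 16.7 hours, which is where the overnight estimate comes from.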
Running more than
that would be ideal. When I say these
tests can be flaky, a 0.1% failure rate is what I am thinking of. On average, a 1000x run would hit a failure like that about once, though even that is not guaranteed. But that now takes almost a week of run
time. There are some things we can do to
help out here, like running in virtual machines and such, but there is also a point
of diminishing returns.
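To put a rough number on how likely a long run is to expose a rare failure, here is a small Python sketch. It assumes every run fails independently with the same probability, which is a simplification (a focus-stealing window may well hit several runs in a row), and the function names are hypothetical.

```python
import math


def chance_of_seeing_failure(failure_rate: float, runs: int) -> float:
    """Probability that at least one of `runs` independent runs fails."""
    return 1.0 - (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, confidence: float) -> int:
    """Smallest run count whose chance of at least one failure reaches `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    rate = 0.001  # the 0.1% failure rate mentioned above
    for runs in (10, 100, 1000):
        print(f"{runs:>5} runs: {chance_of_seeing_failure(rate, runs):.1%} chance of a failure")
    print(f"runs for 95% confidence: {runs_needed(rate, 0.95)}")
```

Under that independence assumption, a 1000x run has only about a 63% chance of showing a 0.1% failure even once, and getting to 95% takes roughly 3,000 runs. That is the diminishing-returns problem in numbers.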
Plus, consider the
random window popping open that steals focus and can cause my test to
fail. This doesn't have anything to do
with clustering - that works fine, and my test can verify that. This is a broader problem that affects all
tests. There are a couple of things we can do about that, which I will cover
next.
Questions, comments, concerns and criticisms always welcome,
John