Extraordinary Squares: Anscombe's Quartet is not especially good for software testing

In statistical circles, there is a data set known as Anscombe's Quartet. Robert Kosara's blog does a good job of showing off the data set and how radically different-looking data can have similar statistical properties.

The data set was designed so that its columns of data points share the same summary statistics, with standard deviation as one example. The match is only approximate, though. The population standard deviation of the first Y column is 1.937024, while that of the second Y column is 1.937109. In both cases, I am rounding the numbers to the 6th decimal place.

The difference between the two values is 1.937109 - 1.937024 = 0.000085. That is very close, and almost certainly too small for a human eye to pick out of a plot of the data. For making a point about the data, namely that similar statistical properties cannot be used to determine the shape of a data set, this is good enough. But computers have enough precision that a difference of 0.000085 is easily detected, and the 2 columns of data would be clearly distinguishable.
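
A minimal Python sketch of that comparison follows. The Y values are the standard published Anscombe columns. The use of the population standard deviation (`statistics.pstdev`) is an assumption, made because it reproduces the figures quoted above, and the variable names are only illustrative.

```python
from statistics import pstdev

# First two Y columns of Anscombe's Quartet (standard published values).
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

# Assumption: population standard deviation (divide by n, not n - 1).
sd_y1 = pstdev(y1)
sd_y2 = pstdev(y2)

print(f"{sd_y1:.6f}")  # 1.937024
print(f"{sd_y2:.6f}")  # 1.937109

# Close to the eye, but not a tie as far as the computer is concerned.
print(sd_y1 == sd_y2)  # False
```

An exact `==` comparison is used here on purpose: a tie-breaking rule only fires when the two values really are identical, and these are not.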

As a side note, I did test clustering with the data set just for fun. Folks around here grinned a bit, since this data set is pretty well known, and thought it would be fun to see the results.

But as a test case, it really is not very good at all. The challenge here would be to come up with 2 columns of data that have exactly the same standard deviation (subject to rounding errors) and use them to validate any tie-breaking rules we might have for this condition. One easy way to do this would be to reverse the signs of the numbers from column 1 when creating column 2. Then make a quick check that the two standard deviations are the same, to validate that rounding behaves the same for positive and negative values.
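
The sign-reversal idea could be sketched roughly as below. This is only an illustration: it reuses the first Y column of Anscombe's Quartet as a stand-in for "column 1", assumes the population standard deviation, and the names `col1`, `col2`, and `is_exact_tie` are hypothetical.

```python
from statistics import pstdev

# Column 1: first Y column of Anscombe's Quartet (standard published values).
col1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

# Column 2: the same numbers with the signs reversed.
col2 = [-v for v in col1]

sd1 = pstdev(col1)
sd2 = pstdev(col2)

# Quick check that this really is an exact tie before relying on it
# to exercise tie-breaking rules.
is_exact_tie = (sd1 == sd2)
print(sd1, sd2, is_exact_tie)
```

If the check ever came back false on a given platform or library, the pair would not be a valid tie-breaking test case there, which is exactly why the check is worth running.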

Another way would be to scale the data, but that can result in different rounding behavior.
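
One way to probe that rounding concern, sketched under assumptions: scale a column by a factor, compute the standard deviation, divide the factor back out, and compare against the original. The factor of 3 is hypothetical and chosen only for illustration, and the population standard deviation is again assumed.

```python
import math
from statistics import pstdev

# First Y column of Anscombe's Quartet (standard published values).
col1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]

factor = 3.0  # hypothetical scale factor, for illustration only

# Each product is rounded to the nearest representable double.
scaled = [v * factor for v in col1]

sd_original = pstdev(col1)
sd_rescaled = pstdev(scaled) / factor

# Mathematically these are equal; in floating point the last bits may differ.
print(sd_original == sd_rescaled)
print(math.isclose(sd_original, sd_rescaled, rel_tol=1e-12))
```

The exact comparison may or may not hold depending on the data and the factor, so a scaled column should not be assumed to produce a true tie without checking.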

Even though this data set is not a good test case, the idea behind it is very valid. Make sure you test with values that are exactly the same, and keep in mind that you need to validate "exactly the same" for any given data set.

Since this is such a well-known data set, I wanted to share my thoughts on it and why it is not all that useful for testing computer software. The precision of the computer needs to be taken into account.

Questions, comments, concerns and criticisms always welcome,

John
