Extraordinary Squares: Cleaning up ownership of some legacy tests

The Statistics team used to be part of a combined team here at Tableau called Statistics and Calculations. Before I started, a small change made Statistics its own standalone team. All has been good since, with one small problem: all of the tests the combined team owned are still labelled as being owned by "Statistics and Calculations". When a test fails, the person who notices usually just assigns it to Statistics (since our name comes first in that string, I guess), even if the functionality being tested is not owned by our team.

An example of a feature we own is Clustering. We wrote that and own all the tests for it. An example of a feature we do not own now that we are a standalone team would be table calculations.

Anyway, we have hundreds of tests that need to be retagged. I decided to take on this work so that, if a test fails, it can be routed to the right owner immediately. The first challenge was just getting a list of all the files I needed to edit. The lowly DOS command (DOS? Isn't that going on 40+ years old?) "findstr" was incredibly useful. I just searched every file in our repository for the old string "Statistics and Calculations" that I needed to change.

Now I had a list of all the files in a weird DOS syntax. Example:

integration_tests\legacytest\main\db\RegexpMatchFunctionTest.cpp: CPPUNIT_TEST_SUITE_EX( RegexpMatchFunctionRequiredUnitTest, PRIMARY_TEAM( STATISTICS_AND_CALCULATIONS_TEAM ), SECONDARY_TEAM( VIZQL_TEAM ) );

Also notice that the path is incomplete - it just starts with the \integration_tests folder and goes from there.

The actual list of files is well over a hundred, and my next task was to clean it up. I thought about hand editing the file using Find and Replace and manually cutting out the stuff I did not need, but that would have taken me an hour or two. Plus, if I missed a file, I would potentially have to start over, or at least figure out how to restart the process with changes. Instead, I decided to write a little Python utility to read through the file, find the (partial) path and filename, and remove everything else in each line. It then corrects the path and adds the command I need to actually make the file editable. Our team uses Perforce, so this was just adding "p4 edit " to the start of each file. Fixing the path was pretty simple too - just prepend the folder name I was in when I ran findstr.
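For illustration, here is a minimal sketch of what the per-line step could look like. This is not the code I actually ran: the function name to_p4_edit and the ROOT folder are made up for the example, and it assumes every findstr line has the form "relative\path: matched text" with no colon inside the path itself.

```python
import ntpath  # Windows path rules, usable on any OS

# Hypothetical root: the folder findstr was run from (not given in this post).
ROOT = r"C:\workspace\project"


def to_p4_edit(findstr_line, root=ROOT):
    """Turn one line of findstr output into a 'p4 edit' command.

    Returns None for lines that do not look like 'path: text'.
    """
    # Everything before the first colon is the partial path;
    # everything after it is the matched source text, which we drop.
    rel_path, sep, _matched_text = findstr_line.partition(":")
    rel_path = rel_path.strip()
    if not sep or not rel_path:
        return None
    # Prepend the folder the search started in to get a full path.
    full_path = ntpath.join(root, rel_path)
    return "p4 edit " + full_path
```

Feeding it the example line above with the default ROOT gives a single "p4 edit" line pointing at RegexpMatchFunctionTest.cpp under that folder. If findstr had printed absolute paths with a drive letter, splitting on the first colon would be wrong and the parsing would need to change.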

Finally, I cleaned out duplicate file names and ran my code. It created a 13K batch file ready for me to get to work, and if I need to update, I can just run my code again. Kind of like reproducible research - at least that is how I think of it.
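Duplicates show up because findstr prints one line per match, so a file with several tagged test suites appears several times. A minimal sketch of the dedupe step, again not my original code and with build_batch as a made-up name, could be:

```python
def build_batch(commands):
    """Drop duplicate commands, keep first-seen order, return batch file text."""
    # dict.fromkeys keeps insertion order (Python 3.7+), so this removes
    # repeats without shuffling the remaining commands.
    unique = list(dict.fromkeys(commands))
    # Batch files conventionally use CRLF line endings.
    return "\r\n".join(unique) + "\r\n"
```

The returned text can then be written to a file such as checkout.bat (a hypothetical name), opened with newline="" so Python does not translate the line endings a second time on Windows.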

I can post the code if anyone is interested, but it is pretty basic stuff.

Questions, comments, concerns and criticisms always welcome,

John

Monday, November 28, 2016
