The Statistics team used to be part of a combined team here at Tableau called
Statistics and Calculations. Before I started, a small change was made to make
Statistics its own standalone team. All has been good since, with one small
problem - all of the tests the combined team owned are still labelled as being
owned by "Statistics and Calculations."
When a test fails, the person who notices usually just assigns the test
to Statistics (since our name comes first in that string, I guess), even if the
functionality being tested is not owned by our team.
An example of a feature we own is Clustering. We wrote that and own all the
tests for it. An example of a feature we do not own, now that we are a
standalone team, would be table calculations.
Anyway, we have
hundreds of tests that need to be retagged.
I decided to take on this work in order to properly tag ownership of the
tests. This way, if a test fails, it can be routed to the right owner
immediately. The first challenge was just getting a list of all the files that
I needed to edit. The lowly DOS command (DOS? Isn't that going on 40+ years
old?) "findstr" was incredibly useful. I just searched every file in our
repository for the old string "Statistics and Calculations" that I needed to
edit.
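For anyone without findstr handy, the same search can be roughly approximated in Python. This is only a sketch of the idea, not what I actually ran: the function name `find_files_containing` is made up for illustration, and it assumes the files can be read as text.

```python
import os


def find_files_containing(root, needle):
    """Return sorted paths (relative to root) of files whose text contains needle.

    Hypothetical helper for illustration; findstr did this job in the post.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets us skim binary-ish files without crashing.
                with open(full_path, "r", encoding="utf-8", errors="ignore") as handle:
                    if needle in handle.read():
                        matches.append(os.path.relpath(full_path, root))
            except OSError:
                # Unreadable file (locked, permissions): skip it.
                continue
    return sorted(matches)
```

Unlike findstr, this returns only the file names and not the matching lines, so there is less cleanup to do afterwards.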
Now I had a list of
all the files in a weird DOS syntax.
Example:
integration_tests\legacytest\main\db\RegexpMatchFunctionTest.cpp: CPPUNIT_TEST_SUITE_EX( RegexpMatchFunctionRequiredUnitTest, PRIMARY_TEAM( STATISTICS_AND_CALCULATIONS_TEAM ), SECONDARY_TEAM( VIZQL_TEAM ) );
Also notice that the
path is incomplete - it just starts with the \integration_tests folder and goes
from there.
The actual list of files is well over a hundred, and my next task was to clean
up this list. I thought about hand editing the file using Find and Replace and
manually cutting out the stuff I did not need, but that would have taken me an
hour or two at least. Plus, if I missed a file, I would potentially have to
start over, or at least figure out how to restart the process with changes.
Instead, I decided to write a little Python utility to read through the file,
find the (partial) path and filename, and remove everything else in each line.
It then corrects the path and adds the command I need to actually make the
file editable. Our team uses Perforce, so this was just adding "p4 edit " to
the start of each file. And fixing the path was pretty simple - just prepend
the folder name I was in when I ran findstr.
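The per-line cleanup can be sketched like this. It is a minimal illustration rather than my actual utility: the function names and the `C:\work` base folder are hypothetical, and it assumes every line of findstr output has the form `partial\path\File.cpp: matched text`, with a relative path that contains no colon of its own.

```python
def path_from_findstr_line(line):
    """Return the partial path from one findstr output line, or None.

    Assumes the path is everything before the first colon.
    """
    path, separator, _matched_text = line.partition(":")
    path = path.strip()
    if not separator or not path:
        return None  # blank line or no "path:" prefix
    return path


def p4_edit_command(partial_path, base_dir):
    """Prepend the folder findstr was run from and the 'p4 edit ' command."""
    full_path = base_dir.rstrip("\\") + "\\" + partial_path.lstrip("\\")
    return "p4 edit " + full_path


# Usage with the example line from above; "C:\\work" is a made-up base folder.
example = ("integration_tests\\legacytest\\main\\db\\RegexpMatchFunctionTest.cpp: "
           "CPPUNIT_TEST_SUITE_EX( RegexpMatchFunctionRequiredUnitTest, ...")
print(p4_edit_command(path_from_findstr_line(example), "C:\\work"))
```

Splitting on the first colon is the simplest thing that works for relative paths; if findstr had been run with absolute paths, the drive letter's colon would need special handling.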
Finally, I cleaned out duplicate file names and ran my code. It created a 13K
batch file ready for me to get to work, and if I need to update, I can just run
my code again. Kind of like reproducible research - at least that is how I
think of it.
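Removing duplicates and writing the batch file might look something like the sketch below. Again, this is an illustration under assumptions, not the code I ran: the function names are invented, and it assumes each input line looks like `partial\path\File.cpp: matched text`.

```python
def build_batch_lines(findstr_lines, base_dir):
    """Turn raw findstr output lines into unique 'p4 edit' commands.

    A file with several matching lines appears several times in the findstr
    output; the set keeps only the first occurrence and preserves order.
    """
    seen = set()
    commands = []
    for line in findstr_lines:
        path, separator, _matched_text = line.partition(":")
        path = path.strip()
        if not separator or not path or path in seen:
            continue
        seen.add(path)
        commands.append("p4 edit " + base_dir.rstrip("\\") + "\\" + path)
    return commands


def write_batch_file(commands, out_path):
    """Write one command per line, using Windows line endings."""
    with open(out_path, "w", newline="\r\n") as handle:
        for command in commands:
            handle.write(command + "\n")
```

Because the whole thing is driven by the saved findstr output, rerunning after a new search regenerates the batch file from scratch, which is what makes the process repeatable.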
I can post the code
if anyone is interested, but it is
pretty basic stuff.
Questions, comments,
concerns and criticisms always welcome,
John