Tuesday, May 23, 2017

Sharing lessons from moving test code around


I mentioned 2 weeks ago that I was moving some tests around within our codebase.  That work is still happening and will almost certainly continue for quite some time.

One other task I am taking on simultaneously is quantifying the cost of moving these tests.  This ranges from simply tracking the hours I spend to including the time others need to review the code and the time needed to validate that the tests achieve the same coverage once they have been moved.

I'm also taking a stab at quantifying how difficult moving a test can be.  For instance, a traditional unit test that happens to be in a less than ideal location is a good candidate for an almost pure "copy and paste" move.  Since the test is sharply focused and has few dependencies, it is very simple to relocate.
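To make that concrete, here is a made-up sketch of the easy case - the class and function names are invented for illustration, not taken from our codebase:

void StatsTest::testMean()
{
    // Only the function under test and the assert macro - nothing to untangle.
    std::vector<double> values = { 2.0, 4.0, 6.0 };
    CPPUNIT_ASSERT_EQUAL(4.0, ComputeMean(values));
}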

Other tests that start by loading a workbook in order to validate that a column is drawn correctly (I am making up an example) have many dependencies that have to be untangled before the test can be moved.  This is at best a medium-difficulty task and can easily take a large amount of time depending on how tightly the test and product code are woven together.
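A made-up sketch of that harder case might look something like this (again, every name and path here is invented for illustration):

void ColumnDrawingTest::testColumnWidth()
{
    // Each line below pulls in another piece of the product - file I/O, workbook
    // parsing, rendering - and each one is a dependency that has to be untangled
    // before this test can move to a smaller module.
    Workbook wb = WorkbookLoader::LoadFromFile("samples/regional_sales.twb");
    Worksheet sheet = wb.GetSheet(0);
    ColumnRenderer renderer(sheet);
    CPPUNIT_ASSERT(renderer.GetColumnWidth(0) > 0);
}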

For now, I am making notes on how to untie those knots and moving the tests that are easy to move.  Once I am done with my notes, I intend to look through them for common patterns and good starting points, and use that data to develop a plan for untangling the next round of tests.  And of course I will share this with others, since I doubt I will have enough time - or energy :) - to do all this myself.

Questions, comments, concerns and criticisms always welcome,
John

Monday, May 15, 2017

All Hands Week


This is a bit of an unusual week.  We have booked the Washington State Convention Center in downtown Seattle for our annual company meeting.  "All Hands" is the Navy phrase we use to indicate that the entire company attends - we go over business strategy, technical planning, development-specific tasking, Tableau Conference planning and so on.

Obviously (or maybe not so obviously), I can't write much about any of this.  This will be my second such event, and I learned a lot last year.  Now that I know where to focus, I expect this year to be even better!

Otherwise, I am still moving unit tests to better locations.  The easy tests to move will likely fill this week for me and then next week the work gets more challenging.  Stay tuned!

Questions, comments, concerns and criticisms always welcome,
John

Wednesday, May 10, 2017

Moving unit tests to better locations


I spent last week identifying and removing dead code.  For what it is worth, the biggest challenge there is proving the code is not actually used.  If you know of a way to tell whether an operator overload is actually called, let me know…
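Part of what makes that hard is that operator overloads are usually invoked implicitly, so a plain text search for call sites comes up empty.  A generic illustration (not our code) of why coverage data is more trustworthy than grep here:

#include <algorithm>
#include <string>
#include <vector>

struct Employee
{
    std::string name;
    int         id;
};

// Is this overload dead code?  Searching for "operator<" finds the definition,
// but call sites never mention it by name...
bool operator<(const Employee& a, const Employee& b)
{
    return a.id < b.id;
}

void SortRoster(std::vector<Employee>& roster)
{
    // ...because std::sort calls operator< implicitly, with no textual hint at
    // the call site.
    std::sort(roster.begin(), roster.end());
}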

This week I am focused on moving some of our unit tests to a more appropriate location.  Some of our older tests are part of a large module that runs tests all across the product.  For instance, suppose I want to test Kmeans clustering.  As it stands right now, I either have to work some command line magic to get just those tests to run, or I run that entire module, which tests areas I am not interested in (like importing from Excel).

A better place for the Kmeans tests would be the same module that holds the Kmeans code.  That way, when I run the tests, I focus only on the code I am interested in and don't need to worry about Excel importing.  There are also some speed benefits when building the test code.  Right now, the old project has references all over the product.  It has to have those wide-ranging references since it holds such a wide variety of tests.  That means a lot of file copying and the like at compile time.
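To give a sense of the end state, here is a rough sketch of what a relocated Kmeans test might look like once it sits next to the Kmeans code - the types and function names are placeholders, not our actual API:

#include <cppunit/extensions/HelperMacros.h>
#include <cstddef>
#include <vector>
#include "Kmeans.h"   // hypothetical header living in the same module

class KmeansTest : public CppUnit::TestFixture
{
    CPPUNIT_TEST_SUITE(KmeansTest);
    CPPUNIT_TEST(testTwoObviousClusters);
    CPPUNIT_TEST_SUITE_END();

public:
    void testTwoObviousClusters()
    {
        // Two well-separated groups of points should come back as two clusters.
        std::vector<Point> points = { {0, 0}, {0, 1}, {10, 10}, {10, 11} };
        KmeansResult result = RunKmeans(points, /*k=*/2);
        CPPUNIT_ASSERT_EQUAL(std::size_t(2), result.clusters.size());
    }
};

CPPUNIT_TEST_SUITE_REGISTRATION(KmeansTest);

Because the test only includes headers from its own module, building and running it no longer drags the rest of the product along.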

One of the other benefits I expect to see when done is that the time to build that older test project will shrink because I am removing code from it.  As I move the code to the correct module, I am updating all the references it used so that the number of references needed to build is minimized.  So my module's build time will go up, but by less than the time saved on the older test pass.

There is one final benefit to all of this.  When I build my module now, I also have to build the older test code.  That is necessary since I need the tests it provides in order to validate any changes being made in my module.  Once I am done with this task, I will only need to build my module in order to test it, since all the tests will be part of the module.  I will no longer have to "pay the price" of building that older test project.

Questions, comments, concerns and criticisms always welcome,
John

Monday, May 1, 2017

Tabpy on a Pi Tablet !


I built a Raspberry Pi-powered tablet last week and brought it in to work.  Naturally, I couldn't resist the near alliteration of "tabpy on a pi-tab," so I pip installed tabpy:

Running tabpy on a Raspberry Pi Tablet


A Pi is pretty low powered, so it won't run fast, but it should be fun to play with.

Questions, comments, criticisms and complaints always welcome,
John

Wednesday, April 26, 2017

Removing Dead Code


Last week I was working on code coverage.  One of the results I saw is that some of the source code we own is not used by any of our testing - it is 0% covered.  Digging into this, I found out that this code is not used at all by Tableau, so I intend to start removing it from our codebase.

Code that is not used by the product you are working on is often referred to as "dead code," and there are a few ways this can happen.  One obvious way is that existing functionality simply gets provided by something else.  Let's say you had a very slow Bubble Sort routine to sort a list.  Once you learn a faster sorting algorithm, like Quick Sort, you start using Quick Sort instead.  If you are not diligent when making the code change to use Quick Sort, the Bubble Sort code can get left behind.  It is not used at that point and becomes "dead code."
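A generic illustration of how that leftover looks in practice (made-up code, not ours):

#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// The replacement everyone now calls.
void SortValues(std::vector<int>& values)
{
    std::sort(values.begin(), values.end());
}

// The original routine: nothing calls it anymore, but it still sits in the
// repository and still has to be maintained - dead code.
void BubbleSortValues(std::vector<int>& values)
{
    for (std::size_t i = 0; i + 1 < values.size(); ++i)
        for (std::size_t j = 0; j + 1 < values.size() - i; ++j)
            if (values[j + 1] < values[j])
                std::swap(values[j], values[j + 1]);
}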

Similarly, if you need to implement sorting, you could try Quick Sort, Insertion Sort and Merge Sort (for instance).  Once you generate test data and profile the time spent by each routine, you can make a decision about which one to use.  Again, if you don't remove the routines you don't use, they become "dead code."

After digging into the code coverage numbers, I found a few instances of the second case.  Since this code is not used at all, it doesn't get compiled, which helps mitigate having it around.  But it still results in a good amount of unneeded code stored on everyone's hard drive, maintained in our repository and so on.  The best practice is to just get rid of it, and that is what I am working on now.

Questions, comments, concerns and criticisms always welcome,
John

Wednesday, April 19, 2017

Working on a code coverage task this week


For this week I am focused on getting code coverage numbers for our team.

Challenge #1 is pretty simple to state - get a list of all the source files that our team owns.  And while the problem is easy to understand, the real-world implications are a bit trickier, especially with the older files we have.  As the years go by, ownership of unchanging source files gets a little fuzzy.  The team (or developer) that created the original source file may be long gone.  Even teams that own a file might have been reorganized - several times over - since the file was checked in.

So if Carol created the original "stats.cpp" file, she may be the only person ever to have edited it.  If she moves to another team, and her old team gets reorganized, ownership can drop to the bottom of the list of problems to address.  After all, if the code is stable, why spend resources on tracking who should be associated with it?

But after a while, every company has this challenge.  That is what I am sorting out this week.

Fortunately, Tableau has been pretty good with naming conventions for source files.  For example, all Cluster related files have the text "cluster" in them.  For most of the features my team owns, I can simply search by file name to get a good starting point for the files we own.  Getting that list together, parsed and cleaned up is my goal for the week.
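As a rough sketch of that kind of scan (the root directory and keyword list below are placeholders, and this uses C++17 std::filesystem rather than whatever tooling we end up with):

#include <algorithm>
#include <cctype>
#include <filesystem>
#include <iostream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

int main()
{
    const fs::path root = "src";                                  // assumed source root
    const std::vector<std::string> keywords = { "cluster", "forecast" };

    for (const auto& entry : fs::recursive_directory_iterator(root))
    {
        if (!entry.is_regular_file())
            continue;

        // Lowercase the file name so the match is case-insensitive.
        std::string name = entry.path().filename().string();
        std::transform(name.begin(), name.end(), name.begin(),
                       [](unsigned char c) { return std::tolower(c); });

        for (const auto& keyword : keywords)
        {
            if (name.find(keyword) != std::string::npos)
            {
                std::cout << entry.path() << '\n';                // candidate file we own
                break;
            }
        }
    }
    return 0;
}

The output still needs to be parsed and cleaned up by hand, but it makes a good starting point.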

After that, I think I may need to move to class-based ownership.  More on that next time.


Questions, comments, concerns and criticisms always welcome,
John

Tuesday, April 11, 2017

A nifty site to see what a compiler does to C++ code


While investigating a very intermittent unit test failure this week, I noticed an anomaly in our code.  The test is written in C++ and had an extra semicolon in it:

We had a test that did something like this:

Worksheet wb = wbc->GetSheet();
;
CPPUNIT_ASSERT(blah blah);

Notice that extra ; in the middle?  Since the test fails intermittently, I was looking for anything unexpected in the code.  This is unexpected, but I also needed to know whether it was important.

Matt Godbolt created a terrific site that lets you put in C++ code and see what output various compilers produce.  The site is here: https://gcc.godbolt.org/

You can choose different compilers and I just took a look at gcc 6.3 to see if it would ignore an extra ;. 

Here's my test code:
void test()
{
    int x = 1;
    ;
}

And here is the output:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], 1
        nop
        pop     rbp
        ret

I get the same output with or without the extra semicolon.  This is great, since I would expect the compiler to get rid of empty statements like this.  Since the compiler does indeed ignore this typo in the code, I can move on to other avenues of investigation.

Give this site a whirl.  You can choose several different compilers and chip options, pass parameters in and so on.  Thanks Matt!

Questions, comments, concerns and criticisms always welcome,
John