Tuesday, August 15, 2017

I'm waiting for this book about using Tableau with R

Last week I wrote a bit about the Matlab support with Tableau.  Our team also owns R support (it is all built on a REST API) and many people have been using R for years with Tableau.  Talks about R integration are a very popular topic at Tableau Conference each year as well, so there has been a tremendous amount of interest in this area.

So much interest, in fact, that Jen Stirrup has written a new book that is due out pretty soon, Advanced Analytics with R and Tableau. It will be available in paperback and ebook formats and is due out on September 6.  That is less than a month away as I write this and I am looking forward to getting my copy.

Good luck, Jen.  I hope you sell many copies of this book!

Questions, comments, concerns and criticisms always welcome,


Monday, August 7, 2017

Matlab and Tableau!

Kudos to the folks at Mathworks for their efforts to bring Matlab into the Tableau world!  Details about this are here.  Nice job, gang!

As for testing, this is one of my team's areas.  We also own R and Python integration, so we had test plans for this area well established.  Mathworks was so good at their implementation that there was frankly not much for us to do - a couple of string change requests and that was really about it.  We added some automated tests to validate the behavior is correct - to tell us if we change something that would break using Matlab server - and have that up and running now.  The side benefit to the automation is that we have no manual testing left from this effort.  This means that we are not slowed down at all in the long term even though we have added new functionality.  From a test point of view, this is the ideal case.

We never want to build up manual test cases over time.  That growth, if there is any, will eventually add up to more time than the test team has available.  Obviously, this doesn't work in the long term, so we have made a concerted effort to get 100% of our test cases automated.

So, yay us!

And thanks again to Mathworks. FWIW, I truly like Matlab.  It is very easy to look at some mathematical equation and simply type it into Matlab - it almost always works the very first time I try it.

Questions, comments, concerns and criticisms always welcome,

Thursday, August 3, 2017

Cartographies of Time - a mini-review

We have an internal library here at Tableau and like any library, we can check out books to read or study.  We had the same setup at Microsoft as well, with a heavy emphasis on technical books.  Any computer company will have books on programming habits, design patterns, Agile and other fields like this.

Tableau also has a large section on data visualizations.  The whole spectrum is covered here from books on how to efficiently write a graphics routine to how to best present data on screen in human readable form. 

A new book arrived this last week called Cartographies of Time and it is a history of the timeline.  I saw it on the shelf and grabbed it since I am a fan of medieval maps and the cover has a map in that style on it.  It is a fascinating book that covers the very first attempts at timelines and brings us up to the modern day.

The most striking aspect of this so far - I've not gotten too far into the book - is the sheer artistic skill of the early timelines.  The people that created those timelines worked very hard to get a vibrant image, a workable color scheme and a tremendous amount of data all put into one chart.  It is simply amazing to see this and if you have the opportunity I recommend picking up a copy of this book for yourself.

Questions, comments, concerns and criticisms always welcome,

Thursday, July 27, 2017

A brief aside about data at the Tour de France

I'm a bike race fan and I really enjoy watching the stage races like the Tour de France.  The colors, speed and racing are just a great spectacle.

One of the teams that was there this year is Dimension Data.  They use Tableau to analyze the TONS of data they get on the riders and I read, re-read and read again this article on how they do it: https://www.dcrainmaker.com/2017/07/tour-de-france-behind-the-scenes-how-dimension-data-rider-live-tracking-works.html

Now, if I can just get myself invited along on a race to help them with Tableau…

And congratulations to Edvald Boasson Hagen!

Questions, comments, concerns and criticisms always welcome,

Friday, July 21, 2017

Paying down test debt, continued

Last week I mentioned that some old test automation breaks while it is disabled.

As an example, suppose I added a test back in 2014 to check that the 22 French regions are labelled properly.  It works for a year, but then France announces it will consolidate its regions in January 2016.  While working on that change, I disable my test since I know it won't provide any value while the rest of the changes are in progress.

Then I forget to turn the test back on and don't notice that until after the change.

In this case, the fix is straightforward.  I change my test to account for the real world changes that happened while it was disabled.  Here, I take out the list of the 22 regions and replace that list with the 13 new ones.
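As a rough sketch of that kind of fix, here is a cut-down Python version that covers only one of the mergers (Alsace, Champagne-Ardenne and Lorraine became the region now called Grand Est).  The function get_region_labels is a hypothetical stand-in for the product code, not a real Tableau API, and the test names are made up for illustration.

```python
# Hypothetical stand-in for the product code under test (not a real Tableau API).
def get_region_labels():
    # After the consolidation, the product reports the merged region.
    return ["Grand Est"]


# What the disabled test still expected: the pre-2016 region names.
OLD_EXPECTED = ["Alsace", "Champagne-Ardenne", "Lorraine"]

# The fix: swap in the post-consolidation names.
NEW_EXPECTED = ["Grand Est"]


def labels_match(expected):
    """Compare the product's labels to the expected list, ignoring order."""
    return sorted(get_region_labels()) == sorted(expected)


def test_region_labels():
    # Only the expected data changed; the test logic stays the same.
    assert labels_match(NEW_EXPECTED)
```

Keeping the expected names in one list means the repair is a data change rather than a rewrite of the test.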

This pattern - the code being tested changes while the test is disabled - is common.  In almost all cases, simply changing the test to account for the new expected behavior is all that needs to be done to enable the test.  So I typically make that change, enable the test, run it a few thousand times and if it passes every time, leave it enabled as part of the build system moving forward.
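The "run it a few thousand times" step could look something like the sketch below.  This is only an illustration, not our actual build tooling: run_repeatedly is a hypothetical helper, and it treats any exception raised by the test as a failure.

```python
def run_repeatedly(test_fn, runs=1000):
    """Call test_fn `runs` times and return how many calls raised an exception."""
    failures = 0
    for _ in range(runs):
        try:
            test_fn()
        except Exception:
            # Any exception, including a failed assert, counts as one failure.
            failures += 1
    return failures


def example_test():
    # Placeholder for the re-enabled test.
    assert 1 + 1 == 2


if __name__ == "__main__":
    failed = run_repeatedly(example_test, runs=2000)
    print(f"{failed} failures out of 2000 runs")
```

A test that fails even once in a run like this is intermittent and probably should not go back into the build yet.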

Sometimes the tests are more complicated than I know how to fix.  In that case, I contact the team that owns the test and hand off the work to enable it to them.

All in all, this is a simple case to handle. 

There is also the case that the test is no longer valid.  Think of a test that validated Tableau worked on Windows Vista.  Vista is no longer around, so that test can simply be deleted.

Other factors can change as well, and I'll wrap this up next week.

Questions, comments, concerns and criticisms always welcome,

Wednesday, July 12, 2017

Paying down test debt

Another aspect of my work recently has been paying down technical debt we built over the years.  An example of technical debt would be this:
  1. Imagine we are building an application that can compute the miles per gallon your car gets
  2. We create the algorithm to compute miles per gallon
    a. We add tests to make sure it works
    b. We ship it
  3. Then we are a hit in the USA!  Yay!
  4. But the rest of the world wants liters per 100 kilometers.
  5. We add that feature
    a. As we add it, we realize we need to change our existing code that only knows about miles
    b. We figure it will take a week to do this
    c. During this week, we disable the tests that test the code for "miles"
    d. We finish the liters per 100km code
    e. We check in
  6. We ship and the whole world is happy

But look back at step 5c.  The tests for miles (whatever they were) were disabled and we never turned them back on.  We call this "technical debt" or, in this case since we know it is test related, "test debt."  It happens when we take shortcuts like 5c - disabling a test.  A better practice would have been to ensure that none of the new code we wrote for the metric values could break the MPG code, and to never disable the test in the first place.  In the real world, the most likely reason to disable a test is speed - I simply want to test my new code quickly and don't want to run all the tests over the old code I am not changing, so I disable the old tests for right now and plan to re-enable them when I am done.  (Or so I say...)
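To make the example concrete, here is a minimal Python sketch of the fuel economy code with both sets of tests left enabled.  All of the function names are made up for illustration, and the conversion assumes US gallons.

```python
# Exact definitions: 1 US gallon = 3.785411784 liters, 1 mile = 1.609344 km.
LITERS_PER_US_GALLON = 3.785411784
KM_PER_MILE = 1.609344


def miles_per_gallon(miles, gallons):
    """The original feature: distance in miles divided by fuel in US gallons."""
    if gallons <= 0:
        raise ValueError("gallons must be positive")
    return miles / gallons


def liters_per_100km(km, liters):
    """The new feature: liters of fuel used per 100 kilometers driven."""
    if km <= 0:
        raise ValueError("km must be positive")
    return liters * 100 / km


def mpg_to_liters_per_100km(mpg):
    """Convert a US miles-per-gallon figure to liters per 100 km."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    return 100 * LITERS_PER_US_GALLON / (mpg * KM_PER_MILE)


def test_miles_per_gallon():
    # The old test stays enabled while the metric code is added.
    assert miles_per_gallon(300, 10) == 30


def test_liters_per_100km():
    assert liters_per_100km(500, 40) == 8


def test_conversion():
    # 30 mpg (US) is roughly 7.84 L/100 km.
    assert abs(mpg_to_liters_per_100km(30) - 7.84) < 0.01
```

Because the metric code lives in its own functions here, the miles test keeps passing the whole time and there is never a reason to disable it.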

So one other task I have taken on is identifying the tests that are in this state.  Fortunately, there are not many of them, but every so often they slip through the process and wind up being disabled for far longer than we anticipated.  Turning them back on is usually easy.  Every so often, though, an older test won't pass nowadays because so much code has changed while it was disabled.

What to do in those cases is a little trickier and I will cover that next.

Questions, comments, concerns and criticisms always welcome,

Thursday, July 6, 2017

Using the tool I wrote last week to start making changes

I finished my tool to look through a large set of our test code to classify our tests with respect to who owns them, when they run and other attributes like that.  My first use of this was to find "dead" tests - tests that never run, provide no validation or otherwise are left in the system for some reason.  I want to give a sense of scale for how big this type of challenge is.

After looking through just over 1000 tests, I identified 15 that appeared to be dead.  Closer examination of those tests took about half a day and determined that 8 of them are actually in use.  This revealed a hole in my tool - there was an attribute I forgot to check.
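For a sense of what a check like this looks like, here is a simplified Python sketch.  It is not the actual tool: the CaseInfo fields (owner, enabled, runs_last_year) and the test names are hypothetical attributes chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class CaseInfo:
    """Hypothetical metadata collected for one test."""
    name: str
    owner: str
    enabled: bool
    runs_last_year: int


def find_dead_candidates(cases):
    """Return names of tests that look dead: disabled, or never run recently.

    These are only candidates. Each one still needs a manual look, since an
    attribute this check does not know about can mean the test is in use.
    """
    return [c.name for c in cases if not c.enabled or c.runs_last_year == 0]


EXAMPLE_CASES = [
    CaseInfo("test_login", "team-a", True, 250),
    CaseInfo("test_vista_install", "team-b", False, 0),
    CaseInfo("test_old_export", "team-a", True, 0),
]

if __name__ == "__main__":
    print(find_dead_candidates(EXAMPLE_CASES))
```

A check like this can only flag tests based on the attributes it knows about, which is exactly how the 8 false positives above got through.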

One of the tests was actually valid and had simply been mis-tagged.  I reenabled that test and it is now running again and providing validation that nothing has broken.

The other 6 tests were a bit more challenging.  I had to look at each test then look at lab results to see if anyone was actually still running them, dig through each test to see what was the expected result and so on.  In most cases, I had to go to the person that wrote the test - in 2 instances, almost 10 years ago - to see if the tests could be removed.  It might seem trivial to track 6 files out of 1000+ but this will save us build time for every build and maintenance costs over the years as well as leaving a slightly cleaner test code base.

In 4 of the cases, the tests can be removed and I have removed them.  In the USA, this is a holiday week for us so I am waiting on some folks to get back in the office next week to follow up on the last 2 tests. 

These are all incremental steps toward squaring away our test code.

Questions, comments, criticisms and complaints always welcome,