Extraordinary Squares: May 2018

Friday, May 25, 2018

I had not heard the term "rolling log" before but now Tabpy has one

From the "you learn something new every day" category, I saw that we had a proposed enhancement to Tabpy to support a rolling log. I had not heard that term before but the concept is fairly simple.

Suppose your application creates a log whenever it gets started. That log is named "My Application Log.txt" and is saved when the application exits. Now you have a dilemma. Suppose I start the application, run it for a bit then exit. I have my log file. Then I start it again.

What do I do with the previous log file?

Some obvious options:

1. I could have the application delete it and start a new one.
2. I could just keep adding to the existing log.
3. I could create a new log with a different name.

Each one of these has pros and cons. Option 1 is simple and easy to understand, but you lose the history of what happened the last time the application ran. That information is often useful if you need to troubleshoot.

Options 2 and 3 both keep all that history, but the hard drive will eventually fill up.

Option 4 is the Rolling Log. It is like option 3, but it deletes all except the last few logs. I can set it to keep the 5 most recent, 2 most recent, 10 most recent, etc. Then I retain some amount of history but avoid filling up the hard drive over time.
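
To make the idea concrete, here is a minimal Python sketch of option 4. It is not the code from the Tabpy pull request. The function name `start_new_log`, the `app-NNNNNN.log` naming scheme and the `keep` parameter are all assumptions made up for this illustration.

```python
import tempfile
from pathlib import Path


def start_new_log(log_dir: Path, keep: int = 5, prefix: str = "app") -> Path:
    """Create a fresh log for this run, then delete all but the `keep` newest.

    Hypothetical helper for illustration only.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    log_dir.mkdir(parents=True, exist_ok=True)

    # Existing logs, oldest first. The zero-padded run number in the
    # file name makes alphabetical order match run order.
    existing = sorted(log_dir.glob(f"{prefix}-*.log"))
    if existing:
        next_run = int(existing[-1].stem.rsplit("-", 1)[1]) + 1
    else:
        next_run = 1

    # Option 3: every run gets a new log with a different name.
    new_log = log_dir / f"{prefix}-{next_run:06d}.log"
    new_log.write_text(f"run {next_run} started\n")

    # The "rolling" part: drop everything except the `keep` most recent.
    all_logs = existing + [new_log]
    for old in all_logs[:-keep]:
        old.unlink()
    return new_log


if __name__ == "__main__":
    # Simulate starting the application 8 times, keeping 3 logs.
    with tempfile.TemporaryDirectory() as tmp:
        for _ in range(8):
            start_new_log(Path(tmp), keep=3)
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

After eight simulated runs with `keep=3`, only the logs for runs 6, 7 and 8 remain. For real applications, Python's standard library already ships `logging.handlers.RotatingFileHandler`, which rolls over based on file size (`maxBytes`) and keeps `backupCount` old files, so a hand-rolled version like this one is mostly useful for understanding the concept.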

Check out the proposed change we want to take for Tabpy over at https://github.com/tableau/TabPy/. It is the most recent pull request as I write this, but that will change over time.

Questions, comments, concerns and criticisms always welcome,

John

Friday, May 18, 2018

How often we ship Tableau, a test perspective

If you look at Tableau closely, it's obvious - and we make the claim - that we ship a new version of Tableau every quarter. But from the test point of view, we ship much more often. Here's what it looks like from our point of view.

Imagine you own a textbook publishing company. Each quarter, you release a new textbook covering some new topic you have not covered before. An example might be Astronomy: The Mountains of Mars from Winter 2017 and Egyptian History out in Spring 2018.

At the same time you are ready to ship the Egyptian History book, though, the Mars explorer finds enough data to cause you to need to update one of the chapters of the Mars Mountain book. So for Spring 2018, you have 2 books you need to send out the door: the new Egypt book and an updated version of the Mars book.

Your proofreaders will need to focus most of their time on the new book but still must devote some amount of time to validating the text and layout of the updated chapter of the Mars book. Additionally, the new chapter might change the page count of the Mars book. If so, you might need to bind the book differently. If there are new photos, you may want to update the cover or back of the book. The table of contents might change, and the index will likely need to be updated.

A test case for the index might be to validate the previous contents are intact after the new chapter index is inserted. If the size of the index no longer fits on the current set of pages, you will need to rebind the book, shrink the index or otherwise resolve this dilemma. And the proofreaders, who might have naively thought they needed to verify only the new chapter contents, potentially have to validate the entire Mars book.

Testing is in the same position. While our focus is typically on the new versions of Tableau we release every quarter, we also continue to support the last few years' worth of releases. That means in addition to testing the "major" release of Tableau, we have to test the updates we consistently ship as well. So from my point of view, we always have multiple releases to validate. And that means that we ship far more often than once per quarter.

Questions, comments, concerns and criticisms always welcome,

John

Friday, May 11, 2018

Constantly learning

One of the requirements that comes along with working in the tech industry (or any industry, really) is to adopt a notion of constant learning. A new computer science major today will know more than those who graduated ten years ago, and will know less than students hired ten years from now.

In order to stay current with the industry, I have found several opportunities to keep learning. One of my favorites is online classes (MOOCs, or massive open online courses). The University of Washington has a Data Science program certificate class starting soon and it looks like there are enough of us around here to develop a study group. For many folks, having that level of interaction is a necessity for getting the most out of the class. The environment it creates - a useful forum for discussing the techniques being taught - really helps cement the lesson at hand.

I'm not sure what the emphasis of this class will be, though. I hope it is more along the lines of implementing a few ML routines as opposed to using "off the shelf" solutions (which are never 100% complete - you always need to write code at some point) but it is definitely on my radar. Let me know if you sign up for any classes in this series (the audit track is free) and maybe we can "attend" together!

Questions, comments, concerns and criticisms always welcome,

John

Monday, May 7, 2018

Lots of Tabpy activity

Looking back over the last few weeks I have had a ton of meetings. One of the products my team owns is Tabpy and I have spent a good amount of time over there.

Specifically, I have been reviewing code coming in from the community (thanks gang!) and even had a call with one of the developers. I also checked in some documentation changes over there, updating some error messages to make setup errors a bit clearer.

Also on the Tabpy front, we have had some team-wide customer calls about how companies are using Python integration and how we can help them meet their goals. Behind the scenes, we are taking notes, designing stories and entering a set of items in our backlog to track this work. And yes, we are working on these items already, but I (obviously) can't share specifics. That is simply a frustrating aspect of blogging about testing.

Questions, comments, concerns and criticisms always welcome,

John