What does good test automation training look like?

As I’m moving away from by-the-hour work and more towards a hybrid of consulting, training, writing and speaking, one of the things I’m working on is slowly building up a workshop and training portfolio around topics I think I’m qualified to talk about. I have created a couple of workshops already, but so far, they are centered around a specific tool that I am interested in and enthusiastic about. This, however, has the downside that they’re probably targeted towards a relatively small group of interested people (not least because these tools are only available for a specific programming language, i.e., Java).

To extend my options with regard to delivering training and workshops, I am currently looking at developing workshops and training material that cover higher-level, more generic subjects, while still offering practical insights and hands-on exercises. There are a lot of different approaches and possible routes that can be taken to achieve this, especially since there is no specific certification trajectory around test automation (nor do I think there should be, but that’s a wholly different discussion that I’ll probably cover in another blog post in time). So far, I haven’t figured out the ideal contents and delivery format, but ideas have been taking shape in my head recently.

Here are some subjects I think a decent test automation training trajectory should cover:

Test automation 101: the basics
Always a good approach: start with the basics. What is test automation? What is it not (here’s a quote I love from Jim Hazen)? What role does automation play in current development and testing processes and teams? Why is it attracting the interest it does? To what levels and what areas can you apply test automation and what is that test automation pyramid thing you keep hearing about?

Test automation implementation
So, now that you know what test automation (sorta kinda) is, how do you apply it to your software development process? How are you going to involve stakeholders? What information or knowledge do you want to derive from test automation? How does it fit into trends such as Agile software development, BDD, Continuous Integration and Continuous Delivery?

Test automation, the good the bad and the ugly
It’s time to talk about patterns. Not about best practices, though; I don’t like that term. But there are definitely lessons to be learned from the past on what works and what doesn’t. Think data-driven. Think maintainability. Think code review. Think (or rather, forget) code-free test automation. Think reporting. Think some more.

Beyond functional test automation: what else could automation be used for?
Most of what we’ve seen so far covers functional test automation: automated checks that determine whether or not some part of the application under test functions as specified or desired (or both, if you’re lucky). However, there’s a lot more to testing than mere functional checks. Of course there’s performance testing, security testing, usability testing, accessibility testing, all kinds of testing where smart application of tools might help. But there’s more: how about automated parsing of logs generated during an exploratory testing session? Automated test data creation / generation / randomization? Automated report creation? All these are applications of test automation, or better put, automation in testing (thanks, Richard!), and all these are worth learning about.

Note that nowhere in the topics above am I focusing on specific tools. As far as I’m concerned, getting comfortable with one or more tools is one of the very last steps in becoming a good test automation engineer or consultant. I am of the opinion that it’s much more important to answer the ‘why?’ and the ‘what?’ of test automation before focusing on the ‘how?’. Unfortunately, most training offerings I’m seeing focus solely on a specific tool. I myself am quite guilty of doing the same, as I said in the first paragraph of this post.

One thing I’m still struggling with is how to make the attendees do the work. It’s quite easy to present the above subjects as a (series of) lecture(s), but there’s no better way to learn than by doing. Also, I think hosting workshops is much more fun than delivering talks, and there’s no ‘workshop’ without actual ‘work’. But it has to be meaningful, relevant to the subject covered, and if possible, fun.

So, now that I’ve shared my thoughts on what ingredients would make up a decent test automation education, I’d love to hear what you think. What am I missing? (I’m pretty sure the list above isn’t complete.) Do you think there’s an audience for training as mentioned above? If not, why not? What would you do (or better, what are you doing) differently? This is a topic that’s very dear to me, so I’d love to hear your thoughts on the subject. Your input is, as always, much appreciated.

In the meantime, I’ve started working on a first draft of training sessions and workshops that cover the topics above, and I’m actively looking for opportunities to deliver these, be it at a conference or somewhere in-house. I’ve got a couple of interesting opportunities lined up already, which is why I’m looking forward to 2017 with great anticipation!

Managing test data in end-to-end test automation

One of the biggest challenges I’m facing in projects I’m contributing to is the proper handling of test data in automated tests, and especially in end-to-end test automation. For unit and integration testing, it is often a good idea to resort to mocking or stubbing the data layer to remain in control over the test data that is used in and required for the tests to be executed. When doing end-to-end tests, however, keeping all required test data in check in an automated manner is no easy task. I say ‘in an automated manner’ here, because once you start to rely on manual intervention for the preparation or cleaning up of test data, then you’re moving away from the ability to test on demand, which is generally not a good thing. If you want your tests to truly run on demand, having to rely on someone (or a third party process) to manage the test data can be a serious bottleneck. Even more so with distributed applications, where teams often do not have enough control over the dependencies they require in order to be able to do end-to-end (or even integration) testing.

In this post, I’d like to consider a number of possible strategies for dealing with test data in end-to-end tests. I’ll take a look at their benefits and their drawbacks to see if there’s one strategy that trumps all others (spoiler alert: probably not…).

Creating test data during test execution
One approach is to start every test, suite or run with a set-up phase where the test data required for that specific test, suite or run is created. This can be done by any technical means available: be it through direct INSERT statements in a database, a series of API calls that create new users, orders or any other type of test data object, or (if there really is no alternative) through the user interface. The main benefit of this approach is that there is a strong coupling between the created test data and the actual test, meaning that the right test data is always available. There are some rather big drawbacks as well, though:

  • Setting up test data takes additional time, especially when done through the user interface.
  • Setting up test data requires additional code, which increases the maintenance burden of your automated tests.
  • If an error occurs during the test data setup phase of your test, your actual test result will be unpredictable and therefore cannot be trusted. That is, if your test isn’t simply aborted before the actual test steps are executed at all…
  • This approach potentially requires tests to depend on one another in terms of the sequence in which they’re executed, which is a definite anti-pattern of test automation.
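As an illustration of this set-up/tear-down approach, here’s a minimal sketch in Python; an in-memory SQLite database stands in for the system’s real data layer, and the `customers` table and its columns are made up for the example. In a real end-to-end test, the same steps would go through the application’s actual database or a data-creation API:

```python
import sqlite3


def create_test_customer(conn, name, loyalty_years):
    """Set-up step: insert exactly the record the upcoming test needs,
    so the test never depends on data that happens to be lying around."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, loyalty_years REAL)"
    )
    cur = conn.execute(
        "INSERT INTO customers (name, loyalty_years) VALUES (?, ?)",
        (name, loyalty_years),
    )
    conn.commit()
    return cur.lastrowid


def delete_test_customer(conn, customer_id):
    """Tear-down step: remove the record again, keeping tests
    independent of each other's leftovers."""
    conn.execute("DELETE FROM customers WHERE id = ?", (customer_id,))
    conn.commit()
```

Note that the tear-down step matters just as much as the set-up: cleaning up what a test created is what keeps tests from depending on the order in which they run.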

I’ve used this approach several times in my projects, with mixed results. Sometimes it works just fine, sometimes a little less so. The latter is most often the case when the data model is really complex and there’s no other way than mimicking user interaction by means of tools such as Selenium to get the data prepared.

Query test data prior to test execution
The second approach to dealing with test data around automated tests is to query the data before the actual test runs. This can be done either directly on the database, or possibly through an API (or even a user interface) that allows you to retrieve customers, articles or whatever type of data object you need for your test. The main benefit of this approach is that you’re not losing time creating test data when all you really care about is the test results, and that this approach results in less test automation code to maintain, especially when you can query the database directly. Here too, there are a couple of drawbacks that make this approach less than ideal:

  • There’s no guarantee that the exact data you require for a test case (especially with edge cases) is actually present in the database. For example, how many customers that have lived in Nowhereville for 17.5 years, together with their wife and their blue parrot, are actually in your database?
  • Sometimes getting the query right so that you’re 100% sure that you get the right test data is a daunting task, requiring very specific knowledge of the system. This might make this approach less than ideal for some teams.
  • Also, even when you get the query exactly right, does that really guarantee that you get results that you’re 100% sure will be a perfect fit for your test case?
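A pre-test query along these lines might look as follows (again a Python/SQLite sketch with a made-up `customers` table); the point is to look for data that is already there, and to report back when nothing suitable exists so the test can be skipped rather than fail on a data problem:

```python
import sqlite3


def find_suitable_customer(conn, min_loyalty_years):
    """Pre-test query: look for an existing record that satisfies the
    test's preconditions. Returns None when no suitable data is present,
    which lets the caller skip the test instead of failing it."""
    return conn.execute(
        "SELECT id, name FROM customers WHERE loyalty_years >= ? LIMIT 1",
        (min_loyalty_years,),
    ).fetchone()
```

Of course, this only sidesteps the first drawback above; a query like this can still hand you a record that looks right on paper but isn’t a perfect fit for the test case.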

Reset the test data state before or after a test run
I think this is potentially the best approach when having to deal with test data in end-to-end tests: either restoring the database to a known state from a backup before a run, or cleaning up the test database afterwards. This guarantees that test data-wise, you’re always in the exact same state before / after a test run, which massively improves the predictability and repeatability of your tests. The main drawback is that often, this is not an option, either because nobody knows enough about the data model to allow it, or because access to the database is restricted for reasons good or less than good. Also, when you’re dealing with large databases, doing a database reset or rollback might be a matter of hours, which slows down your feedback loop significantly, rendering it next to useless when your tests are part of a CD pipeline.
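In its simplest form, a restore step can be sketched as below; this assumes a file-based database (such as SQLite) small enough to copy quickly, and the paths are hypothetical. For a real database server, the same idea would be implemented with the vendor’s own backup/restore or snapshot tooling:

```python
import shutil
from pathlib import Path


def restore_test_database(backup_path, live_path):
    """Restore the test database to a known-good state by copying a
    backup file over the live one. Only feasible when the copy finishes
    quickly; for large databases this becomes exactly the slow
    reset/rollback described above."""
    backup = Path(backup_path)
    if not backup.exists():
        raise FileNotFoundError(f"no backup found at {backup}")
    shutil.copyfile(backup, live_path)
```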

Virtualizing the data layer
Nowadays, there are several solutions on the market that allow you to effectively virtualize your data layer for testing purposes. A prime example of such a solution is Delphix, but there are several other tools on the market as well. I haven’t experimented with any of these for long enough to actually have formed an educated opinion, but one thing I don’t really like about this approach is that virtualizing the data layer (however efficient it may be) voids the concept of executing a true end-to-end test, since there’s no actual data layer involved anymore. Then again, for other types of testing, it may actually be a very good concept, just like service virtualization is for simulating the behavior of critical yet hard-to-access dependencies in test environments.

So, what’s your take on this?
In short, I haven’t found the ideal solution yet. I’d love to read about the approaches other people and teams are taking when it comes to managing test data in end-to-end automated tests, so feel free to send me an email, or even better, leave a comment to this post. Looking forward to seeing your replies!

On writing and publishing my first ebook

Sometimes, some of the most interesting things in life happen when you least expect them. Just over half a year ago now (I looked it up, it was on May 11th of this year, to be exact) I received an email from Brian at O’Reilly Media, asking if I was interested in writing a short book on service virtualization. I didn’t have to think long about an answer and replied ‘yes’ the same day. After almost six months, lots of writing, reviewing and editing, many, many emails and a couple of video calls I am very proud to present to you my first ever ebook:

Service virtualization ebook

In this post, I’d like to tell you a little more about the book and about the process of writing and editing such a piece. Even though the book is relatively short (HPE, which is sponsoring the book, set an upper limit of 25 pages of actual content), we went through much the same process as a full-length book would require, from proposal to production and everything in between.

The book
So, first, let’s take a look at the most important part: the end result. What we were aiming for was to give an overview of the current state of the service virtualization field and how this technique plays a role (or at least can play a role) in current and future IT trends. I won’t summarize the whole book here (it’s short enough so you can read it in about an hour) but if you want to know how service virtualization and Continuous Delivery can work together, or how you can leverage service virtualization when testing Internet of Things-applications, you’re cordially invited to read this book. It’s available free of charge from the HPE website, so why not take a look?

The writing process
After that initial email I received back in May, a lot has been taking place. Writing a piece like this starts with writing a proposal summarizing the prospective book outline, the reasons why this book should be written, who the target audience is, and why the person writing the book proposal (i.e., me) thinks he or she is the right person for the job. This proposal is used to convince the sponsor (as I said, HPE, in this case) that they’re investing their money and effort wisely.

When the proposal is accepted, the actual writing starts. This is what takes up most of the time, but I think that goes without saying. We set two deadlines from the start: one by which a draft version of around 50% of the book should be delivered (to gauge whether the writer is on the right track and to keep things moving) and of course a deadline for the first full draft.

As anybody who has ever written a book knows, once the first full draft is delivered, you’re not there yet. Not even close! An extensive reviewing and editing process then took place to remove any spelling and grammatical errors, to improve the flow of the book and to make sure that all contents matched the expectations of HPE, of O’Reilly and, last but not least, of myself. This took a little longer than I initially thought it would, but then again, the end result is so much better than I could have produced on my own, so it has been very well worth the effort.

Would I do it again? You bet I would! I have thoroughly enjoyed the process of proposing, writing, reviewing and editing this book, even though at times it has been hard to review the same piece of text for the umpteenth time. Also, the guys and girls from O’Reilly, who have worked just as hard as I have myself (if not harder) to get this book out there, have been nothing less than fantastic to work with. So, Brian, Virginia, thanks so much, it was awesome working with you and I look forward to doing this again in some way, shape or form in the future. I also learned quite a few interesting things about the English language and editing standards. Since I’m a guy who’s always looking to improve his English skills, this has been quite invaluable too.

So if you’re ever in the position where you’re asked to write a book, or if you’ve ever thought about writing one yourself, I can wholeheartedly recommend going for it. Not only will you have something that you can be proud of once you’re finished, but you’ll learn so many things in the process.

Oh, and again, if you’re interested in a quick read on the current state of service virtualization, you can download the book for free from here. I’d love to hear your thoughts on it.