Why I think unit testing is the basis of any solid automation strategy

In a recent blog post I talked about why and how I still use the test automation pyramid as a model to talk about different levels of test automation and how to combine them into an automation strategy that fits your needs. In this blog post I’d like to talk about the basis of the pyramid a little more: unit tests and unit testing. There’s a reason (or better, there are a number of reasons) why unit testing forms the basis of any solid automation strategy, and why it’s depicted as the broadest layer in the pyramid.

Unit tests are fast
Even though end-to-end testing using tools like Selenium is the first thing a lot of people think about when they hear the term ‘test automation’, Selenium tests are actually the hardest and most time-intensive to write, run and maintain. Unit tests, on the other hand, can be written quickly, both in the absolute time it takes to write unit test code and relative to the progress of the software development process. A very good example of the latter is the practice of Test Driven Development (TDD), where tests are written before the actual production code is created.
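To make this a little more concrete, here’s a minimal sketch of what TDD looks like in practice, using JUnit 5 and a hypothetical PriceCalculator (both the class and its discount rule are made up for the example):

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical TDD sketch (JUnit 5): the test below is written first and
// fails, and only then is PriceCalculator written to make it pass.
class PriceCalculatorTest {

    @Test
    void orderOverHundredGetsTenPercentDiscount() {
        PriceCalculator calculator = new PriceCalculator();

        // An order of 120.00 with a 10% discount should come out at 108.00.
        assertEquals(108.0, calculator.totalFor(120.0), 0.001);
    }
}

// The production code, written after the test, with just enough logic
// to make the test pass.
class PriceCalculator {
    double totalFor(double orderValue) {
        return orderValue > 100.0 ? orderValue * 0.9 : orderValue;
    }
}
```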

Unit tests are also fast to run. Their run time is typically in the milliseconds range, whereas integration and end-to-end tests take seconds or even minutes, depending on your tests and their scope. This means that a solid set of unit tests will give you feedback on specific aspects of your application quality much faster than those other types of tests. I stressed ‘specific aspects’, because while unit tests can cover a lot of ground in relatively little time, there’s only so much they can do. The same goes for automation as a whole.

Unit tests require (and enforce) code testability
Any developer can tell you that the better structured code is, the easier it is to isolate specific classes and methods and write unit tests for them, mocking away all dependencies that method or class requires. This is referred to as highly testable code. I’ve worked on projects where people were stuck with poorly testable code and have seen the consequences. I once facilitated a two-day test automation hackathon where the end goal was to write a single unit test and integrate it into the Continuous Integration pipeline. Writing the test took ten minutes. Untangling the existing code so that the unit test could be written? Two days MINUS ten minutes.
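As a sketch of what testable code looks like, consider the hypothetical example below: because the class under test receives its dependency through its constructor, that dependency can be mocked away (here with Mockito) and the class tested in complete isolation. All names are made up for illustration:

```java
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.jupiter.api.Test;

// Hypothetical example: OrderService receives its PaymentGateway dependency
// through the constructor instead of creating it internally, so the gateway
// can be replaced with a mock and the class tested in isolation.
class OrderServiceTest {

    interface PaymentGateway {
        boolean charge(String accountId, double amount);
    }

    static class OrderService {
        private final PaymentGateway gateway;

        OrderService(PaymentGateway gateway) {
            this.gateway = gateway;
        }

        boolean placeOrder(String accountId, double amount) {
            return gateway.charge(accountId, amount);
        }
    }

    @Test
    void orderIsPlacedWhenPaymentSucceeds() {
        // No real payment system involved: the mock stands in for it.
        PaymentGateway gateway = mock(PaymentGateway.class);
        when(gateway.charge("account-1", 50.0)).thenReturn(true);

        OrderService service = new OrderService(gateway);

        assertTrue(service.placeOrder("account-1", 50.0));
    }
}
```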

This is where practices like TDD can help. When your tests are in place before the production code that makes them pass is written, the risk of that production code becoming untestable spaghetti code is far lower. And having testable code is a massive help with the next reason why unit testing should be the basis of your automation efforts.

Unit tests prevent outside-in test automation (hopefully)
If your code is testable, it means that it’s far easier to write unit tests for it. Which in turn means that the likelihood that unit tests are actually written increases as well. And where unit tests are written consistently and visibly, the risk that everything and its mother is tested through the user interface (a phenomenon I’ve seen referred to as ‘outside-in test automation’) is far lower. Just writing lots of unit tests is not enough, though; their scope, intent and coverage should be clear to the team as well (so, testers, get involved!).

Unit tests are a safety net for code refactoring
Let’s face it: your production code isn’t going to live unchanged forever (although I’ve heard about lines of COBOL that are busy defying this). Changes to the application, renewed libraries or new insights: all of these will in time be reason to refactor your existing code to improve effectiveness, readability or maintainability, or just to keep things running. This is where a decent set of unit tests helps a lot, since they can be used as a safety net that gives you feedback about the consequences of your refactoring efforts on overall application functionality. And even more importantly, they do this quickly. Developers are human, and will move on to different tasks if they need to wait hours for feedback. With unit tests, that feedback arrives in seconds, keeping them (and you) focused and on the right track.
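As a minimal sketch of the safety net idea (the class and its formatting rule are hypothetical): the test below pins down the current behavior, so the method body can be rewritten freely and any behavioral change is flagged within seconds:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical safety-net example: this test pins down the current behavior
// of formatName(). The method body can now be refactored (for readability,
// performance, or a new library); as long as this test stays green, the
// observable behavior is unchanged.
class NameFormatterTest {

    static String formatName(String first, String last) {
        // Current implementation, about to be refactored.
        return last.toUpperCase() + ", " + first;
    }

    @Test
    void formatsLastNameFirstInUpperCase() {
        assertEquals("DOE, Jane", formatName("Jane", "Doe"));
    }
}
```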

In the end, unit tests cannot, will not and need not replace integration and end-to-end tests, of course. There’s a reason all of them are featured in the test automation pyramid. But when you’re trying to create or improve your test automation strategy, I’d advise you to start at the base and get your unit testing in place.

By the way, for those of you reading this on the publication date, I’d like to mention that I’ll be co-hosting a webinar with the folks at Testim, where I’ll be talking about the importance of unit testing, as well as much more with regard to test automation strategy. I hope to see you there! If you’re reading this at a later date, I’ll add a link to the recording as soon as it’s available.

Why and how I still use the test automation pyramid

Last week, while delivering part one of a two-evening API testing course, I found myself explaining the benefits of writing automated tests at the API level using the test automation pyramid. That in itself probably isn’t too noteworthy, but what immediately struck me as odd is that I found myself apologizing to the participants for using a model that has received as much criticism as the pyramid has.

Odd, because

  1. Half of the participants hadn’t even heard of the test automation pyramid before
  2. The pyramid, as a model, is still a very useful way for me to explain a number of concepts and good practices related to test automation.

#1 is a problem that should be tackled by better education around software testing and test automation, I think, but that’s not what I wanted to talk about in this blog post. No, what I would like to show is that, at least to me, the test automation pyramid is still a valuable model when explaining and teaching test automation, as long as it’s used in the right context.

The version of the test automation pyramid I tend to use in my talks

The basis of what makes the pyramid a useful concept to me is the following distinction:

It is a model, not a guideline.

A guideline is something that’s (claiming to be) correct under certain circumstances. A model, as the statistician George Box said, is always wrong, but some models are useful. To me, this applies perfectly to the test automation pyramid:

There’s more to automation than meets the UI
The test automation pyramid, as a model, helps me explain to less experienced engineers that there’s more to test automation than end-to-end tests (those often driven through the user interface). I often explain this using examples from real-life projects, where we chose to do a couple of end-to-end tests to verify that customers could complete a specific sequence of actions, combined with a more extensive set of API tests to verify business logic at a lower level, and why this was a much more effective approach than testing everything through the UI.
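To give you an idea of what such a lower-level business logic check might look like, here’s a sketch using REST Assured; the /discounts endpoint, its parameters and its response format are assumptions made up for the example:

```java
import static io.restassured.RestAssured.given;
import static org.hamcrest.Matchers.equalTo;

import org.junit.jupiter.api.Test;

// Hypothetical API-level test using REST Assured: it verifies a piece of
// business logic (discount calculation) directly at the API, without
// driving a browser. Endpoint and response shape are made up.
class DiscountApiTest {

    @Test
    void loyalCustomerGetsTenPercentDiscount() {
        given()
            .baseUri("https://api.example.com")
            .queryParam("customerType", "loyal")
            .queryParam("orderValue", 120.00)
        .when()
            .get("/discounts")
        .then()
            .statusCode(200)
            .body("discountPercentage", equalTo(10));
    }
}
```

A test like this runs in a fraction of the time a UI-driven equivalent would take, which is exactly why we pushed most of the business logic checks down to this level.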

Unit testing is the foundation
The pyramid, as a model, perfectly supports my belief that a solid unit testing strategy is the basis for any successful, significantly-sized test automation effort. Anything that can be covered in unit tests should not have to be covered again in higher level tests, i.e., at the integration/API or even at the end-to-end level.

E2E and UI tests are two different concepts
The pyramid, as a model, helps me explain the difference between end-to-end tests, where the application as a whole is exercised from top (often the UI) to bottom (often a database), and user interface tests. The latter may be end-to-end tests, but unbeknownst to surprisingly many people, you can write unit tests for your user interface just as well. There’s a reason the top layer of the pyramid that I use (together with many others) says ‘E2E’, not ‘UI’…
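As a sketch of what a unit test for user interface logic might look like (the view model and its badge rule are made up for the example): no browser, no rendering, just a plain unit test of the logic that decides what the UI shows:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical example of a unit test for UI logic: the view model that
// decides what a shopping cart badge displays is a plain class, so its
// behavior can be verified without starting a browser.
class CartBadgeViewModelTest {

    static class CartBadgeViewModel {
        String badgeTextFor(int itemCount) {
            if (itemCount <= 0) {
                return "";
            }
            return itemCount > 9 ? "9+" : String.valueOf(itemCount);
        }
    }

    @Test
    void badgeIsCappedAtNinePlus() {
        CartBadgeViewModel viewModel = new CartBadgeViewModel();

        assertEquals("9+", viewModel.badgeTextFor(12));
    }
}
```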

Don’t try to enforce ratios between test automation scope levels
The pyramid, when used as a guideline, can lead to less than optimal test automation decisions. This mainly applies to the ratio between the number of tests in each of the E2E, integration and unit categories. Even though well-thought-out automation suites will naturally steer towards a ratio of more unit tests than integration tests and more integration tests than E2E tests, they should never be forced to do so. I’ve even seen some people, who unfortunately were the ones in charge, make decisions on what and how to automate based on ratios. Some even went as far as saying ‘X% of our automated tests HAVE TO be unit tests’. Personally, I’d rather go for the ratio that delivers in terms of effectiveness and the time needed to write and maintain the tests.

Test automation is only part of the testing story
‘My’ version of the test automation pyramid (or at least the version I use in my presentations) prominently features what I call exploratory testing. This helps remind me to tell those who are listening that there’s more to testing than automation. I call this part of the testing story ‘exploratory testing’, because this is the part where humans explore and evaluate the application under test to inform themselves and others about aspects of its quality. This is what’s often referred to as ‘manual testing’, but I don’t like that term.

As you can see, to me, the test automation pyramid is still a very valuable model (and still a useless guideline) when it comes to explaining my thoughts on automation, despite all the criticism it has received over the years. I hope I never find myself apologizing for using it again.

On including automation in your Definition of Done

Working with different teams in different organizations means that I’m regularly faced with the question of whether and how to include automation in the Definition of Done (DoD) that is used in Agile software development. I’m not an Agilist myself per se (I’ve seen too many teams get lost in overly long discussions on story points and sticky notes), but I DO like to help people and teams struggling with the place of automation in their sprints. As for the ‘whether’ question: yes, I definitely think that automation should be included in any DoD. The answer to the ‘how’ of including it, a question that could be rephrased as the ‘what’ to include, is a little more nuanced.

For starters, I’m not too keen on rigid DoD statements like

  • All scenarios that are executed during testing and that can be automated, should be automated
  • All code should be under 100% unit test coverage
  • All automated tests should pass at least three consecutive times, except on Mondays, when they should pass four times.

OK, I haven’t actually seen that last one, but you get my point. Stories change from sprint to sprint. The impact on production code, be it new code that needs to be written, existing code that needs to be updated or refactored, or old code that needs to be removed (my personal favorite), will change from story to story and from sprint to sprint. Why, then, keep statements regarding your automated tests as rigid as the above examples? That doesn’t make sense to me.

I’d rather see something like:

Creation of automated tests is considered and discussed for every story and their overarching epic and applied where deemed valuable. Existing automated tests are updated where necessary, and removed if redundant.

You might be thinking ‘but this cannot be measured, how do we know we’re doing it right?’. That’s a very good question, and one that I do not have a definitive answer for myself, at least not yet. But I am of the opinion that knowing where to apply automation, and more importantly, where to refrain from automation, is more of an art than a science. I am open to suggestions for metrics and alternative opinions, of course, so if you’ve got something to say, please do.

Having said that, one metric you might consider when deciding whether or not to automate a given test or set of tests is whether your technical debt increases or decreases. The following consideration might be a bit rough, but bear with me; I’m sort of thinking out loud here. On the one hand, given that a test is valuable, having it automated will shorten the feedback loop and decrease technical debt. On the other hand, automating a test takes time in itself and increases the size of the code base to be maintained. Choosing which tests to automate is about finding the right balance with regard to technical debt. And since the optimum will likely differ from one user story to the next, I don’t think it makes much sense to put overly generalized statements about what should be automated in a DoD. Instead, for every story, ask yourself:

Are we decreasing or increasing our technical debt when we automate tests for this story? What’s the optimum way of automating tests for this story?

The outcome might be to create a lot of automated tests, but it might also be to not automate anything at all. Again, all depending on the story and its contents.

Another take on the question of whether or not to include automated test creation in your DoD might be to distinguish between the different scope levels of tests:

  • Creating unit tests for the code that implements your user story will often be a good idea. They’re relatively cheap to write and they run fast, thereby giving you fast feedback on the quality of your code. More importantly, unit tests act as the primary safety net for future development and refactoring efforts. And I don’t know about you, but when I undertake something new, I’d like to have a safety net, just in case. Much like in a circus. I’m deliberately refraining from stating that both circuses and Agile teams also tend to feature a not insignificant number of clowns, so forget I said that.
  • You’ll probably also want to automate a significant portion of your integration tests. These tests, for example executed at the API level, can be harder to perform manually and are relatively cheap to automate with the right tools. They’re also my personal favorite type of automated tests, because they sit at the optimum point between scope and feedback loop length. It might be harder to write integration tests when the component you’re integrating with is outside of your team’s control, or does not yet exist. In that case, a simulation might need to be created (see the sketch after this list), which requires additional effort that might not be perceived as directly contributing to the sprint. This should be taken into account when it comes to adding automated integration tests to your DoD.
  • Finally, there are the end-to-end tests. In my opinion, adding the creation of this type of test to your DoD should be considered very carefully. They take a lot of time to automate (even with an existing foundation), they often use the part of the application that is most likely to change in upcoming sprints (the UI), and they contribute the least to shortening the feedback loop.
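To illustrate the simulation effort mentioned in the list above, here’s a minimal sketch using WireMock to stand in for a component outside the team’s control; the /exchange-rates endpoint and its response are hypothetical:

```java
import static com.github.tomakehurst.wiremock.client.WireMock.aResponse;
import static com.github.tomakehurst.wiremock.client.WireMock.get;
import static com.github.tomakehurst.wiremock.client.WireMock.urlEqualTo;

import com.github.tomakehurst.wiremock.WireMockServer;

// Minimal WireMock sketch: a simulation of a dependency the team does not
// control, so integration tests can run before (or without) the real
// component. The endpoint and response are made up for the example.
public class ExchangeRateSimulation {

    public static void main(String[] args) {
        WireMockServer server = new WireMockServer(8089);
        server.start();

        // Return a canned response in place of the real (or not yet
        // existing) exchange rate service.
        server.stubFor(get(urlEqualTo("/exchange-rates?currency=EUR"))
            .willReturn(aResponse()
                .withStatus(200)
                .withHeader("Content-Type", "application/json")
                .withBody("{\"currency\": \"EUR\", \"rate\": 1.08}")));

        // Integration tests can now point at http://localhost:8089
        // instead of the real service; call server.stop() when done.
    }
}
```

Creating and maintaining a stub like this is real work, which is exactly why I think this effort should be weighed explicitly when adding integration tests to your DoD.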

The ratio between tests that can be automated and tests for which it makes sense to automate them in the sprint can be depicted as follows. Familiar picture?

Should you include automated tests in your Definition of Done?

Please note that like the original pyramid, this is a model, not a guideline. Feel free to apply it, alter it or forget it.

Jumping back to the ‘whether’ of including automation in your DoD, the answer is still a ‘yes’. As can be concluded from what I’ve talked about here, it’s more of a ‘yes, automation should have been considered and applied where it provides direct value to the team for the sprint or the upcoming couple of sprints’ rather than ‘yes, all possible scenarios that we’ve executed and that can be automated should have been automated in the sprint’. I’d love to hear how other teams have made automation a part of their DoD, so feel free to leave a comment.

And for those of you who’d like to see someone else’s take on this question, I highly recommend watching this talk by Angie Jones from the 2017 Quality Jam conference: