Why and how I still use the test automation pyramid

Last week, while delivering part one of a two-evening API testing course, I found myself using the test automation pyramid to explain the benefits of writing automated tests at the API level. That in itself probably isn’t too noteworthy, but what immediately struck me as odd is that I found myself apologizing to the participants for using a model that has received as much criticism as the pyramid has.

Odd, because

  1. Half of the participants hadn’t even heard of the test automation pyramid before
  2. The pyramid, as a model, is still a very useful way for me to explain a number of concepts and good practices related to test automation.

#1 is a problem that should be tackled by better education around software testing and test automation, I think, but that’s not what I wanted to talk about in this blog post. No, what I would like to show is that, at least to me, the test automation pyramid is still a valuable model when explaining and teaching test automation, as long as it’s used in the right context.

The version of the test automation pyramid I tend to use in my talks

The basis of what makes the pyramid a useful concept to me is the following distinction:

It is a model, not a guideline.

A guideline is something that’s (claiming to be) correct, under certain circumstances. A model, as the statistician George Box said, is always wrong, but some models are useful. To me, this applies perfectly to the test automation pyramid:

There’s more to automation than meets the UI
The test automation pyramid, as a model, helps me explain to less experienced engineers that there’s more to test automation than end-to-end tests (those often driven through the user interface). I explain this often using examples from real life projects, where we chose to do a couple of end-to-end tests to verify that customers could complete a specific sequence of actions, combined with a more extensive set of API tests to verify business logic at a lower level, and why this was a much more effective approach than testing everything through the UI.

Unit testing is the foundation
The pyramid, as a model, perfectly supports my belief that a solid unit testing strategy is the basis for any successful, significantly-sized test automation effort. Anything that can be covered in unit tests should not have to be covered again in higher level tests, i.e., at the integration/API or even at the end-to-end level.

E2E and UI tests are two different concepts
The pyramid, as a model, helps me explain the difference between end-to-end tests, where the application as a whole is exercised from top (often the UI) to bottom (often a database), and user interface tests. The latter may be end-to-end tests, but, unbeknownst to surprisingly many people, you can write unit tests for your user interface just as well. There’s a reason the top layer of the pyramid that I use (together with many others) says ‘E2E’, not ‘UI’…
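To make the idea of a unit test for the user interface a bit more concrete, here is a minimal sketch in Python. It assumes the logic behind a sign-up form lives in a small view model that can be exercised without a browser. The `SignupForm` class, its fields and the rule for enabling the button are all hypothetical and only serve as an illustration.

```python
from dataclasses import dataclass


@dataclass
class SignupForm:
    """Hypothetical view model holding the state behind a sign-up form."""

    email: str = ""
    accepted_terms: bool = False

    def submit_enabled(self) -> bool:
        # Assumed rule: the submit button is only enabled once an
        # email-like value is present and the terms have been accepted.
        return "@" in self.email and self.accepted_terms


def test_submit_disabled_until_terms_accepted():
    form = SignupForm(email="user@example.com")
    assert not form.submit_enabled()
    form.accepted_terms = True
    assert form.submit_enabled()


def test_submit_disabled_without_email():
    assert not SignupForm(accepted_terms=True).submit_enabled()
```

Tests like these check UI behaviour, but they sit at the unit level of the pyramid: no browser, no backend, no database.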

Don’t try to enforce ratios between test automation scope levels
The pyramid, when used as a guideline, can lead to less than optimal test automation decisions. This mainly applies to the ratio between the number of tests in each of the E2E, integration and unit categories. Even though well-thought-out automation suites will naturally tend towards more unit tests than integration tests and more integration tests than E2E tests, a suite should never be forced into that shape. I’ve even seen some people, who unfortunately were the ones in charge, make decisions on what and how to automate based on ratios. Some even went as far as saying ‘X % of our automated tests HAVE TO be unit tests’. Personally, I’d rather go for the ratio that delivers in terms of effectiveness and time needed to write and maintain the tests instead.

Test automation is only part of the testing story
‘My’ version of the test automation pyramid (or at least the version I use in my presentations) prominently features exploratory testing. This helps remind me to tell those who are listening that there’s more to testing than automation. I call this part of the testing story ‘exploratory testing’ because this is the part where humans explore and evaluate the application under test to inform themselves and others about aspects of its quality. This is what’s often referred to as ‘manual testing’, but I don’t like that term.

As you can see, to me, the test automation pyramid is still a very valuable model (and still a useless guideline) when it comes to explaining my thoughts on automation, despite all the criticism it has received over the years. I hope I never find myself apologizing for using it again in the future.

16 thoughts on “Why and how I still use the test automation pyramid”

  1. Hi Bas,

    Thanks for this informative article. Can you please share your experiences/opinion on the approach below with respect to BDD + the test pyramid?

    1. At the start of the sprint, the team has a “3 Amigos Session” in which it discusses, clarifies & writes down user stories.

    2. Once dev & QA are crystal clear about the user stories, they can decide which story can be automated at which level of the test pyramid.

    3. This avoids QA automating part or all of a user story that the developer is already taking care of at the unit level.

    What’s your opinion on this?


    • Hey Vikram,

      that sounds like a decent approach. With one caveat, though: I’d put as little effort into automating end-to-end tests for the story as you can get away with. Creating these tests (as you know) is hard and takes a lot of time. And in many cases, reviews, tests and follow-up stories will change or add to how a specific story is implemented (either in the same sprint or in subsequent sprints), requiring additional time to repair the broken E2E tests. You’ll probably be better off waiting some more until the feature has matured.

      If you keep that in mind, your approach makes perfect sense.

  2. Thanks Bas for your inputs.

    But per my understanding of BDD, once stories are discussed and written, they shouldn’t change a lot during the sprint.
    Otherwise it means the PO/PM is not explaining the stories properly and/or they are not fully understood by the team.

    In your view, what should QA do while developers are writing code for these stories during the sprint?

    I agree that E2E automation should be deferred as much as possible until stories are fully implemented, but then QA needs to utilize its time optimally.

    Please let me know your ideal-world scenarios.

    • I wasn’t specifically talking about stories changing DURING a sprint. That shouldn’t happen (although it does, too often). But let’s say they don’t.

      Still, there’s a good chance that features implemented in sprint X change, get added to or even get removed in sprints X+1, X+2, etc. If you spend a lot of time creating automated tests for those features in sprint X (and this especially applies to E2E tests), you’re likely just wasting time. New features should always be tested by testers. That’s one of the many reasons automation cannot and will not replace testers.

      As to what testers should do while developers are creating code? Prepare test charters, ensure that they know what to test and what to look for once the feature is delivered, and there’s lots more. There are people that are much more knowledgeable about this than I am, though. I haven’t been in that tester role in an Agile team for ages (and I don’t want to, either). Also, if your developers are not delivering features until the end of the sprint, you’re just doing 2-week waterfall iterations. Hardly Agile.

      There are many good sources of information on testing in an Agile context. I am not one of them 🙂 You could start with reading Lisa Crispin and Janet Gregory’s books on Agile Testing for ideas, or consult people in the Context Driven Testing community. You’ll likely get much more information about testing there.

      • Hi Bas,

        Thanks a ton for your honest opinions which I completely agree with & suggestions about readings as well.

        Kind Regards,

  3. Hi Bas,
    Great post as always!
    About this part: “I explain this often using examples from real life projects, where we chose to do a couple of end-to-end tests to verify that customers could complete a specific sequence of actions, combined with a more extensive set of API tests to verify business logic at a lower level, and why this was a much more effective approach than testing everything through the UI.”
    Can you please provide some of these examples that needed the set of API tests, and explain how to apply them?

    Thanks a lot!

    • Hey Aya,

      thank you for the kind words!

      My stock case is a client project where we were asked to write automated tests for a webshop that sells electronic cigarettes and related accessories in the US. This being the US, it includes:

      1. Hefty fines in case articles are sold to customers that are not of age, i.e., high damage
      2. 50 different states, which boils down to 50 different sets of business rules, combined with multiple age groups and product categories == many different configurations, i.e., high risk

      The original plan they came up with was to check every possible combination of zip code (location), age group and product category through the UI to check whether a customer was correctly (not) able to purchase. Because of the high risk and damage, they really wanted to check all possible combinations (understandably). Each iteration involved completing no less than 7 different forms, times around 6000 test cases = Selenium hell, as you can imagine.

      Turned out that the underlying business logic that decides whether or not someone is allowed to purchase a certain item, based on date of birth and zip code, was exposed through a RESTful API (as is often the case). So, instead of 6000 Selenium test cases, we created 6000 API-level tests (to do the required complete check on the business logic) and < 10 Selenium tests (to ensure that a customer can actually purchase through the UI, there were a couple of happy and rainy scenarios there). Much better.
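As a rough illustration of what such a data-driven API-level check could look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the state codes and minimum ages in `MINIMUM_AGE` are made up, and `fake_eligibility_api` is an in-process stand-in for the real REST call, so the snippet runs without a network. In a real suite, `call_api` would wrap an HTTP request to the actual endpoint, and the expected outcomes would come from an independent source such as the documented business rules.

```python
from datetime import date

# Made-up rule table for illustration only: minimum purchase age per state.
MINIMUM_AGE = {"AA": 18, "BB": 19, "CC": 21}


def age_on(date_of_birth, today):
    """Return the age in whole years on the given day."""
    years = today.year - date_of_birth.year
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def fake_eligibility_api(state, date_of_birth, today):
    """Stand-in for the real API call; returns a response-like dict."""
    return {"allowed": age_on(date_of_birth, today) >= MINIMUM_AGE[state]}


def build_cases(today):
    """Build one just-too-young and one just-old-enough case per state."""
    cases = []
    for state, min_age in MINIMUM_AGE.items():
        for age, expected in ((min_age - 1, False), (min_age, True)):
            dob = date(today.year - age, today.month, today.day)
            cases.append((state, dob, expected))
    return cases


def run_cases(call_api, today):
    """Run every case through the API and collect the mismatches."""
    failures = []
    for state, dob, expected in build_cases(today):
        actual = call_api(state, dob, today)["allowed"]
        if actual != expected:
            failures.append((state, dob, expected, actual))
    return failures


if __name__ == "__main__":
    today = date(2020, 6, 15)
    print(f"{len(build_cases(today))} cases, "
          f"{len(run_cases(fake_eligibility_api, today))} failures")
```

Adding a state, an age group or a product category then only means adding rows to the test data, not another pass through seven forms in a browser.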

      • Wow simply superb answer.

        Another advantage, in my experience, is that these API suites can be run at regular intervals with the help of Jenkins, while the regression suite at the UI level can be run once a day. This is just to make sure the critical parts of the system are always up & running.

        Also, the UI is mostly dumb: it only parses the response from the API and shows it to the user, and takes input from the user to call the API with.

        It’s always easier to test the API in this situation.


      • Thanks so much for this to the point answer Bas!

        As for our business logic, we need to add tests (and then automate them) to check that users are created successfully (there are many scenarios relating to the creation of users and of sessions) and that they can perform certain actions. These actions and their results should be displayed on a user timeline, based on the category of each action and the category of its result. We decided that we should check this at the API level before doing any user and UI scenario tests and then automating them.

  4. Bas,
    Nice job once again. I tend to look at the pyramid as a set of “guidelines” rather than “rules” as some people believe. You know, like the Pirate Code (Pirates of the Caribbean reference)!

    But you are correct in that Mike Cohn’s whole idea of the Automation Pyramid was an ideal situation to strive for. His emphasis was to have the developers implement their own testing (automation) and do it at the code level which is more atomic/granular. Thus the tests are more of a one-to-one relationship with the code, and thus the base of the pyramid will naturally be wider. By having more Unit Tests you are testing earlier and proving the stability and robustness of the code sooner. This leads into reduced rework later on.

    Then at the upper layers of the pyramid you have tests more focused on what you are trying to prove. It is a “building up” method and process for the testing overall. Each successive layer is a refinement of focus of the testing. That’s my view and opinion.

    But again, solid post on a very relevant topic at this time.


  5. I’m curious to hear what criticism you’re hearing about the test automation pyramid. I drew it on a flip chart at a gathering just last night.

    I’ve been trying to decide what to recommend regarding automated acceptance tests as with BDD. My thinking right now is that I don’t think we should commit to maintaining those tests beyond the iteration they were written for. There are way more acceptance criteria than regression tests we should be maintaining.

    • Hey Danny,

      everybody seems to have an opinion on the pyramid, from it being the absolute truth (including using actual ratios for the different levels) to it being absolutely worthless. And each camp calling the others names. A lot of criticism on the pyramid is quite constructive though, for example this excellent article by John Ferguson Smart.

      With this post I just wanted to show (as the title says) why and how it still is a useful model to me.

      Automated acceptance tests are indeed the trickiest of them all, assuming you mean end-to-end tests. I see too many of those written too soon (especially when creating them is part of the DoD, then people point to that and say ‘we have to’, instead of thinking critically whether or not it is actually useful), leading to lots of overhead and wasted time. In a lot of situations, though, a lean and stable set of automated acceptance tests can still provide great value, if only to verify that no matter what changes are being made to an application, your business critical flows (order completion, subscription management, pretty much everything that has to do with the primary cash flow or business process) can still be completed with the parameters enclosed in the test.

        • Good question.

          I *think* that’s not possible. If a story affects what a user (either a person or a system) sees from a system or how (s)he interacts with it, then an acceptance test is by definition an end-to-end test. If it’s not an end-to-end test (and E2E might involve the GUI but does not necessarily have to) then how can you be sure that end user expectations are met?

