Choose wisely

In a recent article published on TechBeacon, I argued that writing tests at the API level often hits the sweet spot between speed of execution, stability (both in terms of execution and of required maintenance) and test coverage. What I didn’t cover in that article is what motivated me to write it in the first place, so I thought it might be a good idea to dedicate a blog post to the reasoning behind the piece.

There really was only a single reason for me to suggest the topic to the people at TechBeacon: I see things go wrong too often when people start creating automated tests. I’m currently working with a number of clients on two separate projects, and what a lot of them seem to have in common is that they resort instantly to end-to-end tests (often using a tool such as Selenium or Protractor) to create the checks that go beyond the scope of unit tests.

As an example, I’m working on a project where we are going to create automated checks for a web shop that sells electronic cigarettes and related accessories in the United States. There are several product categories involved, several customer age groups to be considered (some products can be purchased if you’re over 18, some only if you’re over 21, some are fit for all ages, etc.), and, this being the US, fifty different states, each with their own rules and regulations. In short, there is a massive number of possible combinations (I haven’t done the math yet, but it’s easily in the hundreds). Also, due to the strict US regulations, and more importantly the fines associated with violating them, the client wants all relevant combinations covered by the automated tests.

Fair enough, but the problems started when they suggested we write an automated end-to-end test case for each of the possible combinations. That would mean creating an order for every combination of product group, age group and state, and every order involves filling out three or four separate forms, plus some more straightforward web page navigation. In other words, this would result in a test suite that would be slow to execute (we’re talking hours here) and possibly quite hard to maintain as well.

Instead, I used Fiddler to analyze what exactly the web application did to determine whether a customer could order a given product. Lo and behold… it simply called an API that exposed the business logic used to make this decision. So, instead of creating hundreds of UI-driven test cases, I suggested creating API-level tests that verify the business logic configuration, plus a couple of end-to-end tests to verify that a user can indeed place an order successfully, and receives an error message when trying to order a product that is not available to them.
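To make that suggestion a bit more concrete, here is a minimal sketch of what such an API-level test could look like, written with REST Assured and JUnit. Note that the endpoint, parameter names and response field are all made up for illustration purposes:

    import static io.restassured.RestAssured.given;
    import static org.hamcrest.Matchers.equalTo;

    import java.time.LocalDate;

    import org.junit.jupiter.api.Test;

    public class ProductEligibilityTest {

        // Hypothetical endpoint exposing the purchase eligibility logic
        private static final String ELIGIBILITY_ENDPOINT =
            "https://webshop.example.com/api/eligibility";

        @Test
        public void over21ProductCannotBeOrderedByAnEighteenYearOldInTexas() {
            // A customer who turned 18 today, trying to order a 21+ product
            String dateOfBirth = LocalDate.now().minusYears(18).toString();

            given()
                .queryParam("state", "TX")
                .queryParam("dateOfBirth", dateOfBirth)
                .queryParam("productId", "ECIG-001") // hypothetical 21+ product
            .when()
                .get(ELIGIBILITY_ENDPOINT)
            .then()
                .statusCode(200)
                .body("allowed", equalTo(false));
        }
    }

A test like this runs in milliseconds rather than minutes, because there is no browser to drive and no forms to fill out.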

We’re still working on this, but I think this case illustrates my point fairly well. It often pays off big time to look beyond the user interface when you’re creating automated tests for web applications:

  • Only use end-to-end tests to verify whether a user of your web application can perform certain sequences of actions (such as ordering and paying for a product in your web shop).
  • See (ask!) whether business logic hidden behind the user interface can be accessed, and therefore tested, at a lower (API or unit) level, thereby increasing both stability and speed of execution.

For those of you familiar with the test automation pyramid, this might sound an awful lot like a stock example of the application of that model. And it is. However, in light of a couple of recent blog posts I have read (this one from John Ferguson Smart being a prime example), I think it might not be such a good idea to relate everything to this pyramid anymore. Instead, I agree that what it comes down to (as John says) is getting clear WHAT it is that you’re trying to test and then writing tests at the right level. If that leads to an ice cream cone, so be it. If only because I like ice cream…

This slightly off-topic remark about the test automation pyramid notwithstanding, I think the case above gets my key point across fairly well. As I’ve said before, creating the most effective automated tests comes down to:

  • First, determining why you want to automate those tests in the first place. Although that’s not really the subject of this blog post, it IS the first question that should be asked. In the example in this post, the why is simple: the risk and impact of fines imposed for selling items to people who should not be allowed to purchase them are high enough to warrant thorough testing.
  • Then, deciding what to test. In this case, it’s the business logic that determines whether or not a customer is allowed to purchase a given item, based on state of residence, product ID and date of birth.
  • Finally, we get to the topic of this blog post: the question of how to test a specific application or component. In this case, the business logic that is the subject of our tests is exposed at the API level, so it makes sense to write tests at that level too (a sketch of how all those combinations can be covered there follows below). I for one don’t feel like writing scenarios for hundreds of UI-level tests, let alone running, monitoring and maintaining them…
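Earlier, I sketched a single API-level check. At this level, covering all relevant combinations is mostly a matter of parameterizing that same test, with one row of test data per combination, instead of scripting hundreds of UI scenarios. Again, the endpoint, field names and expected outcomes below are made up:

    import static io.restassured.RestAssured.given;
    import static org.hamcrest.Matchers.equalTo;

    import org.junit.jupiter.params.ParameterizedTest;
    import org.junit.jupiter.params.provider.CsvSource;

    public class ProductEligibilityCombinationsTest {

        @ParameterizedTest(name = "state={0}, dob={1}, product={2} -> allowed={3}")
        @CsvSource({
            // state, dateOfBirth, productId, expected result (all illustrative)
            "TX, 1975-03-10, ECIG-001, true",
            "TX, 2002-03-10, ECIG-001, false",
            "CA, 1975-03-10, ACC-042, true",
            "NY, 1998-03-10, ECIG-001, false"
            // ... one row per relevant combination
        })
        public void verifyPurchaseEligibility(
                String state, String dateOfBirth, String productId, boolean allowed) {
            given()
                .queryParam("state", state)
                .queryParam("dateOfBirth", dateOfBirth)
                .queryParam("productId", productId)
            .when()
                .get("https://webshop.example.com/api/eligibility")
            .then()
                .statusCode(200)
                .body("allowed", equalTo(allowed));
        }
    }

Even with hundreds of rows, a suite like this completes in minutes, and adding a new combination is a one-line change.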

I’m sure there are a lot of situations in your own daily work where reconsidering the approach taken in your automated tests might prove beneficial. It doesn’t have to be a shift from UI to API (although that’s the situation I encounter most often); it could also be writing unit tests instead of end-to-end, user interface-driven tests. Or maybe, in some cases, replacing a large number of irrelevant unit tests with a smaller number of more powerful API-level integration tests. Again, as John explained in his post, you’re not required to end up with an actual pyramid, as long as you end up with what’s right for your situation. That could be a pyramid, but it could just as well not be. Choose (and automate) wisely.

Step-by-step integration testing: a case study

For the last nine months or so, I have been working as a tester on a project where we develop and deliver the supply chain suite connected to a brand new, highly automated warehouse for a big Dutch retailer. As with so many modern software development projects, we have to deal with a lot of different applications and the information that is exchanged between them. For example, the process of ordering a single article on the website and then processing it all the way until the moment it is on your doorstep involves more than ten applications and several times that number of XML messages.

Testing whether all these applications communicate correctly with one another is not simply a matter of placing an order and seeing what happens. It requires structured testing and a bottom-up approach, starting at the smallest level of integration and moving up until the complete set of applications involved is exercised. In this post, I will try to sketch how we have done this using three levels of integration testing.

First, here’s a highly simplified overview of what the application landscape looks like. On one side, we have the supply chain suite and all other applications containing and managing the necessary information. On the other side, there’s the warehouse itself. I have mostly been involved in testing the former, but now that we’re in the final stages of the project, I also take part (to an extent) in the integration testing between the two sides.

[Figure: The application landscape]

Level 1: message-level integration testing
The first level of integration testing that we perform is at the message level. At this level, we check whether a message of type XYZ can be sent successfully from application A to application B. Virtually all application integration is done using the Microsoft BizTalk platform. To create and perform these tests, we use BizUnit, a test framework specifically designed for testing BizTalk integrations. Every test follows the same procedure:

  1. Prepare the environment by cleaning the relevant input and output queues and file locations.
  2. Place a message of the type to be tested on the relevant BizTalk receive location (a queue or RESTful web service).
  3. Validate whether the message has been processed successfully by BizTalk and placed on the correct send location.
  4. Check whether the messages have been archived properly for auditing purposes.
  5. Rinse and repeat for the other message flows.

Note that at this test level, no checks are performed on the contents of the messages; the only checks performed concern message processing and routing. BizTalk does not inspect or care about message contents, as it processes and routes messages based purely on XML header information, so it does not make sense to perform content validations here. To illustrate, the flow of such a test is sketched below.
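BizUnit itself is a .NET library, with tests defined in XML or C#, so I won’t reproduce an actual BizUnit test here. Purely to illustrate the procedure above, here is the same flow sketched in plain Java, assuming for simplicity a file-based receive location (BizTalk also supports file adapters); all paths and file names are made up:

    import java.io.IOException;
    import java.nio.file.*;
    import java.time.Duration;
    import java.time.Instant;

    public class MessageRoutingTest {

        // Hypothetical locations; in BizTalk these would be configured
        // receive and send ports using the file adapter
        static final Path RECEIVE = Paths.get("C:/biztalk/receive/orders");
        static final Path SEND = Paths.get("C:/biztalk/send/warehouse");
        static final Path ARCHIVE = Paths.get("C:/biztalk/archive/orders");

        public static void main(String[] args) throws Exception {
            // Step 1: prepare the environment by cleaning all locations
            clean(RECEIVE);
            clean(SEND);
            clean(ARCHIVE);

            // Step 2: place a message of the type under test on the receive location
            Files.copy(Paths.get("testdata/customer_order.xml"),
                    RECEIVE.resolve("customer_order.xml"));

            // Step 3: validate that the message was processed and routed to the
            // correct send location (routing only; contents are not checked here)
            if (!waitForFile(SEND, Duration.ofSeconds(30))) {
                throw new AssertionError("message was not routed to the send location");
            }

            // Step 4: check that the message was archived for auditing purposes
            if (!waitForFile(ARCHIVE, Duration.ofSeconds(30))) {
                throw new AssertionError("message was not archived");
            }
        }

        // Empty a directory, creating it first if needed
        static void clean(Path dir) throws IOException {
            Files.createDirectories(dir);
            try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
                for (Path file : files) {
                    Files.delete(file);
                }
            }
        }

        // Poll a directory until a file appears or the timeout expires
        static boolean waitForFile(Path dir, Duration timeout) throws Exception {
            Instant deadline = Instant.now().plus(timeout);
            while (Instant.now().isBefore(deadline)) {
                try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
                    if (files.iterator().hasNext()) {
                        return true;
                    }
                }
                Thread.sleep(500);
            }
            return false;
        }
    }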

[Figure: Scope of the BizUnit message-level tests]

Level 2: business process-level integration testing
The second level of integration testing focuses on the successful completion of the different business processes, such as customer orders, customer returns, purchasing, etc. These business processes involve the exchange of information between multiple applications. As the warehouse management system is developed in parallel, that interface is simulated using a custom-built simulation tool. On a side note: this is a form of service virtualization.
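In essence, such a simulator is a stub that accepts the messages the real warehouse would receive and immediately replies with the confirmations the process under test is waiting for. Our actual simulator was custom built and considerably more capable; as a minimal sketch (with a made-up endpoint and message format):

    import com.sun.net.httpserver.HttpServer;

    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class WarehouseSimulator {

        public static void main(String[] args) throws Exception {
            HttpServer server = HttpServer.create(new InetSocketAddress(8090), 0);

            // Hypothetical endpoint on which the warehouse would receive pick requests
            server.createContext("/warehouse/pick", exchange -> {
                // The real warehouse needs hours to physically pick and ship an
                // order; the simulator confirms instantly so tests can keep moving
                byte[] response = "<pickConfirmation status=\"SHIPPED\"/>"
                        .getBytes(StandardCharsets.UTF_8);
                exchange.getResponseHeaders().add("Content-Type", "application/xml");
                exchange.sendResponseHeaders(200, response.length);
                try (OutputStream body = exchange.getResponseBody()) {
                    body.write(response);
                }
            });

            server.start();
        }
    }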

Tests on this level involve triggering the relevant business process and tracking the process instance as the related messages pass through the applications involved. For example, a test of the customer order process is started by creating a new order and then verifying, among other things, whether the order:

  • can be successfully picked and shipped by the warehouse simulator,
  • is created, updated and closed correctly by the supply chain suite,
  • is administered correctly in the order manager,
  • triggers the correct stock movements, and
  • successfully triggers the invoicing process.

[Figure: Scope of the business process tests]

A number of tests on this level have been automated using FitNesse. For me, this was the first project where I had to use FitNesse, and while it does the job, I haven’t exactly fallen in love with it yet. Maybe I just don’t know enough about how to use it properly?
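For those who haven’t worked with FitNesse: tests are written as wiki tables, backed by fixture code. As a simplified sketch of what a Slim decision table fixture for this level could look like (the class, methods and checks are made up, and a real fixture would of course query the applications involved instead of returning hardcoded values):

    /**
     * Backs a hypothetical wiki table such as:
     *
     * | create customer order                               |
     * | article | quantity | order created? | invoice sent? |
     * | 123456  | 2        | yes            | yes           |
     *
     * Slim maps each input column to a setter and each output
     * column (ending in a question mark) to a method.
     */
    public class CreateCustomerOrder {

        private String article;
        private int quantity;

        public void setArticle(String article) {
            this.article = article;
        }

        public void setQuantity(int quantity) {
            this.quantity = quantity;
        }

        public String orderCreated() {
            // A real fixture would create the order in the supply chain
            // suite and check its status; hardcoded for illustration
            return "yes";
        }

        public String invoiceSent() {
            // Likewise, this would query the invoicing application
            return "yes";
        }
    }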

Level 3: integration with the warehouse
The third and final level of integration testing was done on the interface between the supply chain suite (and the applications connected to it) and the actual warehouse itself. As both systems were developed in parallel, it took quite a bit of time before we were finally able to test this interface properly. And even though our warehouse simulator had been designed and implemented very carefully, and it certainly did a lot to speed up the development process, the first integration tests showed that there is no substitute for the real thing. After lots of bug fixing and retesting, we were able to complete this final level of integration testing successfully.

[Figure: Scope of the warehouse integration tests]

For this final level of integration testing, we were not able to use automated tests, due to the time required by the warehouse to physically pick and ship the created orders. It would not have made sense to build automated tests that have to wait an hour or more for the warehouse to return a response indicating that an order has been shipped. The test cases executed mostly follow the same steps as those in level 2, as they too are focused on executing business processes.

I hope this post has given you some ideas on how to break down the task of integration testing for a large and reasonably complex application landscape. Thinking in different integration levels has certainly helped me to determine which steps and which checks to include in a given test scenario.