On including automation in your Definition of Done

Working with different teams in different organizations means that I’m regularly faced with the question of whether and how to include automation in the Definition of Done (DoD) that is used in Agile software development. I’m not an Agilist myself per se (I’ve seen too many teams get lost in overly long discussions on story points and sticky notes), but I DO like to help people and teams struggling with the place of automation in their sprints. As for the ‘whether’ question: yes, I definitely think that automation should be included in any DoD. The answer to the ‘how’ of including, a question that could be rephrased as the ‘what’ to include, is a little more nuanced.

For starters, I’m not too keen on rigid DoD statements like

  • All scenarios that are executed during testing and that can be automated, should be automated
  • All code should be under 100% unit test coverage
  • All automated tests should pass at least three consecutive times, except on Mondays, when they should pass four times.

OK, I haven’t actually seen that last one, but you get my point. Stories change from sprint to sprint. Impact on production code, be it new code that needs to be written, existing code that needs to be updated or refactored or old code that needs to be removed (my personal favorite) will change from story to story, from sprint to sprint. Then why keep statements regarding your automated tests as rigid as the above examples? Doesn’t make sense to me.

I’d rather see something like:

Creation of automated tests is considered and discussed for every story and their overarching epic and applied where deemed valuable. Existing automated tests are updated where necessary, and removed if redundant.

You might be thinking ‘but this cannot be measured, how do we know we’re doing it right?’. That’s a very good question, and one that I do not have a definitive answer for myself, at least not yet. But I am of the opinion that knowing where to apply automation, and more importantly, where to refrain from automation, is more of an art than a science. I am open to suggestions for metrics and alternative opinions, of course, so if you’ve got something to say, please do.

Having said that, one metric that you might consider when deciding whether or not to automate a given test or set of tests is whether or not your technical debt increases or decreases. The following consideration might be a bit rough, but bear with me. I’m sort of thinking out loud here. On the one hand, given that a test is valuable, having it automated will shorten the feedback loop and decrease technical debt. However, automating a test takes time in itself and increases the size of the code base to be maintained. Choosing which tests to automate is about finding the right balance with regards to technical debt. And since the optimum will likely be different from one user story to the next, I don’t think it makes much sense to put overly generalizing statements with regards to what should be automated in a DoD. Instead, for every story, ask yourself

Are we decreasing or increasing our technical debt when we automate tests for this story? What’s the optimum way of automating tests for this story?

The outcome might be to create a lot of automated tests, but it might also be to not automate anything at all. Again, all depending on the story and its contents.

Another take on the question whether or not to include automated test creation in your DoD might be to discern between the different scope levels of tests:

  • Creating unit tests for the code that implements your user story will often be a good idea. They’re relatively cheap to write, they run fast and thereby, they’re giving you fast feedback on the quality of your code. More importantly, unit tests act as the primary safety net for future development and refactoring efforts. And I don’t know about you, but when I undertake something new, I’d like to have a safety net just in case. Much like in a circus. I’m deliberately refraining from stating that both circuses and Agile teams also tend to feature a not insignificant number of clowns, so forget I said that.
  • You’ll probably also want to automate a significant portion of your integration tests. These tests, for example executed at the API level, can be harder to perform manually and are relatively cheap to automate with the right tools. They’re also my personal favorite type of automated tests, because they’re at the optimum point between scope and feedback loop length. It might be harder to write integration tests when the component you’re integrating with is outside of your team’s control, or does not yet exist. In that case, simulation might need to be created, which requires additional effort that might not be perceived as directly contributing to the sprint. This should be taken into account when it comes to adding automated integration tests to your DoD.
  • Finally, there’s the end-to-end tests. In my opinion, adding the creation of this type of tests to your DoD should be considered very carefully. They take a lot of time to automate (even with an existing foundation), they often use the part of the application that is most likely to change in upcoming sprints (the UI), and they contribute the least to shortening the feedback loop.

The ratio between tests that can be automated and tests for which it make sense to be automated in sprint can be depicted as follows. Familiar picture?

Should you include automated tests in your Definition of Done?

Please note that like the original pyramid, this is a model, not a guideline. Feel free to apply it, alter it or forget it.

Jumping back to the ‘whether’ of including automation in your DoD, the answer is still a ‘yes’. As can be concluded from what I’ve talked about here, it’s more of a ‘yes, automation should have been considered and applied where it provides direct value to the team for the sprint or the upcoming couple of sprints’ rather than ‘yes, all possible scenarios that we’ve executed and that can be automated should have been automated in the sprint’. I’d love to hear how other teams have made automation a part of their DoD, so feel free to leave a comment.

And for those of you who’d like to see someone else’s take on this question, I highly recommend watching this talk by Angie Jones from the 2017 Quality Jam conference:

Creating executable specifications with Spectrum

One of the most important features of a good set of automated tests is that they require a minimal amount of explanation and/or documentation. When I see someone else’s test, I’d like to be able to instantly see what its purpose is and how it’s constructed in terms of setting up > execution > verification (for example using given/when/then or arrange/act/assert). That’s one of the reasons that I’ve been using Cucumber (or SpecFlow, depending on the client) in several of my automated test solutions, even though the team wasn’t doing Behaviour Driven Development.

It’s always a good idea to look for alternatives, though. Last week I was made aware of Spectrum, which is, as creator Greg Haskins calls it “A BDD-style test runner for Java 8. Inspired by Jasmine, RSpec, and Cucumber.”. I’ve been working with Jasmine a little before, and even though it took me a while to get used to the syntax and lambda notation style, I liked the way it provided a way to document your tests directly in the code and produce human readable results. Since Spectrum is created as a Jasmine-like test runner for Java, plus it provides support for the (at least in the Java world) much more common Gherkin given/when/then syntax, I thought it’d be a good idea to check it out.

Spectrum is ‘just’ yet another Java library, so adding it to your project is a breeze when using Maven or Gradle. Note that since Spectrum uses lambda functions, it won’t work unless you’re using Java 8. Spectrum runs on top of JUnit, so you’ll need that too. It works with all kinds of assertion libraries, so if you’re partial to Hamcrest, for example, that’s no problem at all.

As I said, Spectrum basically supports two types of specs: the Jasmine-style describe/it specification and the Gherkin-style given/when/then specification using features and scenarios. Let’s take a quick look at the Jasmine-style specifications first. For this example, I’m resorting once again to REST Assured tests. I’d like to verify whether Max Verstappen is in the list of 2017 drivers in one test, and whether both Fernando Alonso and Lewis Hamilton are in that list too in another test. This is how that looks like with Spectrum:

@RunWith(Spectrum.class)
public class SpectrumTests {{

	describe("The list of drivers for the 2017 season", () -> {

		ValidatableResponse response = get("http://ergast.com/api/f1/2017/drivers.json").then();

		String listOfDriverIds = "MRData.DriverTable.Drivers.driverId";

		it("includes Max Verstappen", () -> {

			response.assertThat().body(listOfDriverIds, hasItem("max_verstappen"));
		});

		it("also includes Fernando Alonso and Lewis Hamilton", () -> {

			response.assertThat().body(listOfDriverIds, hasItems("alonso","hamilton"));
		});
	});
}}

Since Spectrum runs ‘on top of’ JUnit, executing this specification is a matter of running it as a JUnit test. This results in the following output:

Spectrum output for Jasmine-style specs

Besides this (admittedly quite straightforward) example, Spectrum also comes with support for setup (using beforeEach and beforeAll) and teardown (using afterEach and afterAll), focusing on or ignoring specific specs, tagging specs, and more. You can find the documentation here.

The other type of specification supported by Spectrum is the Gherkin syntax. Let’s say I want to recreate the same specifications as above in the given/when/then format. With Spectrum, that looks like this:

@RunWith(Spectrum.class)
public class SpectrumTestsGherkin {{

	feature("2017 driver list verification", () -> {

		scenario("Verify that max_verstappen is in the list of 2017 drivers", () -> {

			final Variable<String> endpoint = new Variable<>();
			final Variable<Response> response = new Variable<>();

			given("We have an endpoint that gives us the list of 2017 drivers", () -> {
				
				endpoint.set("http://ergast.com/api/f1/2017/drivers.json");
			});

			when("we retrieve the list from that endpoint", () -> {
				
				response.set(get(endpoint.get()));
			});
			then("max_verstappen is in the driver list", () -> {
				
				response.get().then().assertThat().body("MRData.DriverTable.Drivers.driverId", hasItem("max_verstappen"));
			});
		});

		scenarioOutline("Verify that there are also some other people in the list of 2017 drivers", (driver) -> {

			final Variable<String> endpoint = new Variable<>();
			final Variable<Response> response = new Variable<>();

			given("We have an endpoint that gives us the list of 2017 drivers", () -> {
				
				endpoint.set("http://ergast.com/api/f1/2017/drivers.json");
			});

			when("we retrieve the list from that endpoint", () -> {
				
				response.set(get(endpoint.get()));
			});
			then(driver + " is in the driver list", () -> {
				
				response.get().then().assertThat().body("MRData.DriverTable.Drivers.driverId", hasItem(driver));
			});
		},

		withExamples(
				example("hamilton"),
				example("alonso"),
				example("vettel")
			)
		);
	});
}}

Running this spec shows that it does work indeed:

Spectrum output for Gherkin-style specs

There are two things that are fundamentally different from using the Jasmine-style syntax (the rest is ‘just’ syntactical):

  • Support for scenario outlines enables you to create data driven tests easily. Maybe this can be done too using the Jasmine syntax, but I haven’t figured it out so far.
  • If you want to pass variables between the given, the when and the then steps you’ll need to do so by using the Variable construct. This works with the Jasmine-style syntax too, but you’ll likely need to use it more in the Gherkin case (since ‘given/when/then’ are three steps, where ‘it’ is just one). When your tests get larger and more complex, having to use get() and set() every time you want to access or assign a variable might get cumbersome.

Having said that, I think Spectrum is a good addition to the test runner / BDD-supporting tool set available for Java, and something that you might want to consider using for your current (or next) test automation project. After all, any library or tool that makes your tests and/or test results more readable is worth taking note of. Right?

You can find a small Maven project containing the examples featured in this blog post here.

On crossing the bridge into unit testing land

Maybe it’s just the people and organizations I meet and work with, but no matter how active they’re trying to implement automation and involve testers therein, there’s one bridge that’s often too far for those that are tasked with test automation, and that’s the bridge to unit testing land. When asking them for the reasons that testers aren’t involved in unit testing, I typically get one (or more, or all) of the following answers:

  • ‘That’s the responsibility of our developers’
  • ‘I don’t know how to write unit tests’
  • ‘I’m already busy with other types of automation and I don’t have time for that’

While these answers might sound perfectly reasonable to some, I think there’s something inherently wrong with all of them. Let’s take a look:

  • With more and more teams becoming multidisciplinary, we can’t simply shift responsibility for any task to a specific subgroup. If ‘we’ (i.e., the testers) keep saying that unit testing is a developer’s responsibility, we’ll never get rid of the silos we’re trying to break down.
  • While you might not know how to actually write unit tests yourself, there’s a lot you CAN do to contribute to their value and effectiveness. Try reviewing them, for example: has the developer of the unit test missed some obvious cases?
  • Not having time to concern yourself with unit testing reminds me of the picture below. Really, if something can be covered with a decent set of unit tests, there really is no need to write integration or even (shudder) end-to-end tests for it.

Are you too busy to pay attention to unit testing?

I’m not a devotee of the test automation pyramid per se, but there IS a lot of truth to the concept that a decent set of unit tests should be the foundation of every solid test automation effort. Unit tests are relatively easy to write (even though it might not look that way to some), they run fast (no need for waiting until web pages are loaded and complete their client-side processing, for example..) and therefore they’re the best way to provide that fast feedback that development teams are looking for those in this age of Continuous Integration / Delivery / Deployment / Testing / Everything / … .

To put it in even bolder terms, as a tester, I think you have the responsibility of familiarizing yourself with the unit testing activities that your development team is undertaking. Offer to review them. Try to understand what they do, what they check and where coverage could be improved. Yes, this might require you to actually talk to your developers! But it’ll be worth it, not just to you, but to the whole team and, in the end, also to your product and your customers. Over time, you might even write some unit tests yourself, though, again, that’s not a necessity for you to provide value in the land of unit testing. Plus, you’ll likely learn some new tricks and skills by doing so, and that’s always a good thing, right?

For those of you looking for another take on this subject, John Ruberto wrote an article called ‘100 percent unit test coverage is not enough‘, which was published on StickyMinds. A highly recommended read.

P.S.: Remember Tesults, the SaaS solution for storing and displaying test results I wrote about a couple of months ago? The people behind Tesults recently let me know they now offer a free forever plan as well. So if you were interested in using their services but could not afford or justify the investment, it might be a good idea to check their new plan out here. And again, I am in no way, shape or form associated with, nor do I have a commercial interest in Tesults as an organization or a product. I still think it’s a great platform, though.