More troubles with test data

If managing test data in complex end-to-end automated test scenarios is an art, I’m certainly not an artist (yet). If it is a science, I’m still looking for the solution to the problem. At this moment, I’m not even sure which of the two it is, really.

The project
Some time ago, I wrote a blog post on different strategies to manage test data in end-to-end test automation. A couple of months down the road, and we’re still struggling. We are faced with the task of writing automated user interface-driven tests for a complex application. The user interface in itself isn’t that complex, and our tool of choice handles it decently. So far, so good.

As with all test automation projects I work on, I’d like to keep the end goal in mind. For now, running the automated end-to-end tests once every fortnight (at the end of a sprint) is good enough. I know, don’t ask, but the client is satisfied with that at the moment. Still, I’d like to create a test automation solution that can be run on demand. If that’s once every two weeks, all right. It should also be possible to run the test suite ten times per day, though. Shorten the feedback loops and all that.

The test data challenge
The real challenge with this project, as with a number of other projects I’ve worked on in the past, is in ensuring that the test data required to successfully run the tests is present and in the right state at all times. There are a number of complicating factors that we need to deal (or live) with:

  • The data model is fairly complex, with a large number of data entities and relations between them. What makes it really tough is that there is nobody available who completely understands it. I don’t want to mess around assuming that the data model looks a certain way.
  • As of now, there is no on-demand backup restoration procedure. Database backups are made daily in the test environment, but restoring them is a manual task at the moment, which blocks us from recreating a known test data state whenever we want to.
  • There is no API that makes it easy for us to inject and remove specific data entities. All we have is the user interface, which results in long setup times during test execution, and direct database access, which isn’t of real use since we don’t know the data model details.

Our current solution
Since we haven’t figured out a proper way to manage test data for this project yet, we’re dealing with it the easiest way available: by simply creating the test data we need for a given test at the start of that test. I’ve mentioned the downsides of this approach in my previous post on managing test data (again, here it is), but it’s all we can do for now. We’re still in the early stages of automation, so it’s not something that’s holding us back too much, but all parties involved realize that this is not a sustainable solution for the longer term.

The way forward
What we’re looking at now is an approach that looks roughly like this:

  1. A database backup that contains all test data required is created with every new release.
  2. We are given permission to restore that database backup on demand, a process that takes a couple of minutes and is not yet automated.
  3. We are given access to a job that installs the latest data model configuration (this changes often, sometimes multiple times per day) to ensure that everything is up to date.
  4. We recreate the test data database manually before each regression test run.

This looks like the best possible solution at the moment, given the available knowledge and resources. There are still some things I’d like to improve in the long run, though:

  • I’d like database recreation and configuration to be a fully automated process, so it can more easily be integrated into the testing and deployment process (see the sketch after this list for a rough idea of what that could look like).
  • There’s still the part where we need to make sure that the test data set is up to date. As the application evolves, so do our test cases, and somebody needs to make sure that the backup we use for testing contains all the required test data.
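
To make that first point a bit more concrete, here is a minimal sketch of what an automated restore-and-configure step could look like once it is hooked into a TestNG suite. Everything in it is an assumption on my part: the restoreTestDatabase.sh and applyDataModelConfiguration.sh scripts are hypothetical placeholders for whatever restore procedure and configuration job the client ends up exposing.

import java.util.concurrent.TimeUnit;

import org.testng.annotations.BeforeSuite;

public class TestDataSetup {

	// Hypothetical sketch: restore the test data backup and apply the latest
	// data model configuration before the regression suite starts.
	@BeforeSuite
	public void restoreTestData() throws Exception {

		// Invoke the (hypothetical) restore script provided by the database team
		Process restore = new ProcessBuilder("/opt/testdata/restoreTestDatabase.sh").start();

		if (!restore.waitFor(10, TimeUnit.MINUTES)) {
			throw new IllegalStateException("Database restore did not complete within 10 minutes");
		}

		// Trigger the (hypothetical) job that installs the latest data model configuration
		Process configure = new ProcessBuilder("/opt/testdata/applyDataModelConfiguration.sh").start();

		if (configure.waitFor() != 0) {
			throw new IllegalStateException("Data model configuration job failed");
		}
	}
}

The exact mechanics don’t matter much; the point is that restoring and configuring the test data becomes something the test run can trigger itself, instead of waiting for a manual action.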

As you can see, we’re making progress, but it is slow. It makes me realize that managing test data for these complex automation projects is possibly the hardest problem I’ve encountered so far in my career. There’s no one-stop solution for it, either. So much depends on the availability of technical hooks, domain knowledge and resources at the client side.

On the up side, last week I met with a couple of fellow engineers from a testing services and solutions provider, just to pick their brains on this test data issue. They said they had encountered the same problem with their clients as well, and were working on what could be a solution. They too realize that it’ll never be a 100% solution to all test data issues for all organizations, but they’re confident that they can provide them (and consultants like myself) with a big step forward. I haven’t heard too many details, but I know they know what they’re talking about, so there might be some light at the end of the tunnel! We’re going to look into a way to collaborate on this solution, which I’m pretty excited about, since I’d love to have something in my tool belt that helps my clients tackle their test data issues. To be continued!

On false negatives and false positives

In a recent post, I wrote about how trust in your test automation is needed to create confidence in your system under test. In this follow up post (of sorts), I’d like to take a closer look at two phenomena that can seriously undermine this trust: false positives and false negatives.

False positives
Simply put, false positives are test instances that fail without there being a defect in the application under test, i.e., the test itself is the reason for the failure. False positives can occur for a multitude of reasons, including:

  • No appropriate wait is implemented for an element before your test (written using Selenium WebDriver, for example) interacts with it (see the short sketch after this list).
  • You specified incorrect test data, for example a customer or an account number that is (for good reason) not present in the application under test.
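
To illustrate the first cause, here is a minimal sketch of an explicit wait in Selenium WebDriver (Java), assuming Selenium 4 and a login button with the hypothetical id loginButton. Waiting for the element to be clickable before interacting with it takes away one common source of false positives.

import java.time.Duration;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class ExplicitWaitExample {

	private final WebDriver driver;

	public ExplicitWaitExample(WebDriver driver) {
		this.driver = driver;
	}

	public void clickLoginButton() {
		// Wait (up to 10 seconds) until the button is clickable, instead of
		// assuming it is immediately available after the page has loaded
		WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));
		wait.until(ExpectedConditions.elementToBeClickable(By.id("loginButton"))).click();
	}
}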

False positives can be really annoying. It takes time to analyze their root cause, which wouldn’t be so bad if the root cause were in the application under test, but then we’d be talking about an actual defect, not a false positive. The minutes (or hours) spent getting to the root cause of tests that fail because they’ve been poorly written would almost always have been better spent otherwise, for example on writing stable, better-performing tests in the first place.

If your tests are part of an automated build and deployment process, false positives can get you into even bigger trouble. They stall your build process unnecessarily, thereby delaying deployments that your customers or other teams are waiting for.

There’s also another risk associated with false positives: when they’re not taken care of as soon as possible, people will start taking them for granted. Tests that regularly or consistently produce false positives will be disregarded and, ultimately, left alone to die. That’s a real waste of the time it took to create those tests in the first place, I’d say. I’m all for removing tests from your test base if they no longer serve a purpose, but the mere fact that they fail, either intermittently or every time, is NOT a good reason to discard them.

And talking about tests failing intermittently: those (also known as flaky tests) are the worst, simply because the root cause of their failure often cannot be easily determined, which in turn makes them hard to fix. As an example: I’ve seen tests that ran overnight and failed on some occasions and passed on others. It took weeks before I found out what caused them to fail: on some test runs this particular test (an end-to-end test that took a couple of minutes to run) was started just before midnight, causing funny behavior in subsequent steps that were completed after midnight, when a new day had started. On other test runs, the test either started after midnight or was completed before midnight, resulting in a ‘pass’. Good luck debugging that during office hours!

False negatives
While false positives can be really annoying, the true risk with regards to trust in test automation is at the other end of the unreliability spectrum: enter false negatives. These are tests that pass but shouldn’t, because there IS an actual defect in the application under test, it’s just not picked up by the test(s) responsible for covering the area of the application where the defect occurs.

False negatives are far more dangerous than false positives, since they instill a false sense of confidence in the quality of your application. You think you’ve got everything covered with your test set, and all lights are green, but there’s still a defect (or two, or ten, or …) that goes unnoticed. And guess who’s going to find them? Your users. Which is exactly what you thought you were preventing by writing your automated regression tests.

Detecting false negatives is hard, too, since they don’t let you know that they’re there. They simply take up their space in your test set, running without trouble, never actually catching a defect. Sometimes, these false negatives are introduced at the time of writing the test, simply because the person responsible for creating the tests isn’t paying attention. Often, though, false negatives spring into life over time. Consider the following HTML snippet, representing a web page showing an error message:

<html>
	<body>
		...
		<div class="error">Here's a random error message</div>
		...
	</body>
</html>

One of the checks you’re performing is that no error messages are displayed in a happy path test case:

@Test(description="Check there is no error message when login is successful")
public void testSuccessfulLogin() {

	LoginPage lp = new LoginPage();
	HomePage hp = lp.correctLogin("username", "password");
	Assert.assertTrue(hp.hasNoErrorText());
}

// In the HomePage page object:
public boolean hasNoErrorText() {

	// Passes when no elements with class 'error' are present on the page
	return driver.findElements(By.className("error")).size() == 0;
}

Now consider the event that at some point in time, your developer decides to rename the CSS class used for error messages (not a highly unlikely event!), which results in a new way of displaying them:

<html>
	<body>
		...
		<div class="failure">Here's a random error message</div>
		...
	</body>
</html>

The aforementioned test will still run without trouble after this change. The only problem is that its defect-finding ability is gone, since it would no longer notice when an error message IS displayed! Granted, this might be an oversimplified example, and a decent test set would contain additional tests that would fail after the change (because they expect an error message that is no longer there, for example, as sketched below), but I hope you can see where I’m going with this: false negatives can be introduced over time, without anyone knowing.
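
As an illustration of such a companion test, here’s a minimal sketch, assuming the LoginPage page object exposes a (hypothetical) incorrectLogin method and a hasErrorText check that uses the same locator as hasNoErrorText(). Because this test expects the error message to be present, it would start failing as soon as the class name changes, exposing the stale locator in both checks.

@Test(description="Check an error message is shown when login fails")
public void testFailedLoginShowsErrorMessage() {

	LoginPage lp = new LoginPage();
	// Hypothetical method: attempts a login with invalid credentials and stays on the login page
	lp.incorrectLogin("username", "wrongpassword");
	Assert.assertTrue(lp.hasErrorText());
}

// In the LoginPage page object:
public boolean hasErrorText() {

	return driver.findElements(By.className("error")).size() > 0;
}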

So, how do you reduce (or even eliminate) the risk of false negatives? If you’re dealing with unit tests, I highly recommend experimenting with mutation testing to assess the quality of your unit test set and its defect-finding ability. For other types of automated checks, I recommend the practice of checking your checks regularly, and not just at the time of creating your tests. Periodically, set some time aside to review your tests and see whether they still make sense and still possess their original defect-finding power. This will reduce the risk of false negatives to a minimum, keeping your tests fresh and preserving trust in your test set and, by association, in your application under test. And isn’t that what testing is all about?

First steps as a test automation coach

“We want our development teams to take the next step towards adopting Continuous Delivery by giving their test automation efforts a boost.”

That was the task I was given a couple of months ago when I started a new project, this one for a well-known media company here in the Netherlands. Previously, I’ve mainly been involved in more hands-on test automation implementation projects, meaning I was usually the one designing and implementing the test automation solution (either alone or as part of a team). For this project, however, my position would be much different:

  • There were multiple development teams to be supported (the exact number changed a couple of times during the assignment, but there were at least four at any time), meaning there was no way I was able to spend enough time on the implementation of automated tests for any of those teams.
  • This was a part-time assignment, since I only had two days per week available due to other commitments. This made it even less possible to get involved in any serious test automation implementation myself.
  • Each development team was responsible for its own line of products and could make its own decisions on the technology stack to be used (within a certain bandwidth), most of which I had never worked with or even heard of before (GraphQL, for example), making it even less feasible to contribute to any actual tests.

Instead, at the start of the project, we decided that I would act more as a test automation coach of sorts, leaving the creation of the test automation to the development teams (which made perfect sense for this client). Something I’d never done before, so the fact that I was given the chance to do so was a pleasant surprise. Normally, as a contractor, I’m only hired to do stuff I’m already experienced in, but I guess that through my resume and the interview I built enough trust for them to hire me for the job anyway. I’m very grateful for that.

So, what did I do?

Kickoff
As the development teams consisted of about 40 developers in total, with a wide range of levels of experience, background and preferences in technology and programming languages, we (the hiring manager, the client’s team of testers and myself) thought it would be a good idea to get them at least somewhat on the same level with regards to the concept of test automation. We did this by organizing a number of test automation awareness sessions, in which I presented my view on test automation. The focus of this presentation was mostly on the ‘why?’ and the ‘what?’ of it, because I quickly figured out that the developers themselves were perfectly capable of figuring out the ‘how?’ (an impression I get from a lot of my clients nowadays, by the way).

Taking inventory of test automation maturity
Next up was a series of interviews with all tech leads from the development teams, to see where they stood with regards to test automation: what they already did, what was lacking and what would be a good ‘next step’ to allow that team to make the transition towards Continuous Delivery. This information was shared across teams to promote knowledge sharing. You’d be surprised to find out how often teams are struggling with something that’s already been solved by another team just a couple of yards away, without either party knowing of the situation of the other.

Test automation hackathons
The most important and most impactful part of my assignment was organizing a two-day ‘hackathon’ (for lack of a better word) for each of the teams (one team at a time). The purpose of this hackathon was to take the team away from the daily grind of developing and delivering production code and have them work on their technical debt with regards to test automation. The rules of the game:

  • Organize the hackathon in a space separate from the work floor, both to give the team the feeling that they were removed from the usual work routine and to prevent outside interference as much as possible.
  • Organize the hackathon as a Scrum sprint of sorts, with a kickoff/planning session, show and tell/standup twice a day and a demo session and retrospective at the end.
  • Deliver working software, meaning that I’d rather have one test that works and is fully integrated into the build and deployment process than fifty tests that do not run automatically. The most difficult hurdles are never in creating more tests, once you’ve got the groundwork taken care of.
  • Focus on a subject that the team wants to address, but does not currently have in place. For some teams this was unit testing for a specific type of software, for others it was end-to-end testing, or build pipelines, and in one case production monitoring. The subject didn’t matter, as long as it had to do with software quality and it was something the team did not already do.

Results
The hackathons worked out really well. Monitoring the teams after they had completed their ‘two days of test automation’, I could see they had indeed taken a step in the right direction, being more active on the test automation front and having a better awareness of what they were working towards. Mission accomplished!

As my assignment ended after that, I can’t say anything about the long term effects, unfortunately, but I’m convinced that the testers themselves can take over the role of test (automation) coach perfectly well. I will stay in touch with the client regularly to see how they’re doing, of course.

What did I learn?
As I said, this was my first time acting more as a coach than as an engineer, so naturally I learned a lot of things myself:

  • Hackathons are a great way of improving test automation efforts for development teams. Pulling teams away from their daily grind and having them focus on improving their automation efforts is both useful and fun. I was lucky that management support was not an issue, so your mileage may vary, but my point stands.
  • I (think I) have what it takes to be a test automation coach. This was the biggest breakthrough for me personally. As a pretty introverted person who likes to play around with tools regularly, I initially found it hard to step away from the keyboard and fight the urge to create tests myself, and to help other people become better at it instead. It IS the way forward for my career, though, I think, because I’ve yet again seen that there’s no one better at creating automated tests than a (good) developer. What I can bring to the table is experience and guidance as to the ‘why?’ and ‘what?’.
  • Part-time projects are great in terms of flexibility, especially when you find yourself in a coaching role. You can organize a hackathon, give teams guidance and suggestions on what to work on, and come back a couple of days later to see how they’re doing, evaluate, discuss and let them take the next step.

In short, my first adventure as a test automation coach has been a great experience. I’m looking forward to the next one!