On designing a self-diagnosing and self-healing automated test framework – part 1

This is the first post in a short series on designing robust, self-diagnosing and, if possible, even self-healing tests. You can read all articles in this series by clicking here.

I like automating tests. So much so that I have even made it my profession. What I don’t like, though, is wasting a lot of time trying to find out why an automated test failed and whether the failure was caused by a genuine bug, a flaky test or some sort of test environment issue.

A quick Google search showed me that a couple of commercial test tools claim to come with some sort of self-diagnosing and even self-healing mechanism, for example this one. I’m very interested to see how such a mechanism works. Skeptic that I am, I won’t really believe it until I’ve seen it. Nonetheless, I think it would be interesting to see what can be done to make existing automated test frameworks more robust by improving their ability to self-diagnose and possibly even self-heal when an unexpected error occurs. So that’s what I’ll try to do in the next couple of posts.

The step-by-step approach I want to follow and the considerations to be made along the way are (very) loosely inspired by a scientific article titled ‘A Self-Healing Approach for Object-Oriented Applications’, which can be downloaded from here. This article presents an approach and architecture for fault diagnosis and self-healing of interpreted object-oriented applications. As the development and maintenance of automated tests should be treated like any other software development project, I see no reason why the principles and suggestions presented in the article could not apply to a test automation framework, at least to some extent…

When is a system self-diagnosing and self-healing?
Let’s start with the end in mind, as Stephen Covey so eloquently put it. What are we aiming for? In order for a system to be self-diagnosing, it should:

  1. Be self-aware, or more concretely, always have a deterministic state
  2. Recognize the fact that an error has occurred
  3. Have enough knowledge to stabilize itself
  4. Be able to analyze the problem situation
  5. Make a plan to heal itself
  6. Suggest healing solutions to the system administrator (in this case, the person responsible for test automation)

If we want our system not only to be self-diagnosing but also self-healing, the system should also:

  1. Heal itself without human intervention
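
To make the two lists above a bit more concrete, here is a minimal sketch of the difference between only diagnosing and actually healing. Everything in it is hypothetical and for illustration only: the Remedy class, the runStep method, the autoHeal flag and the "session expired" scenario are assumptions, not part of any existing framework. It covers recognizing an error, suggesting solutions and, optionally, applying them and retesting; state awareness and real problem analysis are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SelfHealingSketch {

    /** Hypothetical healing action: a description for the report plus the fix itself. */
    static class Remedy {
        final String description;
        final Runnable fix;

        Remedy(String description, Runnable fix) {
            this.description = description;
            this.fix = fix;
        }
    }

    /**
     * Runs a step and returns a report of what happened.
     * With autoHeal == false the method only diagnoses and suggests remedies;
     * with autoHeal == true it also applies each remedy and retests.
     */
    static List<String> runStep(Runnable step, List<Remedy> remedies, boolean autoHeal) {
        List<String> report = new ArrayList<>();
        try {
            step.run();
            report.add("passed");
            return report;
        } catch (RuntimeException e) {
            report.add("error recognized: " + e.getMessage());
        }
        for (Remedy remedy : remedies) {
            report.add("suggested: " + remedy.description);
            if (!autoHeal) {
                continue; // self-diagnosing only: leave the fix to a human
            }
            try {
                remedy.fix.run();
                step.run(); // retest after applying the fix
                report.add("healed by: " + remedy.description);
                return report;
            } catch (RuntimeException e) {
                report.add("still failing: " + e.getMessage());
            }
        }
        report.add("not healed, needs human attention");
        return report;
    }

    public static void main(String[] args) {
        boolean[] loggedIn = {false}; // stand-in for some piece of test environment state
        Runnable step = () -> {
            if (!loggedIn[0]) {
                throw new IllegalStateException("session expired");
            }
        };
        List<Remedy> remedies = Arrays.asList(
                new Remedy("log in again", () -> loggedIn[0] = true));
        runStep(step, remedies, true).forEach(System.out::println);
    }
}
```

Calling runStep with autoHeal set to false gives the self-diagnosing behaviour (suggestions only), while true adds the healing and retest step.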

In this post – and probably some future posts as well – I will try and see whether it is possible to design a generic approach for creating robust, self-diagnosing and self-healing test automation frameworks. I’ll try and include meaningful examples based on a relatively simple Selenium test framework wherever possible.

Self-aware tests
The most straightforward way to make any piece of software self-aware is to introduce the concept of state. A robust program, and therefore an automated test as well, should always be in a specific state when it is executing. For robustness, we will assume that the state model for an automated test is deterministic, i.e., a test can never be in more than one state, and an event that triggers a state transition should always result in the test ending up in a single new state. Let’s say we identify the following states that an automated test can be in:

  • Not running
  • Initialization
  • Running
  • Error
  • Teardown

The state model or state transition diagram could then look like this:

A first state model for our automated test

A sample implementation of this state model (also known as a finite state machine or FSM) can be created using a Java enum:

public enum State {
    NOT_RUNNING {
        @Override
        State doTransition(String input) {
            System.out.println("Going from State.NOT_RUNNING to State.INITIALIZATION");
            return INITIALIZATION;
        }
    },
    INITIALIZATION {
        @Override
        State doTransition(String input) {
            if ("error".equals(input)) {
                System.out.println("Going from State.INITIALIZATION to State.ERROR");
                return ERROR;
            } else {
                System.out.println("Going from State.INITIALIZATION to State.RUNNING");
                return RUNNING;
            }
        }
    },
    // The RUNNING and TEARDOWN states (omitted here) are implemented in the same way as the INITIALIZATION state
    ERROR {
        @Override
        State doTransition(String input) {
            if ("ok".equals(input)) {
                System.out.println("Going from State.ERROR to State.NOT_RUNNING");
                return NOT_RUNNING;
            } else {
                System.out.println("Remaining in State.ERROR");
                return this;
            }
        }
    };

    // Every state must define exactly one follow-up state for a given input
    abstract State doTransition(String input);
}
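
The listing above leaves out the RUNNING and TEARDOWN states, so it does not compile on its own. A complete version might look like the sketch below. The transitions for RUNNING and TEARDOWN are assumptions (the post only says they follow the same pattern as INITIALIZATION), and the StateMachineDemo class and the event names "start", "ok" and "error" are made up for illustration. The logging has been dropped to keep the sketch short.

```java
import java.util.Arrays;
import java.util.List;

// Drives the state machine through one run. The event names are illustrative only.
class StateMachineDemo {
    public static void main(String[] args) {
        List<String> events = Arrays.asList("start", "ok", "error", "ok");
        State state = State.NOT_RUNNING;
        for (String event : events) {
            state = state.doTransition(event);
        }
        System.out.println("Final state: " + state);
    }
}

enum State {
    NOT_RUNNING {
        @Override
        State doTransition(String input) {
            return INITIALIZATION; // any input starts the test
        }
    },
    INITIALIZATION {
        @Override
        State doTransition(String input) {
            return "error".equals(input) ? ERROR : RUNNING;
        }
    },
    RUNNING {
        // Assumption: a running test either hits an error or proceeds to teardown.
        @Override
        State doTransition(String input) {
            return "error".equals(input) ? ERROR : TEARDOWN;
        }
    },
    TEARDOWN {
        // Assumption: teardown either fails or returns the test to NOT_RUNNING.
        @Override
        State doTransition(String input) {
            return "error".equals(input) ? ERROR : NOT_RUNNING;
        }
    },
    ERROR {
        @Override
        State doTransition(String input) {
            return "ok".equals(input) ? NOT_RUNNING : this;
        }
    };

    // Deterministic by construction: each call returns exactly one next state.
    abstract State doTransition(String input);
}
```

Because doTransition returns a single State for every input, the test can never end up in more than one state at a time, which is the determinism requirement described earlier.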

In the next post, I will show how we can apply this state model to a simple Selenium WebDriver test to make it more robust. I will also demonstrate how this state model helps our tests fail gracefully and helps us determine what exactly constitutes an error (as opposed to a failed check, for example).
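
As a rough illustration of the error versus failed check distinction, here is one possible way to draw the line, without any Selenium dependency. It assumes that a failed check surfaces as an AssertionError (as it does with the usual xUnit-style assertion libraries) and that any other runtime exception counts as an error. The Outcome enum and the classify method are hypothetical names, and this is not necessarily how the series would have implemented it.

```java
class OutcomeSketch {

    /** Hypothetical result categories for a single test run. */
    enum Outcome { PASSED, FAILED_CHECK, ERROR }

    /** Runs the test body and classifies what happened. */
    static Outcome classify(Runnable testBody) {
        try {
            testBody.run();
            return Outcome.PASSED;
        } catch (AssertionError failedCheck) {
            // The test ran fine, but the application did not behave as expected.
            return Outcome.FAILED_CHECK;
        } catch (RuntimeException error) {
            // The test itself could not run to completion.
            return Outcome.ERROR;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(() -> { }));
        System.out.println(classify(() -> {
            throw new AssertionError("expected title 'Home'");
        }));
        System.out.println(classify(() -> {
            throw new IllegalStateException("browser not started");
        }));
    }
}
```

In terms of the state model, only the ERROR outcome would trigger a transition to the Error state; a failed check is a legitimate test result and can simply be reported.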

13 thoughts on “On designing a self-diagnosing and self-healing automated test framework – part 1”

  1. Hi Bas,
    This is a bit tough for me as I am a beginner, so please correct me if I got this all wrong. Here is what I understood.
    Did you mean that if the input given to the application by the framework accidentally was not right, or an error was produced in a similar situation, the framework will automatically identify it and say: this is what went wrong here, so you could try this instead?

    • Hi Sherin,

      That’s pretty much spot on. What I want to investigate (remember that this is still work in progress for me as well) is whether it is possible to make test automation frameworks:

      1. More robust (prevent failure and if that is not possible, fail gracefully)
      2. Self diagnosing (analyse the type of failure that occurred and report back to the user, together with suggested solutions)
      3. Self healing (apply the suggested solution without manual intervention and perform a retest to see whether the suggested solution has worked)

      • Hi Bas,
        I have always heard that when we use a framework to write automated scripts, it should be robust and reusable.
        By reusable, does that mean the following? Suppose we have a login test script for a particular website that takes its input from an Excel sheet, and one of the sheet’s columns contains the website URL. If we change that URL to another website that also has a login module, the script should still work. Am I right, and how should we modify our script to achieve this?

    • Hi Dattaprasad,

      Thanks. I don’t think a second episode is coming anymore though. I took this one offline as well and only republished it after requests from others, but to be honest I’m not too happy with it anymore and don’t see what a second post would look like. At least not for now…
