7 Common Mistakes To Avoid When Writing Tests

I was talking with a fellow programmer the other day about a poor test that we were reviewing and we got onto the subject of what makes a poor test. The test in question relied on a previous test having been run, and the problem we encountered was that on some systems the test it depended on ran afterwards, which caused it to fail.

This also caused some headaches in local development as it couldn't be run in isolation. We had to ensure that both tests were run, in the correct order.

After fixing the tests so that they could be run independently I created a list of some common problems that programmers might come across when writing tests. These rules can be applied to most kinds of coding tests, not just unit tests or behavioural tests.

Remember as you read this that I have fallen into each and every one of these problems, so rather than feel disheartened you should just look at your tests and see if you can correct any of these problems.

1. Tests that rely on other tests

Tests should be atomic and able to run in isolation from other tests.

If your test relies on a precondition in order to pass correctly (which is perhaps a debatable thing itself) then you can use pre-test setup procedures in order to create that precondition. A really bad practice is using the conditions created in one test to follow through to the next test. This creates a difficult situation where your tests are then interdependent and must run in a certain order to pass.
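As a minimal sketch of that idea, the example below gives each test its own precondition through a shared setup helper. The Basket class and the function names are made up for illustration, and plain assert() calls are used so that the snippet runs without a test framework. In PHPUnit the helper's work would normally live in setUp(), which runs before every test method.

```php
<?php
// Hypothetical class under test: a basket that holds item names.
final class Basket
{
    /** @var string[] */
    private array $items = [];

    public function add(string $item): void
    {
        $this->items[] = $item;
    }

    public function remove(string $item): void
    {
        $this->items = array_values(array_diff($this->items, [$item]));
    }

    public function count(): int
    {
        return count($this->items);
    }
}

// Shared setup: every test calls this to get its own fresh precondition.
function createBasketWithOneItem(): Basket
{
    $basket = new Basket();
    $basket->add('apple');
    return $basket;
}

function testItemCanBeAdded(): void
{
    $basket = createBasketWithOneItem();
    $basket->add('pear');
    assert($basket->count() === 2);
}

function testItemCanBeRemoved(): void
{
    // Does not rely on testItemCanBeAdded() having run first.
    $basket = createBasketWithOneItem();
    $basket->remove('apple');
    assert($basket->count() === 0);
}

// Deliberately called in "reverse" order to show that order does not matter.
testItemCanBeRemoved();
testItemCanBeAdded();
```

Because neither test reads state written by the other, you can run either one alone or swap the two calls at the bottom.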

Running tests becomes that much more difficult when tests rely on other tests. When writing code a developer will normally run the same test over and over again as they build up their code. This, after all, is the core of test driven development. A single test should take just a few seconds to run and will give the developer immediate results as to the validity of their code. They can then write code to pass the test.

The full test suite would normally only be run when the developer is happy that their test works and wants to ensure that nothing else on the system is broken as a result of their changes.

If you have dependent tests then the developer's only option is to run the entire test suite every time they change anything. This is a problem when some test suites can take upwards of an hour to run. It forces developers to break out of test driven development and write tests without actually running them.

2. Tests that leave behind mess

A test should always leave the environment in the same state that it started. It is perfectly acceptable for a test to create pages or users, but the test framework should remove those items afterwards. This applies to each individual test, not just to the test suite as a whole.

The problem with tests leaving behind mess is that you essentially change the nature of the environment you are testing. That means that if your test passes then it might be because it depends on something left behind by another test, which is sometimes difficult to unpick.

There are many strategies for cleaning up after tests, and which one you pick largely depends on where and when you run the tests.

If you run your tests on a continuous integration environment then you have the option of creating an isolated test environment to run your tests. Before each test is run you can ensure that the environment is in a known state and then throw that state away once the test completes.

Running tests against a staging environment is quite normal to do, but the problem comes when things are created as the tests progress. This can create a staging environment that is full of test data and therefore becomes messy when trying to compare that environment with production.

Many testing frameworks will contain pre and post testing steps that you can use to create and then destroy objects so that the environment is the same as it was at the start. You can either opt to use a test database or keep track of the items you create so that you can destroy them afterwards.
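The "keep track of what you create" approach might look something like the sketch below. It uses temporary files as a stand-in for pages or users, and the TemporaryFiles class is hypothetical. In PHPUnit the cleanUp() call would usually go in tearDown() rather than in a finally block.

```php
<?php
// Hypothetical helper that records every file a test creates so that the
// post-test step can delete them again.
final class TemporaryFiles
{
    /** @var string[] */
    private array $created = [];

    public function create(string $contents): string
    {
        $path = tempnam(sys_get_temp_dir(), 'test_');
        file_put_contents($path, $contents);
        $this->created[] = $path;
        return $path;
    }

    public function cleanUp(): void
    {
        foreach ($this->created as $path) {
            if (file_exists($path)) {
                unlink($path);
            }
        }
        $this->created = [];
    }
}

function testReportIsWritten(): void
{
    $files = new TemporaryFiles(); // pre-test step
    try {
        $path = $files->create('report');
        assert(file_get_contents($path) === 'report');
    } finally {
        $files->cleanUp(); // post-test step, runs even if the assertion fails
    }
}

testReportIsWritten();
```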

The ultimate question to ask is if you are prepared to run your tests on a production environment. If the answer is no then maybe you should think about a different strategy for your tests.

3. Fragile tests

A fragile test is one that fails if anything in your code base changes in a way that shouldn't invalidate the test.

Fragile tests are difficult to spot, but you'll see them if anything you change in your code keeps breaking the same test over and over again. If this happens then it's either due to a tight dependency on the system structure or your test is too wide reaching and is testing lots of different things.

A good example of this is when running behavioural testing frameworks like Behat or Nightwatch. These frameworks simulate a user interacting with the site. When doing this you'll often want to click on a button or enter some text into a field and you can easily create fragile tests by using the full path to those elements.

Let's say you have a button on a page that the user needs to click on. You could reference that button using the following path.

body > div > div > .main-content > div > div > form > div .button-wrapper button

This might seem silly, but this sort of thing is easily created using browser developer tools where you select an element from the interface and copy the path to the element. I have seen this being done in more than one project.

The issue here is that if any part of the page changes in any way then the test will fail as the system will be unable to find the button. This creates a situation where the test is reliant on the structure of the page to work correctly.

A better solution is to reference the button itself, either using an ID or adding a data attribute to the button. This means that if the rest of the page changes it doesn't break the test since it is still able to spot the button.

Best practice here is to use data attributes as you can prefix them with something to make them stand out as being used for tests. Here we see a submit button that is tagged as a test item.

<input type="submit" value="Save" name="save" data-test="save" />

Using this markup we can create a much simpler selector in the test.

[data-test="save"]

This does mean adding some attributes to your code to ensure testing works, but it creates more resilient tests.
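To illustrate why this is more resilient, here is a rough sketch that uses PHP's built-in DOM extension rather than a browser testing framework. It looks the button up with the XPath equivalent of the [data-test="save"] selector, against the page markup before and after an imagined redesign. The markup and the countTestElements() helper are assumptions made for the example.

```php
<?php
// Returns how many elements in the markup carry the given data-test value.
function countTestElements(string $html, string $name): int
{
    $document = new DOMDocument();
    $document->loadHTML($html);
    $xpath = new DOMXPath($document);
    return $xpath->query('//*[@data-test="' . $name . '"]')->length;
}

// The button buried inside several layers of wrapper elements.
$original = '<div><div class="main-content"><form><div class="button-wrapper">'
    . '<input type="submit" value="Save" name="save" data-test="save" />'
    . '</div></form></div></div>';

// The same button after a redesign that removed most of the wrappers.
$redesigned = '<form>'
    . '<input type="submit" value="Save" name="save" data-test="save" />'
    . '</form>';

// The lookup finds the button in both versions of the page.
assert(countTestElements($original, 'save') === 1);
assert(countTestElements($redesigned, 'save') === 1);
```

A path-based selector would have needed rewriting after the redesign, whereas the data attribute lookup is unaffected.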

4. Testing the language and not your code

This one is subtle, but you can often fall into the trap of testing the language itself, rather than testing your code.

If you find yourself testing the outcome of arithmetic, how language constructs operate, or even basic data structures like arrays, then you'll have fallen into this trap. You should be sure that what you are testing is the code you have written and not the underlying language.

A good example of testing the language is a test that only checks that instantiating a class gives you an object of that class.

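use PHPUnit\Framework\TestCase;

class TestColour extends TestCase {
    public function testCreateColour()
    {
        // This only proves that PHP can instantiate a class.
        $color = new Color();
        $this->assertInstanceOf(Color::class, $color);
    }
}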

Unless your object is created through some discovery mechanism or factory then there is little point in testing to see if a class creates an object of a type. What is being tested here is the language's ability to create an object, and that should be pretty fundamental to any object-oriented language.

As I said, this can be subtle and you can sometimes find yourself falling into this without realising. Take this class for example. It contains a publicly available array that we have initialised as empty.

class Data {
    public $array = [];
}

We might write some code to ensure that the array is initialised as empty and that if we add data to it the array length will increment.

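use PHPUnit\Framework\TestCase;

class TestData extends TestCase {
    public function testAddItemToData()
    {
        $data = new Data();
        // Both assertions only exercise PHP's own array handling.
        $this->assertEquals(0, count($data->array));
        $data->array[] = 1;
        $this->assertEquals(1, count($data->array));
    }
}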

This passes, but we haven't actually tested anything useful here. All we are doing is testing that the underlying language can handle arrays.

Of course, if we add a method to add items to the array then things become slightly different. We might add an internal length counter to the class or ensure that no array items of the wrong type are added to the array. This is fine to add tests for as it requires that business logic is written to handle those situations.
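By contrast, a test along the following lines exercises a rule that we wrote ourselves. This is only a sketch: the addItem() and length() methods and the integers-only rule are assumptions standing in for whatever business logic your class has, and plain assert() is used so that it runs without a test framework.

```php
<?php
final class Data
{
    /** @var int[] */
    private array $items = [];

    /**
     * Business rule (assumed for this sketch): only integers may be stored.
     *
     * @param mixed $item
     */
    public function addItem($item): void
    {
        if (!is_int($item)) {
            throw new InvalidArgumentException('Only integers can be added.');
        }
        $this->items[] = $item;
    }

    public function length(): int
    {
        return count($this->items);
    }
}

function testWrongTypeIsRejected(): void
{
    $data = new Data();
    $data->addItem(1);

    $rejected = false;
    try {
        $data->addItem('two');
    } catch (InvalidArgumentException $e) {
        $rejected = true;
    }

    // The rejected item must not have been stored.
    assert($rejected);
    assert($data->length() === 1);
}

testWrongTypeIsRejected();
```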

Key to this is to remember that your tests are meant to test your business and application logic. Tests should never make sure that "1 + 1 = 2" since that is well established as part of the language you are using to test with. If you find yourself testing that then take a step back and look at what you are really testing.

Remember that this also applies to writing tests for features that are part of a third-party system. For example, if you have a CMS that prevents users from registering with usernames that already exist then there is no need to write a test for this. This is especially the case when the CMS already has a test for this feature. You only want to write tests for this if you made some change that alters how that section works.

5. Mocking everything

Mocking is a brilliant way of allowing tests to run when some of their dependencies don't (or can't) exist during the life cycle of the test.

You should certainly use mocking when you are faced with things like dependency injection problems where you can't possibly create all of the dependencies within the test. Mocking allows you to abstract away some of the underlying application in order to ensure that your business logic works with the code you have written.

Mocking is also useful when testing API integrations as it allows you to create mock payloads and responses in order to test that the code that integrates against an API can work in different situations.

Problems start to arise when you rely on mocking too much and almost everything in your test is mocked in some way. This is especially the case in content management systems with many components being used across the site.

A page in a content management system, for example, will have objects attached to it in order to handle things like fields, taxonomy terms, configuration items, users, and more. If you write lots of code that mocks everything on the page only to run a single line of custom code then you should probably ask yourself what you are actually testing. Your custom code will need a test, that is true, but if the entire environment is mocked then your test probably isn't testing anything.

Ultimately, you need to ask yourself what you are testing if everything in your test is mocked.

6. Randomly failing tests

Tests that pass every single time show that your code base is working as expected.

Having a test that randomly fails erodes trust in your tests and makes you doubt the outcome.

Randomly failing tests cause the opposite problem as well. If some of your tests fail at random, then a run where everything passes tempts you to assume that everything works, when in reality the passing run might just be masking a deeper problem with your codebase.

Time and date based tests are a classic example of tests that can fail randomly if not written correctly. Take the following test as an example, which tests that a given date is in the future.

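use PHPUnit\Framework\TestCase;

class TestDate extends TestCase {
    public function testDateIsInFuture() {
        $date = new Date();
        $date->date = new DateTime('2022-06-01');
        // Compares a hard-coded date against the current time.
        $this->assertTrue($date->date > new DateTime());
    }
}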

This works for now, as the test passes because the date being tested is in the future, but it is a failing test waiting to happen. Once the 1st of June 2022 has passed, this test will start failing because the hard coded date will no longer be in the future.

Again, this might seem like a strange test, but I have seen this sort of test in projects before. The developer writing this would have had the best of intentions in mind. What they should have done is to fix both dates here. This ensures that the comparison is between two set points in time.
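One way to fix both dates is to pass the current time into the code under test instead of letting it read the system clock. The isInFuture() function below is a hypothetical stand-in for the logic being tested, and the sketch uses plain assert() so that it runs without a test framework.

```php
<?php
// Hypothetical function under test. "Now" is passed in rather than read
// from the system clock, so a test can pin it to a known value.
function isInFuture(DateTimeInterface $date, DateTimeInterface $now): bool
{
    return $date > $now;
}

function testDateIsInFuture(): void
{
    // Both points in time are fixed, so the result never changes.
    $now = new DateTimeImmutable('2022-01-01');

    assert(isInFuture(new DateTimeImmutable('2022-06-01'), $now));
    assert(!isInFuture(new DateTimeImmutable('2021-06-01'), $now));
}

testDateIsInFuture();
```

In production code the caller would pass in the real current time, while the test keeps full control over it.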

The use of random functions in your code is another thing that is difficult to test correctly. You might write a function that returns a random string, but your unit tests need to check properties of the result that are the same on every run. Making sure that the string is of a fixed length, or has the right number of upper and lowercase characters, are good test candidates, as long as you have coded your function to create the string with those parameters.
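A sketch of that approach is shown here. The randomString() function is invented for the example and only promises a given length made up of ASCII letters, so those are the only things the test checks.

```php
<?php
// Hypothetical function under test: builds a random string of letters.
function randomString(int $length): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
    $result = '';
    for ($i = 0; $i < $length; $i++) {
        $result .= $alphabet[random_int(0, strlen($alphabet) - 1)];
    }
    return $result;
}

function testRandomStringHasExpectedShape(): void
{
    // Repeat the check, since a single draw could pass by luck.
    for ($run = 0; $run < 100; $run++) {
        $value = randomString(12);
        assert(strlen($value) === 12);
        assert(preg_match('/^[a-zA-Z]+$/', $value) === 1);
    }
}

testRandomStringHasExpectedShape();
```

Note that the test never compares the result against a specific string, as that comparison would fail on almost every run.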

Behavioural tests on a complex system can be prone to random failures if not done correctly. The most common problem I have seen here is not giving the system enough time to respond to a request, which can cause a timeout error and a failure to occur.

Giving the server enough time to respond to requests can appear to be a solution here, but be careful. This failure might be masking something else happening in the system that is causing the test to fail.

I have seen an example in the past where a test registered a user and performed a task as that user. Quite a common occurrence for user driven websites. Everything was fine until the test started to fail randomly, with the user being unable to register for some reason.

It turned out that a stronger password requirement was added to the configuration and the registration test was sometimes failing to meet these requirements. This meant that the test never got past that point and so would fail to perform the task. The solution to this was to make sure that the test created a password that met the requirements and could register the user.

No matter the cause of randomly failing tests, it is critical you get to the bottom of it and ensure that the test passes every time in a predictable way.

7. Testing APIs by calling the API

Integrating your system against an external API is a very common task, if not a main requirement of most projects.

The problem comes when you write tests for your system that use the API in some way. Not only does this create a reliance on the state of the API, but it also means leaking test data into your API.

This not only creates a mess but also means that your tests can start to randomly fail due to data in your API.

For example, you might have a system that pushes user data to an API as the user registers on a site. Your registration code will be looking out for a good response from the API to ensure that the user has been correctly received.

What happens, then, if the API fails to respond in time? Or if you attempt to perform a new registration test on a user that happens to exist in the API? Your tests are going to randomly fail and you'll need to start digging into what happened.

Spending valuable developer time debugging complex API tests like this is not a good use of the project budget, but there is a solution. You should be mocking your API system.

Mocking is a way of replacing the part of your code that talks to the API with a dummy object, so that you can always be sure that the same response comes back for a given request. This means that no requests are ever made outside of the system when you call the API in your tests. A predictable input/output mechanism means that your tests are more predictable and won't randomly fail due to factors beyond your control.

The only downside to this is that if your API changes then you need to update the mocking objects and then ensure that the tests can handle the new structure. You would need to alter tests if the API changes anyway, but with mocking you would need to change the tests and the mocking code. As a result you should probably be abstracting the responses away from your test code (into flat text files for example) so that updating them isn't difficult.
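As a rough sketch of the idea, the code below hides the API behind an interface and swaps in a hand-written fake during tests. Everything here (UserApi, FakeUserApi, Registration and the JSON payloads) is hypothetical, and a mocking library such as the one built into PHPUnit could generate the fake for you instead. The canned JSON strings are inline to keep the example short, but they could just as easily be read from fixture files.

```php
<?php
// Hypothetical boundary between the site and the external API.
interface UserApi
{
    /** @return array Decoded JSON response from the API. */
    public function createUser(string $email): array;
}

// Test double: returns a canned response and never touches the network.
final class FakeUserApi implements UserApi
{
    private string $cannedJson;

    /** @var string[] Emails passed to createUser(), for later inspection. */
    public array $received = [];

    public function __construct(string $cannedJson)
    {
        $this->cannedJson = $cannedJson;
    }

    public function createUser(string $email): array
    {
        $this->received[] = $email;
        return json_decode($this->cannedJson, true);
    }
}

// Hypothetical registration code under test.
final class Registration
{
    private UserApi $api;

    public function __construct(UserApi $api)
    {
        $this->api = $api;
    }

    public function register(string $email): bool
    {
        $response = $this->api->createUser($email);
        return ($response['status'] ?? '') === 'created';
    }
}

function testRegistrationHandlesApiResponses(): void
{
    // The API accepted the new user.
    $success = new Registration(new FakeUserApi('{"status": "created"}'));
    assert($success->register('a@example.com') === true);

    // The user already exists in the API.
    $duplicate = new Registration(new FakeUserApi('{"status": "exists"}'));
    assert($duplicate->register('a@example.com') === false);
}

testRegistrationHandlesApiResponses();
```

Both situations are now reproducible on every run, including the "user already exists" case that would otherwise depend on whatever data happened to be in the API.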

Ultimately, you should never rely on an API being present or in a predictable state when running tests.

What other mistakes have you seen developers make when it comes to writing tests? Comment below and let us know.
