Test Independence

In this article I cover some of the mechanics of how we can improve test independence, while managing the tradeoffs in complexity. Before that, let's recap why we should pursue greater test independence.

Why strive for test independence?

Most of us who inherit or grow projects beyond simple prototypes eventually hit a point where tests start to fail intermittently. The failures may be sporadic, or a change in the application may have exposed a race condition. When we investigate, we notice that the setup of one test has affected another. What follows is a chain of compromises to resolve similar flakiness, all stemming from tests lacking the necessary level of independence.

Engineers should not make those compromises implicitly. Instead, understand and communicate the tradeoffs. The tools in this area are evolving at a steady pace, which is changing the balance of those tradeoffs.

Sludge

Let's identify some of these implicit compromises. Faced with the above scenario, one common lever teams pull is to disable parallelism entirely. We can safely clean up before or after, or re-initialize whatever we need before each test. We only have a few hundred tests so we go from a handful of seconds to a handful more seconds. Perfect.

A few months later some more tests start failing. Okay, we can generate unique IDs for anything we might set up. Awesome.

Some more time passes. Now we're covering more complex object graphs and the test failures creep back in. We will wrap every test in a transaction which is rolled back at the end of the test. Excellent.

As a result of the above, our tests are getting slower at a faster pace since we've dropped the parallelism. Failures therefore take longer and longer to surface. That delays lead time, the time it takes for a change to be implemented and deployed in production, while also slowing down deployment frequency. We generate unique IDs for entities, but sometimes we still need to wire some entities together predictably, or predictably assert against the absence of related entities. Unique IDs improved the situation, but they didn't isolate our tests in those common scenarios. With the rolled-back transaction, any transactions inside the code under test are now nested, which subtly changes their behaviour.
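To make the unique ID limitation concrete, here is a minimal sketch in plain C#. The in-memory list is a hypothetical stand-in for a table shared by every test in the run; it is not taken from the example project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a table shared by every test in the run.
var sharedFruitTable = new List<(Guid Id, string Name)>();

// "Test A" inserts a row under a unique ID, so it can always find its own data.
var idFromTestA = Guid.NewGuid();
sharedFruitTable.Add((idFromTestA, "apple"));
Console.WriteLine(sharedFruitTable.Any(f => f.Id == idFromTestA)); // prints True

// "Test B" wants to assert that a fresh system lists no fruit at all.
// Unique IDs cannot help here: the row from test A is still visible.
Console.WriteLine(sharedFruitTable.Count == 0); // prints False
```

Finding your own data is easy with unique IDs. Asserting that nothing else exists is where a shared database gets in the way.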

None of this is catastrophic. There may be good reasons to accept these tradeoffs at any given time. Still, this is an example of sludge. The behaviour we want to encourage in ourselves and our colleagues is to have confidence in our changes. Test at the level at which it is easiest to test. Introducing barriers to testing at any layer of the Swiss cheese naturally weakens willingness to write tests at that layer, even when conceptually it is the most valuable place for those tests.

Integration tests at these levels, covering persistence and above, are necessarily slower than tests of smaller units. That doesn't mean their structure should be neglected. They can be easy to write, easy to maintain, and we can run lots of them without needing to take a coffee or a lunch break.

How can we improve our test independence?

Isolating tests invariably introduces some level of complexity which we must therefore manage. I mentioned above that this is an evolving space. This section covers some of the tools we have today to manage this kind of complexity. Some are .NET specific, but there are typically equivalents in other languages.

AutoFixture

ID and data generation is a powerful tool to help simplify your test setup. AutoFixture is a great example of such a tool. However, introducing something like AutoFixture into your test setup can quickly form part of the web tying your tests together. If you share complex setup code between tests, then your tests are not isolated, even if that setup is re-created between each test. Keep isolation in mind when using a powerful tool like AutoFixture. I would write more about AutoFixture, but Mark Seemann has written everything there is to write on this topic.
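As a rough illustration of keeping generated data local to a test, the sketch below uses AutoFixture's Create and Build calls. The Fruit class and its properties are assumptions made up for this snippet; the real type in the example project may look different.

```csharp
using System;
using AutoFixture;

var fixture = new Fixture();

// Anonymous data: every public property gets a generated value.
var anyFruit = fixture.Create<Fruit>();
Console.WriteLine($"{anyFruit.Name} weighs {anyFruit.WeightInGrams}g");

// Pin only the value this test cares about and let the rest be generated.
var apple = fixture.Build<Fruit>()
    .With(f => f.Name, "apple")
    .Create();
Console.WriteLine(apple.Name); // prints apple

// Hypothetical type used only for this sketch.
public class Fruit
{
    public string Name { get; set; } = "";
    public int WeightInGrams { get; set; }
}
```

Because the customisation lives inside the test that needs it, no other test depends on it.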

Testcontainers

Testcontainers lets you define your test dependencies as code, so you can easily spin up whatever image you need in your tests. Examples include Redis, Postgres, or LocalStack for local instances of your AWS services.
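A minimal sketch of what "dependencies as code" can look like, assuming the Testcontainers.Redis module and a running Docker daemon. The image tag is illustrative, and the builder constructor taking an image name is assumed to match the PostgreSqlBuilder usage later in this article; older module versions configure the image with WithImage instead.

```csharp
using System;
using Testcontainers.Redis;

// Declare the dependency in code. The image tag here is an assumption for illustration.
await using var redis = new RedisBuilder("redis:7-alpine").Build();

// Pulls the image if needed and starts the container.
await redis.StartAsync();

// Hand this connection string to the code under test.
Console.WriteLine(redis.GetConnectionString());

// Leaving scope disposes the container thanks to "await using".
```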

xUnit3

In a previous article I explained how to use singletons with Testcontainers to spin up shared context like a database server once across a test assembly run. As an example of the ever-changing landscape, xUnit v3 recently introduced assembly fixtures, which add support for shared context across tests in an assembly. As I'll show below, this makes it even smoother to spin up, and safely dispose of, a database server without losing parallelism.

Putting these all together

To bring these all together, we can start by bringing up our database server. In this example I'll use Postgres, and we'll hook into xUnit's IAsyncLifetime so that it's safely started and disposed of.

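using Testcontainers.PostgreSql;
using Xunit;

// Started once per test assembly and shared by every test class.
public class PostgresAssemblyFixture : IAsyncLifetime
{
    private readonly PostgreSqlContainer _container = new PostgreSqlBuilder("postgres:17-alpine").Build();

    public async ValueTask InitializeAsync()
    {
        await _container.StartAsync();
    }

    public async ValueTask DisposeAsync()
    {
        // Disposing stops the container and removes it, rather than only stopping it.
        await _container.DisposeAsync();
    }

    // Connection string for the server's default database.
    public string GetConnectionString()
    {
        return _container.GetConnectionString();
    }
}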

We can then register our assembly fixture anywhere in the project (usually in a GlobalUsings.cs).

[assembly: AssemblyFixture(typeof(PostgresAssemblyFixture))]

We would like a clean database for each test class, so we need a fixture which is initialised for each test class. Each initialisation needs to create a fresh database and run any migrations to get an up-to-date schema. I used DbUp here as an example. Since my example covers an API, I've used WebApplicationFactory, which also lets us easily seed our API with the test database's connection string and create an HttpClient to interact with our API.

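using DbUp;
using Microsoft.AspNetCore.Mvc.Testing;
using Npgsql;
using Xunit;

public class ApiFixture : IAsyncLifetime
{
    private readonly PostgresAssemblyFixture _postgres;
    private WebApplicationFactory<Program>? _factory;

    // xUnit injects the assembly-wide Postgres fixture.
    public ApiFixture(PostgresAssemblyFixture postgres)
    {
        _postgres = postgres;
    }

    public HttpClient Client { get; private set; } = null!;

    public ValueTask InitializeAsync()
    {
        // Same server, but a uniquely named database for this test class.
        var connectionString = new NpgsqlConnectionStringBuilder(_postgres.GetConnectionString())
        {
            Database = $"api_test_{Guid.NewGuid():N}"
        }.ConnectionString;

        // Create the database, then apply the embedded migration scripts.
        EnsureDatabase.For.PostgresqlDatabase(connectionString);
        var upgrade = DeployChanges.To.PostgresqlDatabase(connectionString)
            .WithScriptsEmbeddedInAssembly(typeof(ApiFixture).Assembly)
            .Build()
            .PerformUpgrade();

        // Fail fast instead of running tests against a half-migrated schema.
        if (!upgrade.Successful)
            throw new InvalidOperationException("Database migration failed.", upgrade.Error);

        // Point the API under test at this class's database.
        _factory = new WebApplicationFactory<Program>()
            .WithWebHostBuilder(b => b.UseSetting("ConnectionStrings:DefaultConnection", connectionString));

        Client = _factory.CreateClient();

        return ValueTask.CompletedTask;
    }

    public async ValueTask DisposeAsync()
    {
        if (_factory != null)
            await _factory.DisposeAsync();
    }
}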

We are now free to write our tests. Each test class gets its own API to test, with its own database. Tests within the same class can, of course, still clash. However, you only need to manage the context of a single test class. If certain routes or features carry some extra complexity, we can confidently isolate that complexity within a relevant class and guarantee we won't affect tests in any other class.

public class FruitApiTests(ApiFixture api) : IClassFixture<ApiFixture>
{
    [Theory]
    [AutoData]
    public async Task InsertFruit_ReturnsCreated(Fruit request)
    {
        var response = await api.Client.PostAsJsonAsync("/fruit", request, TestContext.Current.CancellationToken);
        response.StatusCode.ShouldBe(HttpStatusCode.Created);
    }
}

You can find a complete example on GitHub.

WireMock shout-out

WireMock is missing from my example above, but I wanted to give it a mention, especially the WireMock Testcontainers modules. WireMock lets you easily mock external web dependencies. It can be tricky to get a perfectly performant test setup with WireMock while maintaining test independence. The tradeoff is more severe if you want to isolate your WireMock calls per class to mirror the isolated test database. There's no simple way to segregate your WireMock calls within a single WireMock server. The options here are to lean on unique IDs, to prefix requests, or to spin up a WireMock server per test class.
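Two of those options can be sketched with the in-process WireMock.Net server, which is a swap-in for the Testcontainers module mentioned above and is not what the example project uses. The fixture name, the helper method, and the /supplier/fruit route are all hypothetical.

```csharp
using System;
using WireMock.RequestBuilders;
using WireMock.ResponseBuilders;
using WireMock.Server;

// Option: one server per test class, registered through IClassFixture<WireMockClassFixture>.
public sealed class WireMockClassFixture : IDisposable
{
    // Starts an in-process server on a free port.
    public WireMockServer Server { get; } = WireMockServer.Start();

    // Base address to hand to the code under test.
    public string BaseUrl => Server.Url!;

    public void Dispose() => Server.Stop();
}

// Option: share one server and keep stubs apart with a per-class path prefix.
public static class WireMockStubs
{
    public static void StubSupplierFruit(this WireMockServer server, string prefix)
    {
        server
            .Given(Request.Create().WithPath($"/{prefix}/supplier/fruit").UsingGet())
            .RespondWith(Response.Create().WithStatusCode(200).WithBody("[]"));
    }
}
```

A server per class gives the cleanest isolation at the cost of extra startup work, while prefixing keeps a single server but relies on the code under test accepting a configurable base path.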

The agent in the room

AI agents are great at spitting out tests. Once you have a few examples in place, your model of choice will happily cover the cases you're too lazy to write out. The challenge is that tests now come into our applications at such a rate of knots that their structure becomes even more crucial, not only to avoid the problems outlined above but also to keep the tests reviewable. Good structure frees you to focus on reviewing the business logic of test cases, rather than wondering whether tweaks to a growing spaghetti soup of setup code will disrupt your pipelines.
