
In my case I inherited a poorly engineered codebase, and on that code I have been tasked with increasing the integration-test coverage. The usual pattern would be (a code sketch follows the list):

  1. Create and populate a test database with specific test data
  2. Run the test
  3. Delete the test data or nuke the DB
  4. Repeat from step 1 until there are no more tests
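
A minimal sketch of that pattern with PHPUnit in a Laravel project; the orders table and the class names here are illustrative assumptions, not taken from any real project:

    <?php

    use Illuminate\Support\Facades\DB;
    use Tests\TestCase;

    class OrderQueryTest extends TestCase
    {
        protected function setUp(): void
        {
            parent::setUp();
            // 1. Populate the test database with specific test data.
            DB::table('orders')->insert(['id' => 1, 'status' => 'pending']);
        }

        public function testPendingOrdersAreCounted(): void
        {
            // 2. Run the test against the known data.
            $this->assertSame(1, DB::table('orders')->where('status', 'pending')->count());
        }

        protected function tearDown(): void
        {
            // 3. Delete the test data so the next test starts clean.
            DB::table('orders')->truncate();
            parent::tearDown();
        }
    }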

The project itself is a Laravel API into which logic originally implemented in CodeIgniter has been poorly migrated (many times I came across business logic sitting in the MVC controllers). Also, no migration tests of any sort have been implemented.

I am the extra member of a two-person team, and I noticed that the existing tests rely on pre-existing data; due to the lack of migration scripts, the workflow above is not followed.

As a result, there is no consistency in the results of the existing database integration tests, and some tests either pass or fail depending on the test execution sequence.

Also, the database has been left as-is, with the very same schema that the CodeIgniter code uses; since the code has not been fully migrated from CodeIgniter to Laravel, I inherited quite a mess. Not to mention that the Laravel migration scripts do not fully cover the whole database.

So I wonder:

  • What's the point of having integration tests if we don't have the right tooling (e.g. creating a test DB on the fly)?
  • Should I spend time building a way to create a test database on the fly from an existing schema snapshot, and refactor all the tests to use it?
  • Should I gradually make small-scale redesigns (without telling anyone) while I implement the tests, and if so, what procedure should I follow?
Dimitrios Desyllas

2 Answers


Tests should be independent of each other and reproducible.

This can be done with:

  • a complete database setup for every test, as you described,
  • or predefined database content where a database commit is not allowed and all changes are rolled back at the end, so the database is never changed (see the sketch after this list),
  • or a setup where the repository/database implementation is replaced by mocks/fakes.
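
For the rollback option, Laravel ships the Illuminate\Foundation\Testing\DatabaseTransactions trait, which wraps each test in a transaction and rolls it back afterwards. A minimal sketch, assuming a hypothetical users table:

    <?php

    use Illuminate\Foundation\Testing\DatabaseTransactions;
    use Illuminate\Support\Facades\DB;
    use Tests\TestCase;

    class UserQueryTest extends TestCase
    {
        // Each test runs inside a transaction that is rolled back in
        // tearDown, so the predefined database content never changes.
        use DatabaseTransactions;

        public function testInsertedRowIsOnlyVisibleInsideTheTest(): void
        {
            DB::table('users')->insert(['name' => 'temp']); // rolled back afterwards
            $this->assertSame(1, DB::table('users')->where('name', 'temp')->count());
        }
    }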

If you want to use "predefined database-content" you should have a test-database-setup script so the database can be easily set up and loaded on a developer-database engine.
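
One way such a setup step could look, assuming the schema and seed data have been exported to a SQL snapshot at tests/fixtures/snapshot.sql (a hypothetical path):

    <?php

    use Illuminate\Support\Facades\DB;
    use Tests\TestCase;

    abstract class SnapshotTestCase extends TestCase
    {
        protected function setUp(): void
        {
            parent::setUp();
            // Load the exported schema and seed data into the test
            // database; DB::unprepared can run a multi-statement dump.
            DB::unprepared(file_get_contents(base_path('tests/fixtures/snapshot.sql')));
        }
    }

In practice you would probably load the snapshot once per suite rather than before every test, since reloading it every time is slow.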

k3b

In addition to k3b's answer.

If you want to use "predefined database-content" you should have a test-database-setup script so the database can be easily set up and loaded on a developer-database engine.

It's not that simple. You cannot simply rely on an existing DB that could come and go, or change at any time for unknown reasons and at the hands of different actors [1].

No, you should deploy your own DB. Whether once per test or once per the whole suite is up to you. This is important if you want your tests to be deterministic and to behave as closely as possible to what you expect in production. It's important for CI too: the builds might run in dedicated environments that don't have access to the shared DB. Think of running builds in the cloud. In my opinion, these tests should behave like unit tests in this respect: they run at any time, in any environment.
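
A sketch of the once-per-whole-suite variant, assuming the Laravel migrations can rebuild the schema (the question notes they currently cannot, in which case loading a snapshot as in k3b's answer is the alternative). The static flag is a common workaround because the Laravel application is not yet booted in setUpBeforeClass():

    <?php

    use Illuminate\Support\Facades\Artisan;
    use Tests\TestCase;

    abstract class FreshDatabaseTestCase extends TestCase
    {
        private static bool $schemaDeployed = false;

        protected function setUp(): void
        {
            parent::setUp();
            if (!self::$schemaDeployed) {
                // Rebuild the schema once for the whole suite; tests that
                // mutate data should still clean up or roll back after themselves.
                Artisan::call('migrate:fresh');
                self::$schemaDeployed = true;
            }
        }
    }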

You should consider matching the DB engine vendor and version too, for more reliable outputs [2].

However, there's the problem of timing. Deploying DBs slows down test execution, and time is a limited and valuable resource you don't want to waste. Consequently, you have to balance integration tests against other kinds of tests.

So, when would I run an on-the-fly DB instead of mocks or stubs?

  • When CI and CD timings are not constraining or critical.
  • When I need accuracy and precision in configurations and set-ups, overall system behaviour, approximate performance, etc.
  • When there are several teams (or devs) working on the same application, feature, task, etc.
  • When running tests in distributed environments.
  • When I want to keep test code complexity at bay [3].
  • When the benefits offset the costs by a large margin.

How to shift from one approach to another depends on the context, the team, and the resources at hand. I would not start small-scale redesigns without letting others know; the change of paradigm is important enough that the others should know about it and embrace the idea as soon as possible.


[1]: Say there are more teams running tests on the same DB, or tests running for a different version of the same application.

[2]: You might think that deploying a fake or lightweight DB could do the job, but in my experience they never behave exactly like the product they stand in for. In some cases I got unexpected and unpleasant behaviours in production that I could not detect during tests.

[3]: Introducing mocks increases the complexity of the test code. It's also a possible source of bugs, because we usually never test the test code. We could accidentally give the mock the wrong behaviour, which is why mocking 3rd-party components is not advisable.

Laiv