
We're currently working on a medium/large PHP/MySQL project. We're doing unit testing with PHPUnit & QUnit, and we have two full-time testers who manually test the application. Our test (mock) data is currently created with SQL scripts.

We have a problem with maintaining the scripts for the test data. The business logic is pretty complex, and one "simple" change in the test data often produces several bugs in the application (which are not real bugs, just the product of invalid data). This has become a big burden for the whole team because we are constantly creating and changing tables.

I don't really see the point of maintaining the test data in scripts, because everything can be added manually through the UI in about 5 minutes. Our PM disagrees and says that having a project we can't deploy with test data is bad practice.

Should we abandon maintaining the test-data scripts and just let the testers test the application without data? What's the best practice?

Christian P

5 Answers


Yes, having unit tests and data mock-ups is a best practice; the project manager is correct. The fact that a "simple" change in the test data often produces bugs is the core of the problem.

The code needs improvement. Doing nothing (saying "hey, we don't need tests") is not a fix; it simply adds technical debt. Break the code down into smaller, more testable units; being unable to isolate units without breakage is itself part of the problem.

Start refactoring. Keep the improvements small so they stay manageable. Look for anti-patterns such as God classes/methods, violations of DRY, violations of single responsibility, etc.

Finally, look into TDD to see if it works for the team. TDD works well for ensuring all your code is testable (because you write the tests first) and for keeping you lean by writing just enough code to pass the tests (minimizing over-engineering).

In general, if a series of complex business-logic processes produces a set of data, I view that as a report. Encapsulate the report, run it, and use the resulting object as input to the next test.

P.Brian.Mackey

You are mixing two different concepts. The first is verification, which is based on unit testing and peer reviews. This can be done by the developers themselves, without test data; its intent is to verify that a set of requirements is met.

The second is validation, and this is done by QA (your testers). For this step you do need test data, since the testers do not need any knowledge of the application's code, only of its intended use cases. The objective is to validate that the application behaves as intended in a production environment.

Both processes are important and necessary to deliver a quality product to the customer. You can't rely on unit tests alone. What you need to figure out is a reliable way to handle your test data to ensure it is valid.

EDIT: OK, I get what you are asking. The answer is yes, because the testers' job is not to generate the test data, just to test the application. You need to build your scripts in a way that allows easier maintenance and ensures valid data is inserted. Without the test data, the testers will have nothing to test. Having said that, if you have access to the testing environment, I don't see why you can't insert the test data manually rather than by using scripts.

AJC

This is a very common problem, and a very difficult one as well. Automated tests that run against a database (even an in-memory database such as HSQLDB) are usually slow and non-deterministic, and, since a test failure only indicates that there is a problem somewhere in your code or in your data, they are not very informative.

In my experience, the best strategy is to focus on unit tests for business logic. Try to cover as much of your core domain code as possible. If you get this part right, which is itself quite a challenge, you will achieve the best cost-benefit ratio for automated tests. As for the persistence layer, I normally invest much less effort in automated tests and leave it to dedicated manual testers.

But if you really want (or need) to automate persistence tests, I recommend reading Growing Object-Oriented Software, Guided by Tests. The book has a whole chapter dedicated to persistence tests.


The behavior of your application depends on the inputs and on the stored data -- which you might regard as invisible, implicit input to the application.

How long would it take you to script the data set-up steps that take 5 minutes manually? How fast would that script run?

I recommend scripting the data set-up, being careful to do so in an idempotent way, so that no matter what state the data starts in, the state after running the set-up script is always "the test data is as needed". You can then include a call to the data set-up in every test script.
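As an illustration of idempotent set-up in MySQL (the table, columns, and values here are hypothetical, not taken from the question), INSERT ... ON DUPLICATE KEY UPDATE lets the same seed script run any number of times and always end in the same state:

```sql
-- Hypothetical seed script: safe to run repeatedly.
START TRANSACTION;

INSERT INTO customers (id, name, status)
VALUES (1001, 'Test Customer', 'active')
ON DUPLICATE KEY UPDATE
    name   = VALUES(name),
    status = VALUES(status);

COMMIT;
```

A cruder but equally idempotent alternative is to DELETE the known test rows and re-INSERT them inside a single transaction.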

See https://docs.pytest.org/en/stable/explanation/fixtures.html for one codification of this idea in a testing library; there are many others.


Data stored in an old format is critical for migration testing. Most applications would be useless if user data were lost on each update, so testing should include scenarios that use imitations of old user data, not just data created anew.
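A sketch of what such a scenario might look like in MySQL (the schema, data, and the migration itself are all made up for illustration): build a fixture in the old format, run the migration under test, then check the migrated values:

```sql
-- Old-format fixture: phone stored as one free-text column,
-- imitating what an old release would have written.
CREATE TABLE customers_legacy (
    id    INT PRIMARY KEY,
    phone VARCHAR(40)
);
INSERT INTO customers_legacy (id, phone) VALUES (1, '+1 5550100');

-- The migration under test: split into country code and number.
ALTER TABLE customers_legacy
    ADD COLUMN phone_cc  VARCHAR(8),
    ADD COLUMN phone_num VARCHAR(32);
UPDATE customers_legacy
   SET phone_cc  = SUBSTRING_INDEX(phone, ' ', 1),
       phone_num = SUBSTRING(phone, LENGTH(SUBSTRING_INDEX(phone, ' ', 1)) + 2);

-- A tester (or a script) then verifies the migrated row.
SELECT phone_cc, phone_num FROM customers_legacy WHERE id = 1;
```

The point is that the fixture deliberately imitates data written by an old release, rather than data the current code would create.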

Basilevs