62

Triggered by this thread, I am (once again) thinking about finally using unit tests in my projects. A few posters there say something like "Tests are cool, if they are good tests". My question now is: what are "good" tests?

In my applications, the main part is often some kind of numerical analysis that depends on large amounts of observed data and results in a fit function that can be used to model this data. I find it especially hard to construct tests for these methods, since the number of possible inputs and results is far too large to test every case, and the methods themselves are often quite long and cannot easily be refactored without sacrificing performance. I am especially interested in "good" tests for this kind of method.

Jens

9 Answers

53

The Art of Unit Testing has the following to say about unit tests:

A unit test should have the following properties:

  • It should be automated and repeatable.
  • It should be easy to implement.
  • Once it’s written, it should remain for future use.
  • Anyone should be able to run it.
  • It should run at the push of a button.
  • It should run quickly.

and then later adds that it should be fully automated, trustworthy, readable, and maintainable.

I would strongly recommend reading this book if you haven't already.

In my opinion, all of these are very important, but the last three (trustworthy, readable, and maintainable) especially so, because if your tests have these three properties then your code usually has them as well.
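To make that concrete, here is a minimal sketch of what a test with those properties might look like in JUnit 4; the PriceCalculator class and its discount rule are made up purely for illustration:

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class PriceCalculatorTest {

    // Hypothetical code under test, inlined so the example is self-contained.
    static class PriceCalculator {
        double totalWithDiscount(double amount) {
            return amount > 100.0 ? amount * 0.9 : amount;
        }
    }

    // Automated, repeatable and fast: no I/O, no shared state, runs at the push of a button.
    // Readable: the name states the expected behaviour and the body fits on a few lines.
    @Test
    public void shouldApplyTenPercentDiscountForOrdersOverOneHundred() {
        double total = new PriceCalculator().totalWithDiscount(200.0);

        // Trustworthy: the expected value is worked out by hand, not computed by the code under test.
        assertEquals(180.0, total, 0.0001);
    }
}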

Andy Lowry
44

A good unit test doesn't mirror the function it is testing.

As a greatly simplified example, suppose you have a function that returns the average of two ints. The most comprehensive test would call the function and check that the result is in fact the average. This doesn't make any sense at all: you are mirroring (replicating) the functionality you are testing. If you made a mistake in the main function, you will make the same mistake in the test.

In other words, if you find yourself replicating the main functionality in the unit test, it's a likely sign that you are wasting your time.
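A sketch of the difference, assuming JUnit 4 and an average() method with the classic integer-overflow bug (the implementation here is invented, but the point carries over):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class AverageTest {

    // Hypothetical code under test, with the classic bug: (a + b) overflows for large ints.
    static int average(int a, int b) {
        return (a + b) / 2;
    }

    // BAD: the test mirrors the implementation, so it repeats the overflow bug and still passes.
    @Test
    public void mirroredTestStillPassesDespiteTheBug() {
        int a = 2_000_000_000, b = 2_000_000_000;
        assertEquals((a + b) / 2, average(a, b));
    }

    // BETTER: expected values worked out by hand, independently of the code under test.
    // The last assertion fails until average() is fixed, e.g. to a + (b - a) / 2.
    @Test
    public void shouldReturnAverageOfTwoInts() {
        assertEquals(3, average(2, 4));
        assertEquals(2_000_000_000, average(2_000_000_000, 2_000_000_000)); // exposes the overflow
    }
}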

mojuba
10

Good unit tests are essentially the specification in runnable form. They:

  1. describe the behavior of the code corresponding to the use cases
  2. cover technical corner cases (e.g. what happens if null is passed); if no test is present for a corner case, the behavior is undefined
  3. break if the code under test drifts away from the specification

I have found Test-Driven Development to be very well suited for library routines, as you essentially write the API first, and THEN the actual implementation.
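As a rough illustration of points 1 and 2, assuming JUnit 4 and a made-up lastNameFirst() routine:

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class NameFormatterTest {

    // Hypothetical code under test, inlined so the example compiles on its own.
    static String lastNameFirst(String first, String last) {
        if (first == null || last == null) {
            throw new IllegalArgumentException("names must not be null");
        }
        return last + ", " + first;
    }

    // 1. Describes the behaviour for a normal use case from the specification.
    @Test
    public void shouldFormatLastNameFirst() {
        assertEquals("Doe, Jane", lastNameFirst("Jane", "Doe"));
    }

    // 2. Pins down a technical corner case; without this test the null behaviour stays undefined.
    @Test(expected = IllegalArgumentException.class)
    public void shouldRejectNullFirstName() {
        lastNameFirst(null, "Doe");
    }
}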

7

for TDD, "good" tests test features that the customer wants; features do not necessarily correspond to functions, and test scenarios should not be created by the developer in a vacuum

In your case - I'm guessing - the 'feature' is that the fit function models the input data within a certain error tolerance. Since I have no idea what you're really doing, I'm making something up; hopefully it is analogous.

Example story:

As a [X-Wing Pilot] I want [no more than 0.0001% fit error] so that [the targeting computer can hit the Death Star's exhaust port when moving at full speed through a box canyon]

So you go talk to the pilots (and to the targeting computer, if sentient). First you talk about what is 'normal', then talk about the abnormal. You find out what really matters in this scenario, what is common, what is unlikely, and what is merely possible.

Let's say that normally you'll have a half-second window over seven channels of telemetry data: speed, pitch, roll, yaw, target vector, target size, and target velocity, and that these values will be constant or changing linearly. Abnormally, you may have fewer channels and/or the values may be changing rapidly. So together you come up with some tests such as:

//Scenario 1 - can you hit the side of a barn?
Given:
    all 7 channels with no dropouts for the full half-second window,
When:
    speed is zero
    and target velocity is zero
    and all other values are constant,
Then:
    the error coefficient must be zero

//Scenario 2 - can you hit a turtle?
Given:
    all 7 channels with no dropouts for the full half-second window,
When:
    speed is zero
    and target velocity is less than c
    and all other values are constant,
Then:
    the error coefficient must be less than 0.0000000001/ns

...

//Scenario 42 - death blossom
Given:
    all 7 channels with 30% dropout and a 0.05 second sampling window
When:
    speed is zero
    and position is within enemy cluster
    and all targets are stationary
Then:
    the error coefficient must be less than 0.000001/ns for each target

Now, you may have noticed that there's no scenario for the particular situation described in the story. It turns out, after talking with the customer and other stakeholders, that the goal in the original story was just a hypothetical example. The real tests came out of the ensuing discussion. This can happen. The story should be rewritten, but it doesn't have to be [since the story is just a placeholder for a conversation with the customer].
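Once a scenario is agreed on, it can be turned into an ordinary unit test. A minimal sketch of Scenario 1 in JUnit 4, with a toy stand-in for the real fit routine (the real thing would of course take all seven channels):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class TargetingFitTest {

    // Toy stand-in for the real fit: fit a constant to the samples and report
    // the largest absolute residual as the "error coefficient".
    static double errorCoefficient(double[] samples) {
        double mean = 0;
        for (double s : samples) mean += s;
        mean /= samples.length;
        double maxResidual = 0;
        for (double s : samples) maxResidual = Math.max(maxResidual, Math.abs(s - mean));
        return maxResidual;
    }

    // Scenario 1 - can you hit the side of a barn?
    // All channels present for the full window, speed and target velocity zero, everything constant.
    @Test
    public void errorCoefficientIsZeroWhenAllChannelsAreConstant() {
        double[] constantChannel = {4.2, 4.2, 4.2, 4.2, 4.2};
        assertEquals(0.0, errorCoefficient(constantChannel), 1e-12);
    }
}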

5

Create tests for corner cases, like a test set containing only the minimum number of inputs (possibly 1 or 0), plus a few standard cases. Those unit tests are not a replacement for thorough acceptance tests, nor should they be.
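For the fit-function case from the question, such corner-case tests could look roughly like this (JUnit 4, with a trivial constant fit standing in for the real routine):

import org.junit.Test;
import static org.junit.Assert.assertEquals;
import java.util.Collections;
import java.util.List;

public class FitCornerCaseTest {

    // Trivial stand-in for the real fit: the best constant fit is the mean of the data.
    static double fitConstant(List<Double> data) {
        if (data.isEmpty()) {
            throw new IllegalArgumentException("need at least one data point");
        }
        double sum = 0;
        for (double d : data) sum += d;
        return sum / data.size();
    }

    // Smallest meaningful input: a single data point.
    @Test
    public void shouldReturnThePointItselfForASingleDataPoint() {
        assertEquals(42.0, fitConstant(Collections.singletonList(42.0)), 1e-9);
    }

    // Degenerate input: no data at all.
    @Test(expected = IllegalArgumentException.class)
    public void shouldRejectAnEmptyDataSet() {
        fitConstant(Collections.<Double>emptyList());
    }
}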

user281377
5

I've seen lots of cases where people invest a tremendous amount of effort writing tests for code that is seldom entered, and not writing tests for code that is entered frequently.

Before sitting down to write any tests, you should look at some kind of call graph to make sure you plan adequate coverage.

Additionally, I don't believe in writing tests just for the sake of saying "Yeah, we test that". If I'm using a library that is dropped in and will remain unchanged, I'm not going to waste a day writing tests to make sure the innards of an API that will never change work as expected, even if certain parts of it score high on a call graph. The tests for my own code that consumes said library point this out.

4

Not quite TDD, but once you have gone into QA you can improve your tests by setting up test cases that reproduce any bugs that come up during the QA process. This can be particularly valuable when you move into longer-term support and reach a point where you risk people inadvertently reintroducing old bugs. Having a test in place to catch that is well worth it.
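For example, a regression test pinned to a QA bug report might look like this (JUnit 4; the Invoice code and the bug number are invented for illustration):

import org.junit.Test;
import static org.junit.Assert.assertEquals;
import java.util.Collections;
import java.util.List;

public class InvoiceRegressionTest {

    // Hypothetical code under test. Before the fix, an empty invoice produced NaN instead of 0.
    static double total(List<Double> lineItems) {
        double sum = 0;
        for (double item : lineItems) sum += item;
        return sum;
    }

    // Written when the bug came back from QA; it stays in the suite so the bug
    // cannot be reintroduced unnoticed during long-term support.
    @Test
    public void shouldReturnZeroForAnEmptyInvoice_bug1234() {
        assertEquals(0.0, total(Collections.<Double>emptyList()), 0.0);
    }
}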

glenatron
3

I try to have every test only test one thing. I try to give each test a name like shouldDoSomething(). I try to test behaviour, not implementation. I only test public methods.

I usually have one or a few tests for success, and then maybe a handful of tests for failure, per public method.

I use mock-ups a lot. A good mocking framework, such as PowerMock, would probably be quite helpful, although I'm not using one yet.

If class A uses another class B, I add an interface, X, so that A doesn't use B directly. Then I create a mock-up, XMockup, and use it instead of B in my tests. This really helps speed up test execution, reduces test complexity, and also reduces the number of tests I write for A, since I don't have to cope with the peculiarities of B. I can, for example, test that A calls X.someMethod() instead of testing for a side effect of calling B.someMethod().
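A minimal sketch of that arrangement (JUnit 4, hand-rolled mock-up; the names A, B, X, XMockup and someMethod() come from the description above, the bodies are invented):

import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class ATest {

    // The collaborator is hidden behind an interface...
    interface X {
        void someMethod();
    }

    // ...so A depends on X rather than on the concrete class B.
    static class A {
        private final X x;
        A(X x) { this.x = x; }
        void doWork() { x.someMethod(); }
    }

    // Hand-rolled mock-up used in place of B; it just records the interaction.
    static class XMockup implements X {
        int calls = 0;
        @Override public void someMethod() { calls++; }
    }

    @Test
    public void shouldCallSomeMethodOnX() {
        XMockup mock = new XMockup();
        new A(mock).doWork();
        // Verify the call on X directly, instead of checking a side effect of B.someMethod().
        assertEquals(1, mock.calls);
    }
}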

Keep your test code clean as well.

When using an API, such as a database layer, I mock it and enable the mock-up to throw an exception at every possible opportunity on command. I then run the test once without throwing, and then in a loop, each time throwing an exception at the next opportunity, until the test succeeds again. A bit like the memory tests available for Symbian.
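A sketch of that fault-injection loop, assuming JUnit 4; the Database interface, the Importer class, and their behaviour are all invented for the example:

import org.junit.Test;
import static org.junit.Assert.assertEquals;
import java.util.Arrays;
import java.util.List;

public class FaultInjectionTest {

    // Hypothetical database layer used by the code under test.
    interface Database {
        void save(String record);
    }

    // Mock-up that throws at the N-th call on command (0 means never throw).
    static class ThrowingDatabaseMock implements Database {
        private final int failAtCall;
        private int calls = 0;
        ThrowingDatabaseMock(int failAtCall) { this.failAtCall = failAtCall; }
        @Override public void save(String record) {
            if (++calls == failAtCall) {
                throw new RuntimeException("injected failure at call " + calls);
            }
        }
        int callsMade() { return calls; }
    }

    // Hypothetical code under test: imports records, skipping any the database rejects.
    static class Importer {
        private final Database db;
        private int imported = 0, failed = 0;
        Importer(Database db) { this.db = db; }
        void importRecords(List<String> records) {
            for (String r : records) {
                try { db.save(r); imported++; } catch (RuntimeException e) { failed++; }
            }
        }
        int importedCount() { return imported; }
        int failedCount() { return failed; }
    }

    @Test
    public void shouldCopeWithAFailureAtEveryOpportunity() {
        List<String> records = Arrays.asList("a", "b", "c");

        // First run: no injected failures, just count the opportunities to fail.
        ThrowingDatabaseMock clean = new ThrowingDatabaseMock(0);
        new Importer(clean).importRecords(records);
        int opportunities = clean.callsMade();

        // Then inject one failure per run, moving it to the next opportunity each time.
        for (int i = 1; i <= opportunities; i++) {
            Importer importer = new Importer(new ThrowingDatabaseMock(i));
            importer.importRecords(records);
            assertEquals(records.size() - 1, importer.importedCount());
            assertEquals(1, importer.failedCount());
        }
    }
}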

2

I see that Andy Lowry has already posted Roy Osherove's unit test metrics; but it seems no one has presented the (complementary) set that Uncle Bob gives in Clean Code (pp. 132-133). He uses the acronym FIRST (here with my summaries):

  • Fast (they should run quickly, so people won't mind running them)
  • Independent (tests should not do setup or teardown for one another)
  • Repeatable (should run on all environments/platforms)
  • Self-validating (fully automated; the output should be either "pass" or "fail", not a log file)
  • Timely (when to write them—just before writing the production code they test)
Kazark