13

While there are ways of keeping unit tests from being executed, what is the value of checking in failing unit tests?

I will use a simple example: case sensitivity. The current code is case sensitive: a valid input to the method is "Cat", and it returns the enum value Animal.Cat. However, the desired behavior is case-insensitive matching, so passing "cat" should also return Animal.Cat; with the current code it might return something like Animal.Null instead, and a unit test asserting the desired behavior would fail. Here the fix is a one-line change, but a more complex issue might take weeks to fix, while pinning the bug down with a unit test is usually a much smaller task.
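
To make this concrete, here is a minimal sketch of the kind of test I mean (AnimalParser, its Parse method, and the Animal enum are invented for illustration; the real code is more involved):

    using NUnit.Framework;

    public enum Animal { Null, Cat, Dog }

    public static class AnimalParser
    {
        // Stand-in for the current, case-sensitive implementation.
        public static Animal Parse(string input) =>
            input switch
            {
                "Cat" => Animal.Cat,
                "Dog" => Animal.Dog,
                _ => Animal.Null
            };
    }

    [TestFixture]
    public class AnimalParserTests
    {
        [Test]
        public void Parse_ExactCase_ReturnsCat()
        {
            // Passes with the current implementation.
            Assert.AreEqual(Animal.Cat, AnimalParser.Parse("Cat"));
        }

        [Test]
        public void Parse_LowerCase_ReturnsCat()
        {
            // Documents the desired, case-insensitive behavior; fails today
            // because "cat" falls through to Animal.Null.
            Assert.AreEqual(Animal.Cat, AnimalParser.Parse("cat"));
        }
    }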

The application currently being analyzed has four years of code that "works". However, recent discussions around unit tests have uncovered flaws in that code. Some of them only need explicit documentation of the implemented behavior (e.g. case sensitive or not); others are real bugs that simply never surface with the way the code is currently called. But unit tests can be written that feed in valid inputs chosen specifically to make those bugs visible.

What is the value of checking in unit tests that exercise the bug until someone can get around to fixing the code?

Should such a test be flagged with ignore, priority, category, etc., so that build success can be judged by which tests were actually executed? Eventually, once someone fixes the code, the test should be enabled so it runs against the corrected behavior.

On one hand, it shows that identified bugs have not been fixed. On the other, there could be hundreds of failed unit tests in the logs, and weeding out the ones that are expected to fail from genuine failures caused by a new check-in would be difficult.

Jim G.
  • 8,035

8 Answers

19

I do not like broken unit tests being checked in because they produce unnecessary noise. After every test run I would have to check every red failure: is it red because of a new problem, or because of an old, still-open to-do? That does not scale once there are more than 20 unit tests.

Instead I use

  • [Ignore("Reason")] attribute that makes the result yellow or
  • throw new NotImplementedException() that makes the result grey

Note: I am using NUnit for .NET. I am not sure whether the "grey" feature exists in other unit test frameworks.
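
A minimal sketch of both options (the test names and bug number are invented; exactly how a thrown NotImplementedException is coloured depends on the runner):

    using System;
    using NUnit.Framework;

    [TestFixture]
    public class OpenWorkTests
    {
        [Test]
        [Ignore("Bug #123: lookup is case sensitive; fix is scheduled")]
        public void Lookup_ShouldIgnoreCase()
        {
            // Known bug: skipped by the runner and reported yellow/ignored
            // instead of turning the whole run red.
            Assert.Fail("Remove the Ignore attribute once bug #123 is fixed.");
        }

        [Test]
        public void Export_ShouldSupportCsv()
        {
            // Planned, low-priority feature; in the runner described above
            // this shows up grey rather than red.
            throw new NotImplementedException("Not started yet");
        }
    }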

So I use the following meanings for unit-test results:

  • green: everything is finished
  • grey: planned new features that still need to be done, but with low priority
  • yellow: known bugs that are not fixed yet; should be fixed soon
  • red: new bugs; should be fixed immediately

Everything except "red" can be checked in.

To answer the question: there is more harm than value in checking in "red" failed tests, but checking in "yellow" ignored tests or "grey" not-implemented-yet tests can be useful as a to-do list.

k3b
  • 7,621
10

I won't pretend that this is industry standard, but I check in broken tests as a way of reminding myself and the other project members that there is still a problem with either the code or the unit test itself.

I suppose one thing to consider is whether your development policies allow for failed tests without penalty. I have a friend who works at a shop that does test-driven development, so they always start out with failing tests...

Tieson T.
  • 374
7

The purpose of unit tests is to assert the expected behavior of a system, not to document defects. If we use unit tests to document defects, their usefulness for asserting expected behavior is diminished. The answer to the question "Why did this test fail?" is no longer a simple "Oh, something broke that I didn't expect to break"; it becomes unknown whether the failure is expected or unexpected.

Here is a paragraph from the beginning of chapter 13 of Working Effectively with Legacy Code:

Automated unit tests are a very important tool, but not for bug finding--not directly, at least. In general, automated tests should specify a goal that we'd like to fulfill or attempt to preserve behavior that is already there. In the natural flow of development, tests that specify become tests that preserve. You will find bugs, but usually not the first time that a test is run. You find bugs in later runs when you change behavior that you didn't expect to.

6

Failing unit tests give the development team visibility into what must be done to conform to the agreed upon specifications.

In short, failing unit tests give the team a "TODO" list.

For this reason, failing unit tests are far better than no unit tests at all.*
The absence of unit tests leaves the development team in the dark; specifications must be repeatedly confirmed manually.

[*Provided that the unit tests actually test something useful.]

Jim G.
  • 8,035
3

Put the broken tests that identify bugs into a separate project, named as such. That way you can see that they SHOULD break. As the bugs are fixed, the tests turn green and are moved into the normal test suite.

NOTE: That project would have to be excluded from the build on the build server if your build server rejects check-ins that break the build (assuming you define a broken build as one in which not all tests pass).
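
If a separate project is too heavy, a category attribute gives a similar effect; a rough sketch (test name, bug number, and the runner filter are placeholders; check your runner's documentation for the exact exclusion syntax):

    using NUnit.Framework;

    [TestFixture]
    public class KnownBugTests
    {
        [Test]
        [Category("KnownBug")]
        public void Import_ShouldRejectEmptyFiles()
        {
            // Expected to fail until the bug is fixed. The build server run
            // excludes the "KnownBug" category (e.g. something like
            // nunit3-console --where "cat != KnownBug"), while developers can
            // still run it locally to see whether the bug is gone.
            Assert.Fail("Documents open bug #456.");
        }
    }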

CaffGeek
  • 8,103
2

Unit tests should test the error cases of a function in addition to its success cases. A function should explicitly reject bad input, or its documentation should explain what input is considered valid.
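
A minimal sketch of testing the error case alongside the success case (AgeParser is invented for illustration):

    using System;
    using NUnit.Framework;

    public static class AgeParser
    {
        // Hypothetical function that rejects bad input explicitly.
        public static int Parse(string text)
        {
            if (!int.TryParse(text, out var age) || age < 0)
                throw new ArgumentException($"Not a valid age: '{text}'", nameof(text));
            return age;
        }
    }

    [TestFixture]
    public class AgeParserTests
    {
        [Test]
        public void Parse_ValidInput_ReturnsValue()
        {
            Assert.AreEqual(42, AgeParser.Parse("42"));
        }

        [Test]
        public void Parse_NegativeInput_Throws()
        {
            // The error case is part of the contract and deserves its own test.
            Assert.Throws<ArgumentException>(() => AgeParser.Parse("-1"));
        }
    }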

If you have a function that is not doing either of those things, it is a bug and you should have a way of recording that it exists. Creating a unit test that demonstrates this issue is one way to do that. Filing a bug ticket is another option.

The point of unit testing is not to have 100% success; the point is to find and fix bugs in the code. Writing no tests at all is an easy way to get 100% success, but that is not very beneficial to the project.

1

File a bug for every failure, and reference it in the test result. Then, if you ever get your act together and fix the bug, the test passes and you remove that reference from the result. Never just ignore problems.

-3

The way I see TDD applied to unfinished code is to write the tests first with an [ExpectedException] attribute or similar. These tests pass initially because the incomplete code contains no logic yet, only a throw new Exception(). Throwing an exception is not the final behavior, but it at least makes the tests pass and fit for check-in. An ignored test may be forgotten, whereas the incomplete code will definitely be tidied up and filled in eventually. When that happens, the test that was expecting an exception automatically starts failing and alarms you to fix it, which usually means a slight alteration of the test to get rid of ExpectedException and do real asserts instead. CI, devs, testers and customers are all happy: a win-win situation?
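
A minimal sketch of that pattern (DiscountCalculator is invented; I use Assert.Throws here, which is the NUnit 3 replacement for the older [ExpectedException] attribute):

    using System;
    using NUnit.Framework;

    public static class DiscountCalculator
    {
        // Unfinished code checked in as a stub: for now it only throws.
        public static decimal Apply(decimal price) =>
            throw new NotImplementedException();
    }

    [TestFixture]
    public class DiscountCalculatorTests
    {
        [Test]
        public void Apply_IsNotImplementedYet()
        {
            // Passes today because the stub throws. Once real logic is written,
            // this test starts failing and reminds us to replace it with real
            // asserts on the computed discount.
            Assert.Throws<NotImplementedException>(() => DiscountCalculator.Apply(100m));
        }
    }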