
Where I define cyclomatic complexity density as:

Cyclomatic complexity density = Cyclomatic complexity / Lines of code
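
For concreteness, here is a minimal sketch of how I'd compute this for a Python codebase using the radon library (my_module.py is a placeholder, and using SLOC rather than raw LOC as the denominator is a choice, not part of the definition):

    # pip install radon -- a sketch, assuming radon's current API
    from radon.complexity import cc_visit
    from radon.raw import analyze

    source = open("my_module.py").read()     # placeholder file name

    raw = analyze(source)                    # raw line counts: loc, sloc, comments, ...
    total_cc = sum(block.complexity for block in cc_visit(source))

    density = total_cc / raw.sloc            # cyclomatic complexity per source line
    print(f"CC = {total_cc}, SLOC = {raw.sloc}, density = {density:.3f}")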

I was reading previous discussions about cyclomatic complexity, and there seems to be a rough consensus that it has mixed usefulness, and as such there probably isn't a strong motive for using it over a simple lines of code (LOC) metric - i.e. as the size of a class or method increases beyond some threshold, the probability of defects and of poor design choices goes up.

It seems to me that cyclomatic complexity (CC) and LOC will tend to be correlated, hence the argument for using the simpler LOC metric. However, there may be outlier cases where complexity is concentrated within some region of code, i.e. where there is a higher density of execution branches in some pieces of code (compared to the average), and I'm wondering whether that will tend to be correlated with the presence of defects.
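
To make the outlier case concrete, here are two contrived Python functions of similar length; the second packs several branches into the same space, so its CC density is far higher:

    def normalise(v, lo, hi):    # CC = 1: straight-line code, no branches
        span = hi - lo
        shifted = v - lo
        scaled = shifted / span
        clipped = min(max(scaled, 0.0), 1.0)
        rounded = round(clipped, 6)
        return (rounded, f"{rounded:.6f}")

    def classify(x):             # CC = 5: four decision points plus one
        if x < 0:
            return "negative"
        if x == 0:
            return "zero"
        if x < 10:
            return "small"
        if x < 100:
            return "medium"
        return "large"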

Is there any evidence for or against this, or is there any experience with using such a complexity density metric?

Or perhaps a better approach is to have both a LOC threshold and a CC threshold, and to consider exceeding either one as bad.
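
A sketch of that two-threshold check, again leaning on radon for the per-function numbers (the thresholds are illustrative, not recommendations):

    from radon.complexity import cc_visit

    MAX_SLOC = 50    # illustrative thresholds, not recommendations
    MAX_CC = 10

    source = open("my_module.py").read()          # placeholder file name
    for block in cc_visit(source):
        lines = block.endline - block.lineno + 1  # crude per-function line count
        if lines > MAX_SLOC or block.complexity > MAX_CC:
            print(f"flag {block.name}: lines={lines}, cc={block.complexity}")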

redcalx
  • 375

2 Answers


There are no good software quality metrics - at least none are known yet. Years of research haven't provided us with one.

So the answer to whether any of your suggested metrics is a good measure of software quality is a disappointing "no."

There are some metrics that are reasonable indicators of bad software, but the absence of signs of bad software doesn't make software good. Besides, those indicators are fuzzy, with lots of exceptions, so automatically rejecting code based on them is impractical.

Researchers debate those metrics and correlate them with bugs, but usually those correlations are no stronger than "larger programs contain more bugs." As the Wikipedia entry on cyclomatic complexity admits:

The essence of this observation is that larger programs (more complex programs as defined by McCabe's metric) tend to have more defects. Although this relation is probably true, it isn't commercially useful.

As a result, those metrics are barely used in the commercial world - they save neither time nor money.

This should not stop researchers from looking for better ones, but so far, in my opinion, it's a theoretical exercise with barely any practical use.

During my 20 years of working in commercial ICT, I've encountered two metrics used in practice, and neither can be automated:

  • "Number of paying customers satisfied," also known as "How much does it earn us?"
  • "WTFs per minute" when peers are reading your code.

Which is why this image is popular - I've seen it in multiple shops: [image: the "WTFs per minute" code quality comic]

Sjoerd
  • 2,966

Consider the following:

function dispatch_message(message_id, message_contents)
    if (message_id == MESSAGE_ID_1)
        FirstMessageId(message_contents).dispatch()
    else if (message_id == MESSAGE_ID_2)
        SecondMessageId(message_contents).dispatch()
    ...
    else if (message_id == MESSAGE_ID_N)
        NthMessageId(message_contents).dispatch()
    else
        raise_or_throw_an_error
    end if
end function

This function comprises 2N+5 lines of code and has a cyclomatic complexity of N+1 or N+3, depending on how you count raise_or_throw_an_error. Suppose N is 2000. The complexity density is around 1/2. What does that mean? On the other hand, a function that has a line count of 4005 and a complexity of over 2000 means something.
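
(To spell out the arithmetic: with N = 2000, the line count is 2 × 2000 + 5 = 4005 and the complexity is roughly 2000 + 1 = 2001, so the density is about 2001 / 4005 ≈ 0.5. Notice that the density barely depends on N at all - the ladder sits near 1/2 whether it has ten branches or two thousand, which is exactly why the number tells you so little here.)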

Despite having a complexity of 2000+, my function isn't completely terrible. (Okay, it's terrible; there are more modern ways to do this, one of which is sketched below.) Surprisingly, this is a fairly common construct in very high-reliability systems, and done right, it's fairly obvious what the function is doing.
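
One of those more modern ways is a dispatch table: map each message id to its handler once, and the if/else ladder (and nearly all of its cyclomatic complexity) disappears. A sketch in Python, with the handler classes and message id constants from my example assumed to exist:

    # FirstMessageId ... NthMessageId and the MESSAGE_ID_* constants
    # are assumed to exist, as in the ladder version above.
    HANDLERS = {
        MESSAGE_ID_1: FirstMessageId,
        MESSAGE_ID_2: SecondMessageId,
        # ... one entry per message id ...
        MESSAGE_ID_N: NthMessageId,
    }

    def dispatch_message(message_id, message_contents):
        handler = HANDLERS.get(message_id)
        if handler is None:    # the one remaining branch: CC = 2
            raise ValueError(f"unknown message id: {message_id}")
        handler(message_contents).dispatch()

The table is data rather than control flow, so adding a message id no longer adds a branch; the function's complexity stays constant as N grows.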

One problem with dividing complexity by SLOC (or SLOC by complexity, either way round) is that there's a strong correlation between the two. My function is an anomaly, and all your metric will show is that my function is anomalous. At the other extreme, consider a very long auto-generated function with a cyclomatic complexity of one. Those also aren't problematic in and of themselves. (It's the generator you need to look at, not the 40,000-line function.)

The two extremes of the complexity to SLOC ratio aren't the places to look for bugs; it's somewhere in the middle, and unfortunately that's where you'll find most of your functions. Plain SLOC and complexity thresholds give false alarms, but those false alarms are worth investigating. I don't see the same applying to your complexity density metric: the buggiest functions will most likely be hidden amongst a huge number of non-alarming functions with a similar ratio.

David Hammen
  • 8,391