How can I define and measure simplicity in code?

Question

Many answers to my previous question, about how simplicity relates to readability, helped me see that my definition and understanding of simplicity in code was quite possibly incorrect.

How can I define simplicity in code? What software measurements and metrics are available to measure code simplicity?

score 16 · Accepted Answer · answered Dec 06 '11 at 15:59

The most common metrics for measuring the complexity (or simplicity, if you take simplicity to be the opposite of complexity) are McCabe's Cyclomatic Complexity and the Halstead Complexity Metrics.

Cyclomatic complexity measures the number of linearly independent paths through a given unit, usually a method or function, although it can also be computed on a class. As the number of paths increases, it becomes more difficult to remember the flow of data through a given module, which is related to the concept of working memory. High cyclomatic complexity tends to indicate that a module is difficult to test, because more test cases are required to cover the various paths through it. There have also been studies that have linked high cyclomatic complexity to high defect rates. Typically, a cyclomatic complexity above 10 indicates that a unit should be reviewed and possibly refactored.
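As a rough illustration of the idea, cyclomatic complexity can be approximated as one plus the number of decision points in a unit. The sketch below uses Python's `ast` module; which node types count as a decision point is an assumption made here, and real tools differ on the details.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real tools differ on details such as try/except or comprehensions).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp, ast.ExceptHandler)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity as 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two short-circuit branches.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical sample function, used only for illustration.
SAMPLE = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(SAMPLE))  # 5: two ifs, one for, one "and", plus 1
```

Note that this counts the whole source passed in, so to score individual functions you would pass each function's source separately.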

The Halstead complexity measures use the total and distinct counts of operators and operands to compute the volume, difficulty, and effort of a piece of code. Difficulty, computed as (number of unique operators / 2) * (total number of operands / number of unique operands), is tied to the ability to read and understand the code for tasks such as learning the system or performing a code review. Again, you can count this on a system level, a class level, or a method/function level. There are a few postings about computing these measurements here and here.
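A minimal sketch of these formulas, assuming one particular way of classifying Python tokens: punctuation tokens and keywords count as operators, while names, numbers, and strings count as operands. That classification is an assumption of this example; Halstead counting rules vary between tools and languages.

```python
import io
import keyword
import math
import tokenize


def halstead(source: str) -> dict:
    """Compute Halstead volume, difficulty, and effort for Python source."""
    operators, operands = [], []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        is_keyword = tok.type == tokenize.NAME and keyword.iskeyword(tok.string)
        if tok.type == tokenize.OP or is_keyword:
            operators.append(tok.string)
        elif tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.STRING):
            operands.append(tok.string)

    n1, n2 = len(set(operators)), len(set(operands))  # distinct counts
    N1, N2 = len(operators), len(operands)            # total counts
    vocabulary = n1 + n2
    volume = (N1 + N2) * math.log2(vocabulary) if vocabulary else 0.0
    # Difficulty as given above: (unique operators / 2) * (operands / unique operands)
    difficulty = (n1 / 2) * (N2 / n2) if n2 else 0.0
    return {"volume": volume, "difficulty": difficulty, "effort": difficulty * volume}


metrics = halstead("z = x + y * x\n")
print(metrics)
```

For `z = x + y * x` there are 3 distinct operators (`=`, `+`, `*`), 3 distinct operands, and 4 operand occurrences, so the difficulty works out to (3 / 2) * (4 / 3) = 2.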

Simply counting lines of code can also give you an idea of complexity. More lines of code means that there is more to read and understand in a module. I would be hesitant to use this as a stand-alone measurement. Instead, I'd use it with other measurements, such as number of defects in a given module to obtain defect density. A high defect density could indicate problems in writing tests and performing code reviews, which may or may not be caused by complex code.
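For example, defect density is often expressed as defects per thousand lines of code. The sketch below assumes a simple line-counting rule (non-blank lines that are not pure `#` comments), and the defect count in the usage line is a made-up figure.

```python
def count_sloc(source: str) -> int:
    """Count non-blank lines that are not pure '#' comments."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def defect_density(defects: int, sloc: int) -> float:
    """Defects per thousand source lines (KLOC)."""
    if sloc <= 0:
        raise ValueError("sloc must be positive")
    return defects * 1000 / sloc


# Hypothetical module: 3 known defects in 1500 source lines.
print(defect_density(3, 1500))  # 2.0 defects per KLOC
```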

Fan-in and fan-out are two other metrics, related to the flow of data. As defined here, fan-in is the sum of the procedures called, parameters read, and global variables read, and fan-out is the sum of procedures that call a given procedure, parameters written to (exposed to outside users, passed in by reference), and global variables written to. Again, high fan-in and fan-out might indicate a module that is difficult to understand.
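The call-related parts of both sums can be approximated from a call graph. The sketch below only looks at top-level functions in a single Python module and only at direct calls by name; parameters and global variables are ignored, so it is a partial illustration rather than a full fan-in/fan-out calculation.

```python
import ast


def call_counts(source: str) -> dict:
    """Map each top-level function to (procedures it calls, procedures that call it)."""
    tree = ast.parse(source)
    functions = {n.name: n for n in tree.body if isinstance(n, ast.FunctionDef)}
    callees = {name: set() for name in functions}
    callers = {name: set() for name in functions}
    for name, node in functions.items():
        for child in ast.walk(node):
            # Only direct calls such as "load()" are recognised here.
            if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                target = child.func.id
                if target in functions:
                    callees[name].add(target)
                    callers[target].add(name)
    return {name: (len(callees[name]), len(callers[name])) for name in functions}


# Hypothetical module, used only for illustration.
SAMPLE = '''
def load():
    return [1, 2]

def clean(rows):
    return rows

def report():
    rows = clean(load())
    return len(rows)

def main():
    report()
    load()
'''

for name, (calls_made, called_by) in call_counts(SAMPLE).items():
    print(name, calls_made, called_by)
```

Here `report` calls two procedures and is called by one, while `load` calls none and is called by two.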

In specific paradigms, there might be other measures or metrics that are also useful. For example, in the object-oriented world, monitoring coupling (desire low), cohesion (desire high), and depth of inheritance (desire low) can be used to assess how simple or complicated a system is.
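Depth of inheritance is one of the easier object-oriented measures to compute. A minimal sketch, assuming the depth is the longest chain of base classes up to `object` (conventions differ on whether the root class itself is counted):

```python
def depth_of_inheritance(cls: type) -> int:
    """Longest chain of base classes from `cls` up to `object` (object = 0)."""
    if cls is object:
        return 0
    return 1 + max(depth_of_inheritance(base) for base in cls.__bases__)


# Hypothetical hierarchy, used only for illustration.
class Shape: ...
class Polygon(Shape): ...
class Square(Polygon): ...
class Serializable: ...
class SavedSquare(Square, Serializable): ...


for cls in (Shape, Square, SavedSquare):
    print(cls.__name__, depth_of_inheritance(cls))  # Shape 1, Square 3, SavedSquare 4
```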

Of course, it's important to realize that a lot of measures and metrics are simply indicators. You need to use your judgement to determine if it's necessary to refactor to increase simplicity or if it's not worth the effort to do so. You can make the measurements, compute the metrics, and learn about your code, but you don't want to design your system by the numbers. Ultimately, do what makes sense.

Dipan Mehta · Answer 2 · 2011-12-07T05:10:14.290

Instead of looking for a formal way of defining simplicity, I would rather define simplicity as an attribute of the quality of the code.

I am not proposing a measure of simplicity, but rather criteria for when you would call something simple or not.

1. Code Traversal:
How easy is it to navigate through the code? Is it easy to spot where the API functions are written? Is it easy to understand call flows, for example which methods are calling others (and why)? Are there good state machines implemented or cleanly identified algorithms?

When the code traversal is easy, the code is simple to follow.

2. Naming
While other coding standards help make code look cleaner, the most important thing is the naming of classes, object instances, variables, and methods. The use of clear and unambiguous names has a great impact on the simplicity of the code. When it is difficult to come up with a simple name, it is a sign that you might want to rethink the idea behind that variable or method.

3. Interpretation and references
Does each of your methods have a clear role to play? Is it easy to determine the role each variable or attribute is playing? A piece of code that relies on implied assumptions or affects an unrelated set of variables can become a maintenance nightmare.

4. Dependency or coupling
This is difficult to judge just by looking at the code, but becomes very evident if someone tries to fix your bugs. When something changes in some other object, does the operation here change? Are those changes obvious? Do you need to change the API often to accommodate new things? These suggest that the intermodule relationships are not simple.

5. User or application inputs
Finally, how simple are the inputs that users or applications supply through the API/UI? When multiple possible users or applications (with different purposes) need to give you input, is it obvious what they must provide? Are there states or details that are not related to the higher abstraction but still go back and forth across the interface?

A simple question I would generally ask is as follows: if, instead of a program, I had asked a human to perform the same function, would I have filled in this information on a paper form? If not, I am not being simple enough here.

I won't say this list is exhaustive, but I guess the criterion is how easy or difficult it is to use and modify the software. That is simple.

FrustratedWithFormsDesigner · Answer 3 · 2011-12-06T16:37:07.743

I am not aware of any good existing metrics for code simplicity (it doesn't mean they don't exist - just that I don't know about them). I could propose some, maybe some will help:

Simplicity of language features used: if the language has features that might be considered "advanced" and "simple", you could count the number of occurrences of the advanced features. How you define "advanced" might be a little more subjective. I suppose some might say this is also like measuring the "cleverness" of a program. A common example: some might say that the ?: operator should be an "advanced" feature, others might disagree. I don't know how easy it would be to write a tool that can test for this.
Simplicity of constructs within the program: You could measure the number of parameters a function will accept. If more than n% of all functions take more than m parameters, you could choose to count the code as not simple, depending on how you define n and m (maybe n=3 and m=6?). I think there are some static analysis tools that can measure this - I think JTest simply measured functions with > m parameters.
You could try to count the number of nested loops or control structures. This I think is actually not a bad metric and I think there's a name for it (can't recall off the top of my head). Again, I think there are tools (again, like JTest) that can measure this, to a degree.
You could try to measure "refactorability". If your code contains lots of pieces of code that could be refactored but aren't, maybe that would count as not simple. I also recall from the time I worked with JTest that it tried to measure this too, but I remember I didn't often agree with it in this case, so YMMV.
You could try to measure the number of layers between different parts of your system. For example: how many different pieces of code will touch data that comes from a web form before it gets stored in the database? This could be a tricky one to measure properly...
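
Two of the ideas in the list above, parameter counts and nesting of control structures, are straightforward to sketch with Python's `ast` module. This is only an illustration: which statements count as a nesting level is an assumption made here, and note that an `elif` chain is represented as nested `if` nodes, so it would be counted as deeper nesting.

```python
import ast

# Statements treated as one nesting level each (an assumption of this sketch).
_NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting(node: ast.AST, depth: int = 0) -> int:
    """Deepest level of nested control structures below `node`."""
    deepest = depth
    for child in ast.iter_child_nodes(node):
        child_depth = depth + 1 if isinstance(child, _NESTING_NODES) else depth
        deepest = max(deepest, max_nesting(child, child_depth))
    return deepest


def parameter_count(func: ast.FunctionDef) -> int:
    """Number of parameters, counting *args and **kwargs as one each."""
    args = func.args
    count = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
    return count + (args.vararg is not None) + (args.kwarg is not None)


def percent_over_limit(source: str, m: int) -> float:
    """Percentage of functions in `source` that take more than m parameters."""
    funcs = [n for n in ast.walk(ast.parse(source)) if isinstance(n, ast.FunctionDef)]
    if not funcs:
        return 0.0
    return 100 * sum(parameter_count(f) > m for f in funcs) / len(funcs)


# Hypothetical sample code, used only for illustration.
SAMPLE = '''
def drain(items, budget):
    for item in items:
        if item:
            while budget:
                budget -= 1

def save(a, b, c, d, e, f, g):
    return a
'''

for func in ast.parse(SAMPLE).body:
    print(func.name, parameter_count(func), max_nesting(func))
print(percent_over_limit(SAMPLE, m=6))  # 50.0: one of the two functions has 7 parameters
```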
