126

You've found some code that looks superfluous, and the compiler doesn't notice that. What do you do to be sure (or as close to sure as you can) that deleting this code won't cause regression.

Two ideas spring to mind.

  1. "Simply" use deduction based on whether or not the code looks like it should execute. However, sometimes this can be a complex, time-consuming job, slightly risky (your assessment is prone to error) for no substantial business return.

  2. Place logging in that code section and see how often it gets entered in practice. After enough executions, you should have reasonable confidence removing the code is safe.

Are there any better ideas or something like a canonical approach?

20 Answers20

109

In my perfect fantasy world where I have 100% unit test coverage I would just delete it, run my unit tests, and when no test turns red, I commit it.

But unfortunately I have to wake up every morning and face the harsh reality where lots of code either has no unit tests or when they are there can not be trusted to really cover every possible edge case. So I would consider the risk/reward and come to the conclusion that it is simply not worth it:

  • Reward: Code is a bit easier to maintain in the future.
  • Risk: Break the code in some obscure edge case I wasn't thinking about, cause an incident when I least expect it, and be at fault for it because I had to satisfy my code quality OCD and make a change with no business value perceivable to any stakeholder who isn't a coder themselves.
Philipp
  • 23,488
86

There are two halves to this process. The first is confirming that the code is indeed dead. The second is comprehending the costs of being wrong and making sure they are properly mitigated.

Many answers here have great solutions to the former half. Tools like static analyzers are great for identifying dead code. grep can be your friend in some cases. One unusual step I often take is to try to identify what the code’s original purpose was. It’s a lot easier to argue “X is no longer a feature of our product, and code segment Y was designed to support feature X” than it is to say “I don’t see any purpose for code segment Y.”

The second half is a key step to breaking any gridlock over whether you should remove code. You need to understand what the implications are of getting the answer wrong. If people are going to die if you get the answer wrong, pay attention! Maybe it’s a good time to accept that code cruft develops over time and instead try not to write more cruft yourself. If people aren’t going to die, ask yourself how forgiving your users are. Can you send them a hotfix if you broke something and maintain your customer relations? Do you have a Q&A team that’s paid to find issues like this? These sorts of questions are essential for understanding how certain you must be before you hit the delete key.

In the comments, rmun pointed out an excellent phrasing of the concept of understanding the original purpose of the code before removing it. The quote is now known as Chesterton’s Fence. While it is too large to be quoted directly in a comment, I think it deserves to be properly quoted here:

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.

SusanW
  • 1,034
Cort Ammon
  • 11,917
  • 3
  • 26
  • 35
42

I also tend to grep for the function/class name in the code, which can give some additional benefits that a code analyzer might not, like if the name is mentioned in a comment or in a documentation file, or a script, for example. I run grep on the files in the source tree and store the result in a file; usually the result gives a condensed information: file name/path, line number, and the line where the name is encountered, which can give clues to where the function/class is called or mentioned without any semantic meaning (in contrast to a code analyzer), and regardless of the file extensions. Definitely not the ultimate solution but a nice addition to an analysis.

piwi
  • 486
29
  • Identify the code that looks dead (static analysis, etc).
  • Add a log message for every invocation of the allegedly dead code. It's easy with functions / methods; with static members like constants it's trickier. Sometimes you can just mark code as deprecated, and the runtime will log a message for you automatically. Leave the code otherwise intact.
    • Log a message when the dead module is loaded: most languages have a way to run static initialization at load time, which can be piggy-backed.
    • Make sure your log message includes a reasonable stack trace, so that you understand what called the allegedly dead code.
  • Run the altered code through all your test suites. Your test suites should also test for special times, like crossing a midnight, a turn of a quarter, a turn of a year, etc. Look at the logs, update your understanding of what's dead. Note that unit tests may specifically test the dead code, while no other unit tests and no integration tests touch it.
  • Run the altered code in production for a few weeks. Make sure every periodic process, like those once-a-month ETL cron jobs, are run during this period.
  • Look at the logs. Anything that was logged is not actually dead. The transitive closure of the call graph over the rest of the dead code is also potentially not dead, even if it was not invoked. Analyze it. Maybe some branches are safely dead (e.g. working with an API of a now-extinct service), while some are not yet. Maybe a whole module / class is only loaded for a constant defined in it, and you can easily disuse it.
  • Whatever is left is safely dead, and can be removed.
9000
  • 24,342
17

In addition to the existing answers mentioned you could also iteratively remove the code over several versions.

In the initial version you could do a deprecation warning with the code block still functioning. In the version after that you could remove the code block but leave an error message letting users the feature is deprecated and no longer available. In final version you would remove the code block and any messages.

This may help identify unforeseen functionality with no warning to end users. In the best case scenario the code really does nothing and all that happens is that the unneeded code is kept over several versions before being removed.

7

A lot of people are suggesting that the "safe" thing to do is to leave the code in if you can't prove it's unused.

But code is not an asset, it's a liability.

Unless you can explain why it's important and point to its tests, the "safer" alternative might very well be to delete it.

If you're still not sure, then at least make sure to add tests to exercise the zombie code.

6

You can use feature toggles to change the execution path of your software to completely ignore the code in question.

This way you can safely deploy your change with the code not in use, and the toggle off. If you notice some major faults related to the code turn the toggle back on and investigate the possible path to it.

This approach should give you confidence if you see no problems over prolonged period of time as well as the ability to turn it back on live w/o a deployment. However an even better way would be to also apply extra logging and test coverage around the area in question which will provide more evidence whether it is used or not.

Read more about toggles here: https://martinfowler.com/articles/feature-toggles.html

Sash
  • 165
6

Static analysis of course... and the nice thing is, you don't need any new tools; your compiler has all the static analysis tools you need.

Just change the name of the method (e.g. change DoSomething to DoSomethingX) and then run a build.

If your build succeeds, the method, obviously, isn't being used by anything. Safe to delete.

If your build fails, isolate the line of code that calls it and check to see what if statements surround the call to determine how to trigger it. If you can't find any possible data use case that triggers it, you can safely delete it.

If you are really worried about deleting it, consider keeping the code but marking it with a ObsoleteAttribute (or the equivalent in your language). Release it like that for one version, then remove the code after no problems have been found in the wild.

John Wu
  • 26,955
4
  • I would start using a code analyzer to check whether this is dead code
  • I would start checking my tests and trying to reach the code. If I am not able to reach the code within a test case it might be dead code
  • I would remove the code and throw an exception instead. For one release the exception is activated, and in the second release the exception can be removed. To be safe, place an emergency flag (in Java, a system property) to be able to activate the original code, when a customer notices the exception. So the exception can be disabled and the original code can be activated in a production environment.
3

Removing unreachable code

In a principled statically typed language, you should always know whether the code is actually reachable or not: remove it, compile, if there is no error it was not reachable.

Unfortunately, not all languages are statically typed, and not all statically typed languages are principled. Things that could go wrong include (1) reflection and (2) unprincipled overloading.

If you use a dynamic language, or a language with sufficiently powerful reflection that the piece of code under scrutiny could potentially be accessed at run-time via reflection, then you cannot rely on the compiler. Such languages include Python, Ruby or Java.

If you use a language with unprincipled overloading, then merely removing an overload could simply switch the overload resolution to another overload silently. Some such languages allow you to program a compile-time warning/error associated with the usage of the code, otherwise you cannot rely on the compiler. Such languages include Java (use @Deprecated) or C++ (use [[deprecated]] or = delete).

So, unless you are very lucky to work with strict languages (Rust comes to mind), you may really be shooting yourself in the foot by trusting the compiler. And unfortunately test suites are generally incomplete so of not much more help either.

Cue the next section...


Removing potentially unused code

More likely, the code is actually referenced, however you suspect that in practice the branches of code that reference it are never taken.

In this case, no matter the language, the code is demonstrably reachable, and only run-time instrumentation can be used.

In the past, I've successfully used a 3-phases approach to removing such code:

  1. On each branch suspected NOT to be taken, log a warning.
  2. After one cycle, throw an exception/return an error upon entering the specific piece of code.
  3. After another cycle, delete the code.

What's a cycle? It's the cycle of usage of the code. For example, for a financial application I would expect a short monthly cycle (with salaries being paid at the end of the month) and a long yearly cycle. In this case, you have to wait at least a year to verify that no warning is ever emitted for the end-of-year inventory could use code paths that are otherwise never used.

Hopefully, most applications have shorter cycles.

I advise putting a TODO comment, with a date, advising on when to move on to the next step. And a reminder in your calendar.

Matthieu M.
  • 15,214
2

Removing code from production is like cleaning house. The moment you throw away an object from the attic, your wife will kill you the next day for throwing away her great grand mother's third nephew's homecoming gift from their neighbor who died in 1923.

Seriously, after cursory analysis using various tools that everyone has already mentioned, and after using the stepped deprecation approach already mentioned, keep in mind that you may be gone from the company when the actual removal takes place. Letting the code remain and have its execution logged and ensured that any alerts of its execution are definitely communicated to you (or your successors and assigns) is essential.

If you don't follow this approach, you're likely to be killed by your wife. If you keep the code (like every keepsake) then you can rest your conscience in the fact that Moore's law comes to your rescue and the cost of junk disk space used by the code that feels like a "rock in my shoe", reduces every year and you don't risk becoming the center of multiple water cooler gossips and weird looks in the hallways.

PS: To clarify in response to a comment.. The original question is "How do you safely delete.." of course, Version Control is assumed in my answer. Just like digging in the trash to find the valuable is assumed. No body with any sense throws away code and version control should be part of every developer's rigor.

The problem is about code that may be superfluous. We can never know a piece of code is superfluous unless we can guarantee 100% of the execution paths do not reach it. And it's assumed that this is a large enough piece of software that this guarantee is next to impossible. And unless I read the question wrong, the only time this whole thread of conversation is relevant is for cases during production where the removed code might get called and hence we have a runtime/production issue. Version control does not save anyone's behind due to production failures, so the comment about "Version Control" is not relevant to the question or my original point, i.e. properly deprecate over an extremely long time-frame if you really have to, otherwise don't worry because superfluous code adds relatively a very small cost in disk space bloat.

IMHO, the comment is superfluous and is a candidate for removal.

LMSingh
  • 277
1

Maybe the compiler does notice that, it just doesn't make a fuss.

Run a build with full compiler optimizations, geared towards size. Then delete the suspect code, and run the build again.

Compare the binaries. If they're identical, then the compiler has noticed and silently deleted the code. You may delete it from source safely.

If the binaries are different... then it's inconclusive. It could be some other change. Some compilers even include date and time of compilation in the binary (and maybe it can be configured away!)

1

I've actually recently come across this exact situation, with a method called "deleteFoo". Nowhere in the entire codebase was that string found other than the method declaration, but I wisely wrote a log line at the top of the method.

PHP:

public function deleteFoo()
{
    error_log("In deleteFoo()", 3, "/path/to/log");
}

It turns out that the code was used! Some AJAX method calls "method:delete,elem=Foo" which is then concatenated and called with call_user_func_array().

When in doubt, log! If you've gone sufficient time without seeing the log filled, then consider removing the code. But even if you remove the code, leave the log in place and a comment with the date to make finding the commit in Git easier for the guy who has to maintain it!

dotancohen
  • 1,071
0

Use grep or whatever tools you have available first, you might find references to the code. (Not originally my instruction/advice.)

Then decide if you want to risk removing the code, or if my suggestion could come to use:

You could modify the function/block of code to write that it has been used to a log file. If no such message is "ever" recorded in the log file, then you can probably remove the logging code.

Two problems:

  1. Getting the log from other users. Perhaps you can set a global flag that will crash the program on exit and kindly ask the user to send you the error log.

  2. Tracing the caller. In some high level languages (eg. Python) you can easily get a traceback. But in compiled languages you'll have to save the return address to the log and force a core dump on exit (point 1).
    In C, this should be quite simple: use a conditional block (eg. #ifdef ... #endif) to only use this when you know it'll work (and have tested that it does), and simply read the return address from the stack (may or may not require inline assembly language).

Saving the arguments in the log file may or may not be useful.

Oskar Skog
  • 174
  • 2
  • 8
0

If you know what the code surrounding it does, test it to see if it breaks. This is the simplest and most cost-effective way to refactor a piece of code you're working on, and should be done anyway as a matter of developing any change to the code you're working on in the first place.

If, on the other hand, you're refactoring a piece of code where you do not know what it does, do not remove it. Understand it first, then refactor it if you determine it to be safe (and only if you can determine that the refactoring would be a proper use of your time).

Now, there might be some circumstances (I know I've run into it myself) where the code that is 'dead' is actually not connected to anything. Such as libraries of code developed by a contractor that never actually got implemented anywhere because they became obsolete immediately after the app was delivered (why yes, I do have a specific example in mind, how did you know?).

I personally did wind up removing all of this code, but it is a huge risk that I do not recommend taking on lightly - depending on how much code this change could potentially impact (because there's always the potential that you've overlooked something) you should be extraordinarily careful and run aggressive unit tests to determine if removing this ancient legacy code will break your application.

Now all of that being said, it probably isn't worth your time or sanity to try and remove code like that. Not unless it's becoming a serious issue with your application (for example, not being able to complete security scans because of application bloat...why yes I DO still have an example in mind), so I wouldn't recommend it unless you reach such a situation where the code has truly started to rot in an impactful way.

So in short - if you know what the code does, test it first. If you don't, you can probably leave it alone. If you know you can't leave it alone, dedicate your time to test the change aggressively before implementing the change.

Zibbobz
  • 1,602
0

My approach, which I always assumed to be industry standard in this case, weirdly hasn't been mentioned so far:

Get a second pair of eyes.

There are different kinds of "unused code" and in some cases deleting is appropriate, in other cases you should refactor it, and in yet other cases you run away as far as you can and never look back. You need to figure out which one it is, and to do so you best pair up with a second - experienced! - developer, one who is very familiar with the code base. This significantly reduces the risk for an unnecessary mistake, and allows you to make the necessary improvements to keep the code base in a maintainable state. And if you don't do this, at some point you run out of developers who are very familiar with the code base.

I've seen the logging approach too - to be more exact, I've seen the logging code: even more dead code which was left in there for 10 years. The logging approach is quite decent if you're talking about removing entire features, but not if you're just removing a small piece of dead code.

Peter
  • 3,778
0

The simple answer to your question is you can't safely delete code you don't understand, which includes not understanding how it gets called. There will be some risk.

You will have to judge how best to mitigate that unavoidable risk. Others have mentioned logging. Logging can of course prove that it is used, but can't prove that it isn't. Since the major problem with dead code is that it gets maintained, you might be able to get away with adding a comment saying that it is suspected dead code, and not to do any maintenace on it.

If the code is under source control, you can always find out when it was added and determine how it was called then.

Ultimately, you either understand it, leave it alone, or take a chance.

jmoreno
  • 11,238
0

In case the developer of the code in question is still available, review the code removal with him. Also, read comments in the code.

You will get the proper explanation, why this code has been added and which function does it serve for. It can easily be support for the specific use case, or you even may not have enough expertise to judge if it is really not required. In extreme case, one may remove all free statements from a C program and fail to notice anything wrong (it may take few minutes to run out of memory).

It is really a bad culture when the author of the code sits next to you, yet you see no reason to talk because you are sure he just writes bad code, and yours is all so perfect. And, of course, he just writes random words in the comments, no need to waste time reading.

One of the most important reasons to write comment in the code is to explain WHY it has been implemented this way. You may disagree with the solution, but you need to take the problem into consideration.

Discussion with the author may also reveal that the code is the historical remnant of something removed or never finished, so is safe to delete.

h22
  • 965
-1

I strongly agree with Markus Lausberg's first point: The code should definitely be analyzed with the help of tools. Ape brains are not to be trusted with this kind of task!!

Most standard programming languages can be used in IDEs and every IDE that's worth being called that has a "find for all usages"-feature which is exactly the right tool for this kind of situation. (Ctrl+Shift+G in Eclipse for example)

Of course: Static analyzer tools are not perfect. One must be aware of more complicated situations where the decision that the method in question is being called happens only at runtime and no sooner (calls via Reflection, calls to "eval"-functions in scripting languages etc).

-1

Two possibilities:

  1. You have 99%+ test coverage: Delete the code and be done with it.
  2. You don't have trustworthy test coverage: Then the answer is to add a deprecation warning. I.e., you add some code to that place that does something sane (log a warning to some easily visible place that does not conflict with day-to-day-usage of your application, for example). Then let it simmer for a few months/years. If you never ever found any occurence of it, replace the deprecation by something that throws an exception. Let that stand for a while. Finally, remove everything completely.
AnoE
  • 5,874
  • 1
  • 16
  • 17