153

In object-oriented programming, there is of course no exact rule on the maximum length of a method , but I still found these two quotes somewhat contradicting each other, so I would like to hear what you think.

In Clean Code: A Handbook of Agile Software Craftsmanship, Robert Martin says:

The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that. Functions should not be 100 lines long. Functions should hardly ever be 20 lines long.

and he gives an example from Java code he sees from Kent Beck:

Every function in his program was just two, or three, or four lines long. Each was transparently obvious. Each told a story. And each led you to the next in a compelling order. That’s how short your functions should be!

This sounds great, but on the other hand, in Code Complete, Steve McConnell says something very different:

The routine should be allowed to grow organically up to 100-200 lines, decades of evidence say that routines of such length no more error prone then shorter routines.

And he gives a reference to a study that says routines 65 lines or long are cheaper to develop.

So while there are diverging opinions about the matter, is there a functional best-practice for you?

Spring
  • 1,763

14 Answers14

142

Functions should normally be short, between 5-15 lines is my personal "rule of thumb" when coding in Java or C#. This is a good size for several reasons:

  • It fits easily on your screen without scrolling
  • It's about the conceptual size that you can hold in your head
  • It's meaningful enough to require a function in its own right (as a standalone, meaningful chunk of logic)
  • A function smaller than 5 lines is a hint that you are perhaps breaking the code up too much (which makes it harder to read / understand if you need to navigate between functions). Either that or your're forgetting your special cases / error handling!

But I don't think it is helpful to set an absolute rule, as there will always be valid exceptions / reasons to diverge from the rule:

  • A one-line accessor function that performs a type cast is clearly acceptable in some situations.
  • There are some very short but useful functions (e.g. swap as mentioned by user unknown) that clearly need less than 5 lines. Not a big deal, a few 3 line functions don't do any harm to your code base.
  • A 100-line function that is a single large switch statement might be acceptable if it is extremely clear what is being done. This code can be conceptually very simple even if it requires a lot of lines to describe the different cases. Sometimes it is suggested that this should be refactored into separate classes and implemented using inheritance / polymorphism but IMHO this is taking OOP too far - I'd rather only have one big 40-way switch statement than 40 new classes to deal with, in addition to a 40-way switch statement to create them.
  • A complex function might have a lot of state variables that would get very messy if passed between different functions as parameters. In this case you could reasonably make an argument that the code is simpler and easier to follow if you keep everything in a single large function (although as Mark rightly points out this could also be a candidate for turning into a class to encapsulate both the logic and state).
  • Sometimes smaller or larger functions have performance advantages (perhaps because of inlining or JIT reasons as Frank mentions). This is highly implementation dependent, but it can make a difference - make sure you benchmark!

So basically, use common sense, stick to small function sizes in most instances but don't be dogmatic about it if you have a genuinely good reason to make an unusually big function.

Deduplicator
  • 9,209
mikera
  • 20,777
41

While I agree with other's comments when they said there's no hard rule about right LOC number, I bet if we look back at the projects we've looked at in the past and identify every function above, let's say 150 lines of code, I'm guessing we would come to a consensus that 9 out of 10 of those functions break SRP (and very likely OCP as well), have too many local variables, too much control flow and are generally hard to read and maintain.

So while LOC may not be a direct indicator of bad code, it is certainly a decent indirect indicator that certain function could be written better.

On my team I fell into the position of a lead and for whatever reason, people seem to be listening to me. What I generally settled on is to tell the team that while there's no absolute limit, any function more 50 lines of code should at a minimum raise a red flag during code review, so that we take a second look at it and re-evaluate it for complexity and SRP/OCP violations. After that second look, we might leave it alone or we might change it, but at least it makes people think about these things.

DXM
  • 20,022
22

I stepped into a project which hasn't had any care about coding guidelines. When I look into the code I sometimes find classes with more than 6000 lines of code and less than 10 methods. This is a horror scenario when you have to fix bugs.

A general rule of how big a method should be at maximum is sometimes not so good. I like the rule by Robert C. Martin (Uncle Bob): "Methods should be small, smaller than small". I try to use this rule all the time. I am trying to keep my methods simple and small by clarifying that my method does only one thing and nothing more.

Smokefoot
  • 753
16

I'll just throw in yet another quote.

Programs must be written for people to read, and only incidentally for machines to execute

-- Harold Abelson

It is very improbable that functions that grow to 100-200 follow this rule

Pete
  • 9,016
13

It depends, seriously, there really isn't a solid answer to this question because the language you are work with matters, the five to fifteenth lines mentioned in this answer might work for C# or Java, but in other languages it doesn't give you much to work with. Likewise, depending upon the domain you are working in, you might find yourself writing code setting values in a large data structure. With some data structures you might have tens of elements that you need to set, should you break things out in to separate functions just because your function is running long?

As others have noted, the best rule of thumb is that a function should be a single logical entity that handles a single task. If you try to enforce draconian rules that say that functions can't be longer than n lines and you make that value too small your code will grow harder to read as developers try and use fancy tricks to get around the rule. Likewise, if you set it too high it will be a non-issue and can lead to bad code though laziness. Your best bet is to just conduct code reviews to ensure that functions are handling a single task and leave it at that.

rjzii
  • 11,304
11

It is not about number of lines, it is about SRP. According to this principle, your method should do one and only one thing.

If your method does this AND this AND this OR that => it is probably doing to much. Try to look at this method and analyse: "here I get this data, sort it and get elements I need" and "here I process these elements" and "here I finally combine them in order to get the result". These "blocks" should be refactored to other methods.

If you simply follow SRP most of your method will be small and with clear intention.

It is not correct to say "this method is > 20 lines so it is wrong". It can be an indication that something may be wrong with this method, no more.

You may have a 400 lines switch in a method (often happens in telecom), and it is still single responsibility and it is perfectly OK.

Alexey
  • 119
10

I think one problem here is that the length of a function says nothing about its complexity. LOC (Lines of Code) are a bad instrument to measure anything.

A method should not be overly complex, but there are scenarios where a long method can be easily maintained. Note that the following example does not say it could not be split into methods, just that the methods would not change the maintainability.

for example a handler for incoming data can have a large switch statement and then simple code per case. I have such code - managing incoming data from a feed. 70 (!) numerically coded handlers. Now, one will say "use constants" - yes, except the API does not provide them and I like to stay close to "source" here. Methods? Sure - just sadly all of them deal with data from the same 2 huge structures. No benefit in splitting them off except maybe having more methods (readability). The code is intrinsically not complex - one switch, depending on a field. Then every case has a block that parses x elements of data and publishes them. No maintenance nightmare. There is one repeating" if condition that determiens whether a field has data (pField = pFields [x], if pField->IsSet() { blabla }) - same pretty much for every field...

Replace that with a much smaller routine containing nested loop and a lot of real switching statements and a huge method can be more easy to maintain than one smaller one.

So, sorry, LOC is not a good measurement to start with. If anything, then complexity / decision points should be used.

TomTom
  • 545
7

If I find long method, I could bet this method is not properly unit-tested or most time it hasn't unit test at all. If you start doing TDD you will never build up 100-lines methods with 25 different responsibilities and 5 nested loops. Tests oblige you constantly refactor your mess and write uncle Bob's clean code.

7

I've been in this crazy racket, one way or another, since 1970.

In all that time, with two exceptions that I'll get to in a moment, I have NEVER seen a well-designed "routine" (method, procedure, function, subroutine, whatever) that NEEDED to be more than one printed page (about 60 lines) long. The vast majority of them were quite a bit short, on the order of 10-20 lines.

I have, however, seen a LOT of "stream-of-consciousness" code, written by people who apparently never heard of modularization.

The two exceptions were very much special cases. One is actually a class of exception cases, that I lump together: large finite-state automata, implemented as big ugly switch statements, usually because there isn't a cleaner way to implement them. These things usually show up in automated test equipment, parsing data logs from the device under test.

The other was the photon torpedo routine from the Matuszek-Reynolds-McGehearty-Cohen STARTRK game, written in CDC 6600 FORTRAN IV. It had to parse the command line, then simulate the flight of each torpedo, with perturbations, check the interaction between the torpedo and each kind of thing it could hit, and oh by the way simulate recursion to do 8-way connectivity on chains of novae from torpedoing a star that was next to other stars.

3

Do the authors mean the same thing by "function" and "routine"? Typically when I say "function" I mean a subroutine/operation which returns a value and "procedure" for one which does not (and whose call becomes a single statement). This is not a common distinction throughout SE in the real world but I have seen it in user texts.

Either way, there is no right answer to this. Preference for one or the other (if there is a preference at all) is something I would expect to be very different between languages, projects, and organizations; just as it is with all code conventions.

The one bit I would add is that the whole "long operations are no more error-prone than short operations" assertion is not strictly true. In addition to the fact that more code equals more potential error space, it is blindingly obvious that breaking code into segments will make errors both easier to avoid and easier to locate. Otherwise there would be no reason to break code into pieces at all, save repetition. But this is perhaps true only if said segments are documented well enough that you can determine the results of an operation call without reading through or tracing the actual code (design-by-contract based on specifications rather than concrete dependency between areas of code).

Additionally, if you want longer operations to work well, you might want to adopt stricter code conventions to support them. Tossing a return statement in the middle of an operation might be fine for a short operation, but in longer operations this can create a large section of code which is conditional but not obviously conditional on a quick read-through (just for one example).

So I would think that which style is less likely to be a bug-filled nightmare would depend in large part on what conventions you adhere to for the rest of your code. :)

2

There is no absolute rules about method's length, but the following rules have been useful:

  1. Function's primary purpose is to find the return value. There is no other reason for it's existence. Once that reason is fullfilled, no other code should be inserted to it. This necessarily keeps functions small. Calling other functions should only be done if it makes finding the return value easier.
  2. On the other hand, interfaces should be small. This means you either have large number of classes, or you have large functions -- one of the two is going to happen once you start to have enough code to do anything significant. Big programs can have both.
tp1
  • 1,932
  • 11
  • 10
1

IMHO, you shouldn't have to use scrollbar to read your function. As soon as you need to move scrollbar, it take a few more time to understand how work the function.

Accordingly, it depends of usual programming environment of your team work (screen resolution, editor, font size, etc...). In 80's, it was 25 lines and 80 columns. Now, on my editor, I display nearly 50 lines. Number of columns I display did not change since I split my screen in two to display two files at times.

In brief, it depends of setup of your coworkers.

1

I think TomTom's answer came close to how I feel about it.

More and more I find myself going on cyclomatic complexity rather than lines.

I normally aim for no more than one control structure per method, with the exception of however many loops it takes to handle a multi-dimensional array.

I sometimes find myself putting one-line ifs in switch cases because for some reason these tend to be cases where splitting it out hinders rather than helps.

Note that I do not count guard logic against this limit.

-1

In OOP all things object and there are these features:

  1. Polymorphism
  2. Abstraction
  3. Inheritance

When you observance these rules then your methods is usually small but don't exist any rule for small or very small (e.g 2-3 line) rules. A benefit of small method(small unit eg method or function) are:

  1. better readable
  2. maintain better
  3. fixed bug better
  4. changes better
Sam
  • 201