54

When implementing the Builder Pattern, I often find myself confused with when to let building fail and I even manage to take different stands on the matter every few days.

First some explanation:

  • With failing early I mean that building an object should fail as soon as an invalid parameter is passed in. So inside the SomeObjectBuilder.
  • With failing late I mean that building an object only can fail on the build() call that implicitely calls a constructor of the object to be built.

Then some arguments:

  • In favor of failing late: A builder class should be no more than a class that simply holds values. Moreover, it leads to less code duplication.
  • In favor of failing early: A general approach in software programming is that you want to detect issues as early as possible and therefore the most logical place to check would be in the builder class' constructor, 'setters' and ultimately in the build method.

What is the general concensus about this?

skiwi
  • 1,138
  • 3
  • 10
  • 14

5 Answers5

44

Let's look at the options, where we can place the validation code:

  1. Inside the setters in builder.
  2. Inside the build() method.
  3. Inside the constructed entity: it will be invoked in build() method when the entity is being created.

Option 1 allows us to detect problems earlier, but there can be complicated cases when we can validate input only having the full context, thus, doing at least part of validation in build() method. Thus, choosing option 1 will lead to inconsistent code with part of validation being done in one place and another part being done in other place.

Option 2 isn't significantly worse than option 1, because, usually, setters in builder are invoked right before the build(), especially, in fluent interfaces. Thus, it's still possible to detect a problem early enough in most cases. However, if the builder is not the only way to create an object, it will lead to duplication of validation code, because you'll need to have it everywhere where you create an object. The most logical solution in this case will be to put validation as close to created object as possible, that is, inside of it. And this is the option 3.

From SOLID point of view, putting validation in builder also violates SRP: the builder class already has responsibility of aggregating the data to construct an object. Validation is establishing contracts on its own internal state, it's a new responsibility to check the state of another object.

Thus, from my point of view, not only it's better to fail late from design perspective, but it's also better to fail inside the constructed entity, rather than in builder itself.

UPD: this comment reminded me of one more possibility, when validation inside the builder (option 1 or 2) makes sense. It does make sense if the builder has its own contracts on the objects it is creating. For example, assume that we have a builder that constructs a string with specific content, say, list of number ranges 1-2,3-4,5-6. This builder may have a method like addRange(int min, int max). The resulting string does not know anything about these numbers, neither it should have to know. The builder itself defines the format of the string and constraints on the numbers. Thus, the method addRange(int,int) must validate the input numbers and throw an exception if max is less than min.

That said, the general rule will be to validate only the contracts defined by the builder itself.

Ivan Gammel
  • 1,256
38

Given that you use Java, consider authoritative and detailed guidance provided by Joshua Bloch in the article Creating and Destroying Java Objects (bold font in below quote is mine):

Like a constructor, a builder can impose invariants on its parameters. The build method can check these invariants. It is critical that they be checked after copying the parameters from the builder to the object, and that they be checked on the object fields rather than the builder fields (Item 39). If any invariants are violated, the build method should throw an IllegalStateException (Item 60). The exception's detail method should indicate which invariant is violated (Item 63).

Another way to impose invariants involving multiple parameters is to have setter methods take entire groups of parameters on which some invariant must hold. If the invariant isn't satisfied, the setter method throws an IllegalArgumentException. This has the advantage of detecting the invariant failure as soon as the invalid parameters are passed, instead of waiting for build to be invoked.

Note according to editor explanation on this article, "items" in above quote refer to rules presented in Effective Java, Second Edition.

The article doesn't dive deep into explaining why this is recommended, but if you think of it, the reasons are pretty apparent. Generic tip on understanding this is provided right there in the article, in the explanation how builder concept is connected to that of constructor - and class invariants are expected to be checked in constructor, not in any other code that may precede / prepare its invocation.

For a more concrete understanding for why checking invariants prior to invoking a build would be wrong, consider a popular example of CarBuilder. Builder methods may be invoked in an arbitrary order and as a result, one can't really know whether particular parameter is valid until the build.

Consider that sports car can't have more than 2 seats, how could one know if setSeats(4) is okay or not? It's only at the build when one can know for sure whether setSportsCar() was invoked or not, meaning whether to throw TooManySeatsException or not.

gnat
  • 20,543
  • 29
  • 115
  • 306
21

Invalid values which are invalid because they are not tolerated should be made known immediately in my opinion. In other words, if you accept only positive numbers, and a negative number is passed in, there is no need to have to wait until build() is called. I wouldn't consider these the types of problems you would "expect" to have happen, since it is a prerequisite to calling the method to begin with. In other words, you would not likely depend on the failing of setting certain parameters. More likely you would presume the parameters are correct or you'd perform some check yourself.

However, for more complicated issues which aren't as easily validated may be better off being made known when you call build(). A good example of this might be using the connection information you provide to establish a connection to a database. In this case, while you technically could check for such conditions, it is no longer intuitive and it only complicates your code. As I see it, these are also the types of issues which might actually happen and that you cannot really anticipate until you try it. It's sort of the difference between matching a string with a regular expression to see if it could be parsed as an int and simply trying to parse it, handling any potential exceptions which may occur as a consequence.

I generally dislike throwing exceptions when setting parameters since it means having to catch any exception thrown, so I tend to favor validation in build(). So for this reason, I prefer using RuntimeException since again, errors in parameters passed should not generally happen.

However, this is more of a best practice than anything. I hope that answers your question.

Neil
  • 22,848
12

As far as I know, the general practice (not sure if there's consensus) is to fail as early as you can feasibly discover an error. This also makes it more difficult to unintentionally misuse your API.

If it is a trivial attribute that can be checked on input, such as a capacity or length which should be non-negative, then you're best to fail immediately. Holding off the error increases the distance between mistake and feedback, which makes it more difficult to find the source of the problem.

If you have the misfortune of being in a situation where the validity of an attribute depends on others, then you have two choices:

  • Require that both (or more) attributes be supplied simultaneously (i.e. single method invocation).
  • Test validity as soon as you know no more changes are incoming: when build() or so is called.

As with most things, this is a decision made in a context. If the context makes it awkward or complicated to fail early, a trade-off can be made to defer checks to a later time, but fail-fast should be the default.

JvR
  • 1,204
3

The basic rule is "fail early".

The slightly more advanced rule is "fail as early as possible".

If a property is intrinsically invalid ...

CarBuilder.numberOfWheels( -1 ). ...  

... then you reject it immediately.

Other cases might need values to be checked in combination and might be better placed in the build() method:

CarBuilder.numberOfWheels( 0 ).type( 'Hovercraft' ). ...  
Phill W.
  • 13,093