28

This doubt is about Switch Statements from Chapter 3: Functions of the book named Clean Code

Here we have a function:

public Money calculatePay(Employee e)
throws InvalidEmployeeType {
    switch (e.type) {
      case COMMISSIONED:
        return calculateCommissionedPay(e);
      case HOURLY:
        return calculateHourlyPay(e);
      case SALARIED:
        return calculateSalariedPay(e);
      default: 
        throw new InvalidEmployeeType(e.type);
    }
}

From my inexperienced point of view, I can see that the switch statement inside the function calculatePay is only returning things based on the object Employee. Isn't it doing the "One Thing" mentioned by Uncle Bob?

But Uncle Bob says:

There are several problems with this function. First, it's large, and when new employee types are added, it will grow. Second, it very clear does more than more thing. Third, it violates the Single Responsibility Principle(SRP) because there is more than one reason for it to change. Fourth, it violates the Open Closed Principle(OCP) because it must change whenever new types are added. But possibly the worst problem with this function is that there are an unlimited number of other functions that will have the same structure.

How does the switch statement do more than one thing?

Malady
  • 109

7 Answers7

60

Martin's concept of "do one thing" is overly ambiguous to the point I believe it does more harm than good.

In the passage Martin states that a switch does one thing per case and therefore by definition does N things. This implies that any single method call is "one thing". If you follow this thinking, you will quickly realize a program will never be able to do anything!

Martin has a different definition which is that a method does "one thing" when all operations are on the same abstraction level. But the cases here calculateCommissionedPay(), calculateHourlyPay() do seem to be on the same abstraction level, so this contradicts his general criticism of switch-statments as always doing N-things.

That said, there are reasons to avoid this particular use of a switch. The switch check a type-field and then selects the method to call based on that. The idiomatic way to do this in object oriented programming is to use subclasses and overloading. Employee could have subclasses HourlyEmployee, SalariedEmployee etc., which override a calculatePay() method. That way you could avoid the switch altogether and just call e.calculatePay().

But if the input really is an enum value as in the example, then you need a way to get the appropriate Employee-subclass given this enum value. How do you do that? A switch of course! So you end up with code something like this:

public Employee createEmployee(int employeeType)
throws InvalidEmployeeType {
    switch (employeeType) {
      case COMMISSIONED:
        return new CommissionEmployee(e);
      case HOURLY:
        return new HourlyEmployee(e);
      case SALARIED:
        return new SalariedEmployee(e);
      default: 
        throw new InvalidEmployeeType(e.type);
    }
}

public Money calculatePay(int employeeType) throws InvalidEmployeeType { Employee e = createEmployee(e.type); return e.calculatePay(); }

You will notice a few things:

  1. We still have a switch which allegedly does "N-things".
  2. The switch will still have to grow when new employment types are added.
  3. We still have a Open/Closed violation, since adding a new subclass will require us to modify the switch, just as before.

But if there are multiple places in the code were we switch on the employee type as in the first example, this refactoring is a clear improvement since we now only need one single switch and can use overloading in the other places.

If it is not clear from the above, I reject the premise that switch statements are bad and should be avoided in general. Sometimes it is the right tool for the job. But certain uses of switch is an anti-pattern, for example using switch as a substitute for overloading, when overloading would be more appropriate.

JacquesB
  • 61,955
  • 21
  • 135
  • 189
16

The calculatePay in your question is doing indeed too much:

  • It extracts the type of the employee.
  • It calls calculateCommissionedPay in one case.
  • And it also calls calculateHourlyPay in another case.
  • It also knows about calculateSalariedPay.
  • And finally it handles the default case by constructing and throwing an exception.

The nice thing is that calculatePay doesn't need to do all that. If the application was architected correctly, there wouldn't be any Employee.type. Instead, there would be classes like ComissionedEmployee, HourlyEmployee and so on which implement the interface Employee. The interface would have a calculatePay method, and the different classes will provide the actual implementation of this method.

This is essentially what Replace conditional with polymorphism section in Refactoring by Martin Fowler is all about: relying on polymorphism instead of picking an execution path based on the type of an object.

This change would reduce calculatePay to one line:

public Money calculatePay(Employee e) {
    return e.calculatePay();
}

Then, it can be removed: the caller of this function can call Employee.calculatePay method directly.

This is essentially the example you see in Clean code in Listing 3-5, on the next page.

12

Everyone complains about Robert Martins "one thing". Let me try to demystify it.

calculatePay is one thing. Just the name is enough to set us on the path of expecting one thing. When I look inside this I expect to find things that calculatePay.

These seem like they're focused on our one thing: calculateCommissionedPay(e), calculateHourlyPay(e), and calculateSalariedPay(e) and indeed they help us get there but these are not focused on the whole of our one thing. They are smaller cases of it. They're details. We just want to calculate pay. We don't care why you're payed what you're payed. We just want to calculate it.

So what if e had e.wages() and e.bonus() hanging off of it? Could we add them together here or are those evil details? Well when wouldn't you add them together? This isn't some weird part of calculatePay. This is calculatePay.

One thing isn't something that can't be decomposed. That's what's really wrong with the example Mr. Martin gives. It makes it seem like each function should only be one line. Which is nonsense. One thing means this code should tell me one simple understandable story without dragging me through weird cases and up and down through wildly different abstractions. Do the work it takes to make the function simple enough to look like it's only doing one thing. Even if e.wages() is hiding tons of polymorphism, nasty switch cases, or 100 different levels of object oriented abstraction. At least here it's just e.wages() that you can add to e.bonus(), whatever that really is.

candied_orange
  • 119,268
9

Uncle Bob's criticisms are mostly valid, but his proposal to make payment calculation a virtual method tied to the employment type makes the code less flexible.

Having the pay calculation depend on a piece of data rather than on type, makes it easier to convert employees. For example, if a commissioned sales person gets promoted to a salaried management position, do you really want to construct a new employee and destroy the old one? Or would you rather change one field in the existing employee record.

Better Option 1 -- Composition

Instead of coupling the payment algorithm to the employee type, tie it to the employee's wage type. This still solves the problem with polymorphism, but it keys off of mutable data field rather than coupling it to an employee type (which is much less malleable).

class WageStrategy {
  public:
    virtual Money CalculatePay(const Employee &e) const = 0;
};

class SalaryWages : public WageStrategy { public: Money CalculatePay(const Employee &e) const override; }

class HourlyWages : public WageStrategy { public: Money CalculatePay(const Employee &e) const override; }

class CommissionWages : public WageStrategy { public: Money CalculatePay(const Employee &e) const override; }

class Employee { public: Money CalculatePay() const { return m_wages.CalculatePay(this); } private: WageStrategy m_wages; };

Better Option 2 -- Parameterization

Is it worth having independent implementations of wage calculations? If the company comes up with a new compensation plan (e.g., small base salary plus commissions), so you really want to create another type? Do you like having to search all over the code to know all the different ways wages can be computed? What if we just parameterized the business logic like this:

Money CalculatePay(const Employee &e) {
  Money pay = 0;

pay += e.GetBaseSalary(); for (const auto &sale : e.GetSales()) { pay += e.GetCommissionRate() * sale.GetTotal(); } pay += e.GetHoursWorked() * e.GetHourlyRate();

return pay; }

No switch statements. No polymorphism. Data-driven so you can effectively create new pay strategies (especially hybrid ones) without code changes--that's even better than open-closed. A single path of execution provides clarity and simple code coverage testing.

3

While I generally don't agree with Martin on this point, here's one way of looking at the situation which might be useful:

"This piece of logic acts as a gatekeeper." In an out-of-the-way place where you might never think to look for it.

This piece of logic, "although it is not part of an Employee object," is nonetheless cognizant of the employee-type, such that it determines which one of several "fairly beefy" subroutines are to be called.

So – if you're "chasing a bug," and you think that it must be located in one of these routines, there's one critical piece of information that you might be missing: "under what conditions are they actually called?" This piece of logic ties all of this to the value of e.type, but it's external to the place where such logic properly belongs: "the employee object."

Likewise, these various subroutines really ought to be methods of that same object, so that it's now possible to open "just one source-file ... the one which implements this object ... and to see everything that you need to know to understand it all, and to know that you have actually done so."

1

I understand where you're coming from, but in this particular case, you're focusing on the wrong thing. The main point he's trying to illustrate in this section is that this should be treated as a code smell (a strong indication of possible unwanted coupling) because there's likely a bunch of other functions in the system that have (literally or conceptually) the same switch statement in them. He lists, as a hypothetical example, isPayday(Employee e, Date date), and deliverPay(Employee e, Money pay), stating that there are other functions like that in the codebase. They all switch on employee.

This kind of coupling is extremely common in the real world.

And it's bad because you have to hunt down and change all of them if, say, you need to introduce a new case (or if you face any kind of change that requires the same kind of behavior to be repeated in all of these places). And nothing guarantees that these changes won't propagate outside of those functions. This is how you end up having to modify 20 different files because of a "simple" change.

The main point is to DRY out this (literal or conceptual) repetition; the book goes on to suggest to solve this by confining the switch statement to a single factory that returns a polymorphic result.

The "does more than one thing" is an offhand remark in this section of the book, probably best interpreted in the context of this statement at the start of the section: "By their nature, switch statements always do N things" - presumably because of their N cases.

So, sometimes it calculates a commissioned pay, sometimes a hourly pay, and sometimes a salaried pay. Building upon our discussion here (about the idea that implementations shouldn't be jumping levels of abstraction), while these are all at the same level of abstraction, they are conceptually distinct things: they aren't a group of functions that together achieve one overall task (they don't form steps of an overall operation), instead they are largely operationally unrelated, operating on a case-by-case basis1. As Daniel Wagner noticed in a comment to your question: "$/hour, $/year, and $/project are super meaningfully different"; these two problems are connected.


1 The concept of single responsibility is a rule of thumb intimately associated with the concept of cohesion. Cohesion is a measure of how closely related things in a module (function, class, component, ...) are. What you want is internally strongly cohesive, but externally loosely coupled modules (functions, classes, components, ...). If elements are not strongly related (and change for different reasons), as the codebase evolves, you (1) get internal coupling that makes changing each thing independently hard, but also (2) it's likely that elements that are related to each thing in that particular module were spread throughout several other modules (so, there's external coupling). The strongest form of cohesion is when there's an operational relationship - a number of functions that combine to accomplish a single overall task. Here you don't have that, you have logical relatedness instead (they all do the same sort of thing), and that's not a strong form of cohesion.

Quote from (a):

Another wording for the Single Responsibility Principle is:
Gather together the things that change for the same reasons. Separate those things that change for different reasons.

Sources:
(a) The Clean Code Blog: The Single Responsibility Principle,
(b) SRP: The Single Responsibility Principle (incidentally, this is the broken link from the book)


Depending on the language, you can't always fix all of the code smells, but sticking a switch statement into a factory turns this from a design that causes a proliferation of the same switch statement to something that converts a key into a specialized object focused only on one particular case, an object that you can pass around your application for later use by more self-contained components.

P.S. "Second, it very clearly does more than one thing." - I guess another takeaway from this is, if you're trying to explain something to other people (or if you're writing a book), never assume that what's clear to you is also obvious to everyone else :D

0

I agree whith those answers who point out that this specific case would be handled better by using inheritance.

But on the other hand you can't rely always on inheritance. I disagree with the idea that the switch statement violates the SRP. The switch statement acts as a dispatcher to the proper action when the options are not just black and white. I don't know what was written in the book, if it was a warning against placing a lot of code within the case block instead of a single call that would be understandable. But criticising the switch only because it has more than one case seems overly fanatic, that would rule out a lot of common practices that provided a lot of clean code for many years. What would the propoents of this interpretation think of the Scala style pattern matching?

FluidCode
  • 821
  • 4
  • 10