Why is Clean Code suggesting avoiding protected variables?

Question

Clean Code suggests avoiding protected variables in the "Vertical Distance" section of the "Formatting" chapter:

Concepts that are closely related should be kept vertically close to each other. Clearly this rule doesn't work for concepts that belong in separate files. But then closely related concepts should not be separated into different files unless you have a very good reason. Indeed, this is one of the reasons that protected variables should be avoided.

What is the reasoning?

score 300 · Accepted Answer · edited Oct 26 '12 at 16:17

Protected variables should be avoided because:

They tend to lead to YAGNI issues. Unless you have a descendant class that actually does stuff with the protected member, make it private.
They tend to lead to LSP issues. Protected variables generally have some intrinsic invariance associated with them (or else they'd be public). Inheritors then need to maintain those properties, which people can screw up or willfully violate.
They tend to violate OCP. If the base class makes too many assumptions about the protected member, or the inheritor is too flexible with the behavior of the class, it can lead to the base class' behavior being modified by that extension.
They tend to lead to inheritance for extension rather than composition. This tends to lead to tighter coupling, more violations of SRP, more difficult testing, and a slew of other things that fall within the 'favor composition over inheritance' discussion.

But as you see, all of these are 'tend to'. Sometimes a protected member is the most elegant solution. And protected functions tend to have fewer of these issues. But there are a number of things that cause them to be treated with care. With anything that requires that sort of care, people will make mistakes and in the programming world that means bugs and design problems.

score 63 · Answer 2 · answered Aug 28 '12 at 16:06

It's really the same reason you avoid globals, just on a smaller scale. It's because it's hard to find everywhere a variable is being read, or worse, written to, and hard to make the usage consistent. If the usage is limited, like being written in one obvious place in the derived class and read in one obvious place in the base class, then protected variables can make the code more clear and concise. If you're tempted to read and write a variable willy-nilly throughout multiple files, it's better to encapsulate it.

Robert Harvey · Answer 3 · 2012-09-01T17:02:30.213

I haven't read the book, but I can take a stab at what Uncle Bob meant.

If you put protected on something, that means that a class can inherit it. But member variables are supposed to belong to the class in which they are contained; this is part of basic encapsulation. Putting protected on a member variable breaks encapsulation because now a derived class has access to the implementation details of the base class. It's the same problem that occurs when you make a variable public on an ordinary class.

To correct the problem, you can encapsulate the variable in a protected property, like so:

protected string Name
{
    get { return name; }
    private set { name = value; }
}

This allows name to be safely set from a derived class using a constructor argument, without exposing implementation details of the base class.

score 21 · Answer 4 · edited Oct 26 '12 at 16:36

Uncle Bob's argument is primarily one of distance: if you have a concept that is important to a class, bundle the concept together with the class in that file. Not separated across two files on the disk.

Protected member variables are scattered in two places, and, kinda looks like magic. You reference this variable, yet it isn't defined here... where is it defined? And thus the hunt begins. Better to avoid protected altogether, his argument.

Now I don't believe that rule needs to be obeyed to the letter all the time (like: thou shalt not use protected). Look at the spirit of what he is getting at... bundle related things together into one file - use programming techniques and features to do that. I would recommend that you don't over-analyze and get caught up in the details on this.

Doc Brown · Answer 5 · 2012-08-28T17:04:15.893

Some very good answers already here. But one thing to add:

In most modern OO languages it is a good habit (in Java AFAIK its necessary) to put each class into its own file (please don't get nitty about C++ where you often have two files).

So it is virtually impossible to keep the functions in a derived class acessing a protected variable of a base class "vertically close" in source code to the variable definition. But ideally they should be kept there, since protected variables are intended for internal use, which makes functions acessing them often "a closely related concept".

score 7 · Answer 6 · edited Feb 27 '22 at 21:39

It's an interesting discussion.

To be frank, good tools can mitigate many of these "vertical" issues. In fact, in my opinion the notion of the "file" is actually something that hinders software development - something the Light Table project is working on (https://chris-granger.com/2012/04/12/light-table-a-new-ide-concept/).

Take files out of the picture, and the protected scope becomes a lot more appealing. The protected concept becomes most clear in langauges like SmallTalk - where any inheritance simply extends the original concept of a class. It should do everything, and have everything, that the parent class did.

In my opinion, class hierarchies should be as shallow as possible, with most extensions coming from composition. In such models, I don't see any reasons for private variables to not be made protected instead. If the extended class should represent an extension including all behaviour of the parent class (which I believe it should, and therefore believe private methods are inherently bad), why should the extended class not represent the storage of such data as well?

To put this another way - in any instance where I feel a private variable would suit better than a protected one, inheritence is not the solution to the problem in the first place.

score 6 · Answer 7 · answered Aug 29 '12 at 01:35

The basic idea is that a "field" (instance-level variable) that is declared as protected is probably more visible than it has to be, and less "protected" than you might like. There is no access modifier in C/C++/Java/C# that is the equivalent of "accessible only by child classes within the same assembly", thus granting you the ability to define your own children that can access the field in your assembly, but not allowing children created in other assemblies the same access; C# has internal and protected modifiers, but combining them makes the access "internal or protected", not "internal and protected". So, a field that is protected is accessible by any child, whether you wrote that child or someone else did. Protected is thus an open door to a hacker.

Also, fields by their definition have pretty much no validation inherent in changing them. in C# you can make one readonly, which makes value types effectively constant and reference types unable to be reinitialized (but still very mutable), but that's about it. As such, even protected, your children (which you cannot trust) have access to this field and can set it to something invalid, making the object's state inconsistent (something to be avoided).

The accepted way to work with fields is to make them private and access them with a property, and/or a getter and setter method. If all consumers of the class need the value, make the getter (at least) public. If only children need it, make the getter protected.

Another approach that answers the question is to ask yourself; why does code in a child method need the ability to modify my state data directly? What does that say about that code? That's the "vertical distance" argument on its face. If there is code in a child that must directly alter parent state, then maybe that code should belong to the parent in the first place?

score 1 · Answer 8 · answered Aug 29 '12 at 09:07

I find using protected variables for JUnit tests can be useful. If you make them private then you cannot access them during testing without reflection. This can be useful if you are testing a complex process through many state changes.

You could make them private, with protected getter methods, but if the variables are only going to be used externally during a JUnit test, I'm not sure that's a better solution.

score 0 · Answer 9 · answered Aug 28 '12 at 20:43

As it says there, this is only one of the reasons, that is, logically linked code should be placed inside physical linked entities (files, packages and others). On a smaller scale this is the same as the reason you do not put a class which makes DB queries inside the package with classes that displays the UI.

The main reason however that protected variables are not recommended is because they tend to break encapsulation. Variables and methods should have as limited visibility as possible; for further reference see Joshua Bloch's Effective Java, Item 13 - "Minimize the accessibility of classes and members".

However, you should not take all of this ad litteram, only as a guildline; if protected variables are very bad they would not have been placed inside the language in the first place. A reasonable place for protected fields imho is inside the base class from a testing framework (integration test classes extend it).

score -1 · Answer 10 · answered Aug 28 '12 at 20:32

Yes, I agree that is somewhat strange given that (as I recall) the book also discusses all non-private variables as something to be avoided.

I think the issue with the protected modifier is that not only is the member exposed to subclasses it's also made visible to the entire package. I think it is this dual aspect that Uncle Bob is taking exception to here. Of course, one can't always get around having protected methods, and thus the protected variable qualification.

I think a more interesting approach is to make everything public, that'll force you to have small classes, which in turn renders the whole vertical distance metric somewhat moot.

Why is Clean Code suggesting avoiding protected variables?

10 Answers10

Linked