276

Is private visibility of class fields/properties/attributes useful? In OOP, sooner or later, you are going to make a subclass of a class, and in that case it is good to understand and be able to modify the implementation completely.

One of the first things I do when I subclass a class is to change a bunch of private methods to protected. However, hiding details from the outer world is important – so we need protected too and not just public.

My question is: Do you know about an important use case where private instead of protected is a good tool, or would two options "protected & public" be enough for OOP languages?

Utku
Adam Libuša

17 Answers

260

In OOP, sooner or later, you are going to make a subclass of a class

This is wrong. Not every class is meant to be subclassed and some statically typed OOP languages even have features to prevent it, e.g., final (Java and C++) or sealed (C#).

it is good to understand and be able to modify the implementation completely.

No, it's not. It's good for a class to be able to clearly define its public interface and preserve its invariants even if it is inherited from.

In general, access control is about compartmentalization. You want an individual part of the code to be understood without having to understand in detail how it interacts with the rest of the code. Private access allows that. If everything is at least protected, you have to understand what every subclass does in order to understand how the base class works.
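To make the compartmentalization point concrete, here is a minimal Java sketch. The names (Account, SavingsAccount) are invented for illustration and are not from the question. Because the field is private, reading the base class alone is enough to convince yourself that the balance never goes negative, whatever subclasses exist.

```java
// Hypothetical example: all class names here are illustrative only.
class AccountDemo {
    public static void main(String[] args) {
        SavingsAccount account = new SavingsAccount(100);
        account.addInterest(10);      // goes through deposit(), so the invariant holds
        System.out.println(account.balance());
        try {
            account.withdraw(1_000);  // rejected by the base class
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

class Account {
    // Invariant: balance >= 0. Only code in this class can touch the field,
    // so checking this class alone is enough to verify the invariant.
    private long balance;

    Account(long openingBalance) {
        if (openingBalance < 0) throw new IllegalArgumentException("negative opening balance");
        this.balance = openingBalance;
    }

    public final void deposit(long amount) {
        if (amount < 0) throw new IllegalArgumentException("negative deposit");
        balance += amount;
    }

    public final void withdraw(long amount) {
        if (amount < 0 || amount > balance) throw new IllegalArgumentException("invalid withdrawal");
        balance -= amount;
    }

    public final long balance() { return balance; }
}

class SavingsAccount extends Account {
    SavingsAccount(long openingBalance) { super(openingBalance); }

    // A subclass can add behaviour, but only by way of the public operations.
    void addInterest(int percent) {
        deposit(balance() * percent / 100);
    }
}
```

Had balance been protected, the same argument would require reading every subclass, including ones not yet written.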

Or to put it in the terms of Scott Meyers: private parts of a class are affected by a finite amount of code: the code of the class itself.

Public parts are potentially affected by every bit of code in existence, and every bit of code yet to be written, which is an infinite amount of code.

Protected parts are potentially affected by every existing subclass, and every subclass yet to be written, which is also an infinite amount of code.

The conclusion is that protected gives you very little more than public, whereas private gives you a real improvement. It is the existence of the protected access specifier that is questionable, not private.

235

Because as you say, protected still leaves you with the ability to "modify the implementation completely". It doesn't genuinely protect anything inside the class.

Why do we care about "genuinely protecting" the stuff inside the class? Because otherwise it would be impossible to change implementation details without breaking client code. Put another way, people who write subclasses are also "the outer world" for the person who wrote the original base class.

In practice, protected members are essentially a class' "public API for subclasses" and need to remain stable and backwards compatible just as much as public members do. If we did not have the ability to create true private members, then nothing in an implementation would ever be safe to change, because you wouldn't be able to rule out the possibility that (non-malicious) client code has somehow managed to depend on it.
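A rough Java sketch of that distinction, with made-up names (IntStack, CountingStack): the private field can be swapped for a different data structure at any time, while the protected method is a promise to subclasses and has to stay put.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical example: all class names here are illustrative only.
class StackDemo {
    public static void main(String[] args) {
        CountingStack stack = new CountingStack();
        stack.push(1);
        stack.push(2);
        System.out.println(stack.pop() + " popped after " + stack.pushCount() + " pushes");
    }
}

class IntStack {
    // Private: the author is free to replace this Deque with an array or a
    // linked list later. No subclass can see the field, so none can break.
    private final Deque<Integer> items = new ArrayDeque<>();

    public void push(int value) { items.push(value); }
    public int pop() { return items.pop(); }

    // Protected: part of the "API for subclasses", so it must remain stable.
    protected int size() { return items.size(); }
}

class CountingStack extends IntStack {
    private int pushes;

    @Override
    public void push(int value) {
        pushes++;
        super.push(value);
    }

    int pushCount() { return pushes; }

    // Depends only on the protected API, never on the backing store.
    boolean isEmpty() { return size() == 0; }
}
```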

Incidentally, while "In OOP, sooner or later, you are going to make a subclass of a class" is technically true, your argument seems to be making the much stronger assumption that "sooner or later, you are going to make a subclass of every class" which is almost certainly not the case.

Ixrec
35

Yes, private fields are absolutely necessary. Just this week I needed to write a custom dictionary implementation where I controlled what was put into the dictionary. If the dictionary field were to be made protected or public, then the controls I'd so carefully written could have been easily circumvented.

Private fields are typically about providing safeguards that the data is as the original coder expected. Make everything protected/public and you ride a coach and horses through those procedures and validation.
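I can't show the original dictionary, so this is only a Java sketch of the idea with an invented rule (quantities must be positive) and an invented class name. The point is that the private map can only be reached through put(), so every stored value has passed the check.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical example: the class and its validation rule are illustrative only.
class DictionaryDemo {
    public static void main(String[] args) {
        PositiveQuantities stock = new PositiveQuantities();
        stock.put("apples", 3);
        System.out.println(stock.get("apples"));
        try {
            stock.put("pears", 0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

class PositiveQuantities {
    // Private: the only way in is put(), so every stored value has been validated.
    // A protected or public map would let callers insert anything directly.
    private final Map<String, Integer> entries = new HashMap<>();

    public void put(String key, int quantity) {
        if (key == null || key.isEmpty()) throw new IllegalArgumentException("empty key");
        if (quantity <= 0) throw new IllegalArgumentException("quantity must be positive");
        entries.put(key, quantity);
    }

    public Optional<Integer> get(String key) {
        return Optional.ofNullable(entries.get(key));
    }
}
```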

Robbie Dee
13

When attempting to reason formally about the correctness of an Object Oriented program it is typical to use a modular approach involving object invariants. In this approach:

  1. Methods have associated with them pre and post conditions (contracts).
  2. Objects have associated with them invariants.

Modular reasoning about an object proceeds as follows (to a first approximation at least):

  1. Prove that the object's constructor establishes the invariant
  2. For each non-private method, assume the object invariant and method precondition hold on entry, then prove that the body of the code implies that the postcondition and invariant hold on method exit

Imagine that we verify an object A using the approach above, and now wish to verify method g of object B, which calls method f of object A. Modular reasoning allows us to reason about method g without having to reconsider the implementation of method f. Provided we can establish the invariant of object A and the precondition of method f at the call site in method g, we can take the postcondition of method f as a summary of the behaviour of the method call. Moreover, we will also know that after the call returns the invariant of A still holds.

This modularity of reasoning is what allows us to think formally about large programs. We can reason about each of the methods individually and then compose the results of this reasoning in turn to reason about larger parts of the program.

Private fields are very useful in this process. In order to know that the invariant of an object continues to hold between two method calls on that object, we typically rely on the fact that the object is not modified in the intervening period.

For modular reasoning to work in a context where objects do not have private fields, we would need some way to ensure that, whatever value another object sets a field to, the invariant is always re-established after the set. It is difficult to imagine an object invariant that both holds no matter what values the object's fields have and is also useful in reasoning about the program's correctness. We would probably have to invent some complicated convention around field access, and we would probably also lose some of (at worst, all of) our ability to reason modularly.
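As an informal illustration of the two proof steps, here is a small Java sketch. The Range class is invented for this purpose, and the "proofs" are just comments plus runtime checks rather than anything a verifier has established.

```java
// Hypothetical example: Range is illustrative only.
class RangeDemo {
    public static void main(String[] args) {
        Range range = new Range(2, 5);
        range.extend(4);
        System.out.println(range.length());
    }
}

class Range {
    // Object invariant: lo <= hi. The fields are private, so nothing outside
    // this class can modify them between two method calls.
    private final int lo;
    private int hi;

    // Step 1: the constructor establishes the invariant.
    Range(int lo, int hi) {
        if (lo > hi) throw new IllegalArgumentException("lo > hi");
        this.lo = lo;
        this.hi = hi;
    }

    // Step 2: assume the invariant and the precondition (by >= 0) on entry.
    // Postcondition: length() has grown by `by`. The invariant still holds,
    // because hi only increases; addExact throws instead of overflowing.
    public void extend(int by) {
        if (by < 0) throw new IllegalArgumentException("by < 0");
        hi = Math.addExact(hi, by);
    }

    // Relies on the invariant: the result is never negative.
    public long length() {
        return (long) hi - lo;
    }
}
```

If lo and hi were writable from outside, the comment on length() could no longer be justified by looking at Range alone.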

Protected fields

Protected fields restore some of our ability to reason modularly. Depending on the language protected may restrict the ability to set a field to all subclasses or all subclasses and same-package classes. It is often the case that we do not have access to all subclasses when we are reasoning about the correctness of an object we are writing. For example, you might be writing a component or library that will later be used in a larger program (or several larger programs) — some of which may not even have been written yet. Typically you will not know if and in what ways it may be sub-classed.

However, it is usually incumbent on a subclass to maintain the object invariant of the class it extends. So, in a language where protected means "subclass only", and where we are disciplined to ensure that subclasses always maintain the invariants of their superclass, you could argue that the choice of using protected instead of private loses only minimal modularity.

Although I have been talking about formal reasoning, it is often thought that when programmers reason informally about the correctness of their code, they also rely on similar types of arguments.

9

private variables in a class are better than protected for the same reason that a break statement inside a switch block is better than a goto label statement: human programmers are error-prone.

protected variables lend themselves to un-intentional abuse (programmer mistakes), just as the goto statement lends itself to the creation of spaghetti code.

Is it possible to write working bug-free code using protected class variables? Yes of course! Just as it's possible to write working bug-free code using goto; but as the cliche goes "Just because you can, doesn't mean you should!"

Classes, and indeed the OO paradigm, exist to guard against hapless error-prone human programmers making mistakes. The defense against human mistakes is only as good as the defensive measures built into the class. Making the implementation of your class protected is the equivalent of blowing an enormous hole in the walls of a fortress.

Base classes have absolutely no knowledge of derived classes. As far as a base class is concerned, protected does not actually give you any more protection than public, because there's nothing stopping a derived class from creating a public getter/setter which behaves like a backdoor.
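A short Java sketch of such a backdoor, using invented names (Gauge, LeakyGauge). Note that in Java, protected also grants access to everything in the same package, which only widens the hole.

```java
// Hypothetical example: all class names here are illustrative only.
class BackdoorDemo {
    public static void main(String[] args) {
        LeakyGauge gauge = new LeakyGauge();
        gauge.setLevel(-5);               // bypasses the range check in Gauge.fill()
        System.out.println(gauge.level());
    }
}

class Gauge {
    // Intended invariant: 0 <= level <= 100.
    // Protected, so any subclass can write the field directly.
    protected int level;

    public void fill(int amount) {
        if (amount < 0 || level + amount > 100) throw new IllegalArgumentException("out of range");
        level += amount;
    }

    public int level() { return level; }
}

class LeakyGauge extends Gauge {
    // A public setter in a subclass re-exports the protected field to everyone.
    public void setLevel(int value) { level = value; }
}
```

Nothing in Gauge can detect or prevent this; with a private field the setter would not compile.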

If a base class permits un-hindered access to its internal implementation details, then it becomes impossible for the class itself to defend against mistakes. Base classes have absolutely no knowledge of their derived classes, and therefore have no way of guarding against mistakes made in those derived classes.

The best thing a base class can do is hide as much of its implementation as possible as private and put enough restrictions in place to guard against breaking changes from derived classes or anything else outside of the class.

Ultimately, high-level languages exist to minimise human errors. Good programming practices (such as SOLID principles) also exist to minimise human errors.

Software developers who ignore good programming practices have a much higher chance of failure, and are more likely to produce broken unmaintainable solutions. Those who follow good practices have a much lower chance of failure, and are more likely to produce working maintainable solutions.

Ben Cottrell
4

Inheritable classes have two contracts: one with holders of object references, and one with derived classes. Public members are bound by the contract with reference holders, and protected members are bound by the contract with derived classes.

Making members protected makes it a more versatile base class, but will often limit the ways in which future versions of the class might change. Making members private allows the class author more versatility to change the inner workings of the class, but limits the kinds of classes that can be usefully derived from it.

As an example, List<T> in .NET makes the backing store private; if it were protected, derived types could do some useful things that are otherwise not possible, but future versions of List<T> would forevermore have to use its clunky monolithic backing store even for lists holding millions of items. Keeping the backing store private allows future versions of List<T> to use a more efficient backing store without breaking derived classes.

supercat
4

I think there is a key assumption in your argument that when someone writes a class they don't know who might extend that class down the road and for what reason. Given this assumption your argument would make perfect sense because every variable you make private then could potentially cut off some avenue of development down the road. However, I would reject that assumption.

If that assumption is rejected then there are only two cases to consider.

  1. The author of the original class had very clear ideas for why it might be extended (e.g. it is a BaseFoo and there will be several concrete Foo implementations down the road).

In this case, the author knows that someone will be extending the class and why and therefore will know exactly what to make protected and what to make private. They are using the private/protected distinction to communicate an interface of sorts to the user creating the subclass.

  2. The author of the child class is trying to hack in some behavior into a parent class.

This case should be rare (you could argue it isn't legitimate), and is not preferred to just modifying the original class in the original code base. It could also be a symptom of bad design. In those cases I would prefer that the person hacking in the behavior just use other hacks, like friend declarations (C++) or setAccessible(true) (Java).
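For reference, the Java hack looks roughly like this. Vault and PrivateAccess are invented names; Field.setAccessible and Field.getInt are the real reflection calls. This works for your own classes on the classpath, but as far as I know it can be refused for classes in other modules on newer JDKs, so treat it as a sketch rather than a guarantee.

```java
import java.lang.reflect.Field;

// Hypothetical example: Vault and PrivateAccess are illustrative only.
class ReflectionDemo {
    public static void main(String[] args) throws ReflectiveOperationException {
        Vault vault = new Vault();
        System.out.println(PrivateAccess.readInt(vault, "combination"));
    }
}

class Vault {
    private int combination = 1234;
}

class PrivateAccess {
    // Reads a private int field declared directly on the target's class.
    static int readInt(Object target, String fieldName) throws ReflectiveOperationException {
        Field field = target.getClass().getDeclaredField(fieldName);
        field.setAccessible(true);  // deliberately bypasses the private modifier
        return field.getInt(target);
    }
}
```

The hack is at least loud: anyone reading the code can see that an access rule is being broken on purpose.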

I think it is safe to reject that assumption.

This generally falls back to the idea of composition over inheritance. Inheritance is often taught as an ideal way to reuse code; however, it should rarely be the first choice for code reuse. I don't have a simple knock-down argument, and it can be a fairly difficult and contentious topic. However, in my experience with domain modeling I have found that I rarely use inheritance without having a very clear understanding of who will be inheriting from my class and why.

Pace
3

All three access levels have their use cases; OOP would be incomplete lacking any of them. Usually you:

  • make all variables/data members private. You don't want someone from outside to mess with your internal data. The same goes for methods that provide auxiliary functionality (think calculations based on several member variables) to your public or protected interface: these are for internal use only, and you might want to change or improve them in the future.
  • make the general interface of your class public. That's what the users of your original class are supposed to work with, and what you think derived classes should look like, too. In order to provide proper encapsulation these are usually only methods (and helper classes/structs, enums, typedefs, whatever the user needs to work with your methods), not variables.
  • declare as protected the methods that could be of use to someone who wants to extend/specialize the functionality of your class but that should not be part of the public interface. In fact, you usually raise private members to protected only when necessary. If in doubt, don't, until you know that
    1. your class can/may/will be subclassed, and
    2. you have a clear idea what the use cases of subclassing may be.

And you deviate from this general scheme only if there's a good reason™. Beware of "this will make my life easier when I can freely access it from outside" (and outside here also includes subclasses). When I implement class hierarchies I often start with classes that don't have protected members, until I come to subclassing/extending/specializing them, at which point they become the base classes of a framework/toolkit and sometimes have part of their original functionality moved one level up.

Murphy
1

A more interesting question, perhaps, is why any other type of field than private is necessary. When a subclass needs to interact with the data of a superclass, doing so directly creates a direct coupling between the two, whereas using methods to provide for the interaction between the two allows a level of indirection that can make it possible to make changes to the superclass that would otherwise be very difficult.

A number of languages (e.g. Ruby and Smalltalk) do not provide public fields so that developers are discouraged from allowing direct coupling to their class implementations, but why not go further and only have private fields? There would be no loss of generality (because the superclass can always provide protected accessors for the subclass), but it would ensure that classes always have at least a small degree of isolation from their subclasses. Why is this not a more common design?
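What I have in mind would look something like this in Java. Shape and Square are invented names; the field is private even to subclasses, which go through protected accessors instead.

```java
// Hypothetical example: all class names here are illustrative only.
class AccessorDemo {
    public static void main(String[] args) {
        Square square = new Square(3);
        square.doubleSize();
        System.out.println(square.area());
    }
}

class Shape {
    // Private: the representation is hidden even from subclasses.
    private double scale = 1.0;

    // Subclasses read and write through methods, so validation and any future
    // change of representation stay in one place.
    protected final double scale() { return scale; }

    protected final void setScale(double scale) {
        // Written as !(scale > 0) so that NaN is rejected as well.
        if (!(scale > 0)) throw new IllegalArgumentException("scale must be positive");
        this.scale = scale;
    }
}

class Square extends Shape {
    private final double side;

    Square(double side) { this.side = side; }

    void doubleSize() { setScale(scale() * 2); }

    double area() {
        double scaledSide = side * scale();
        return scaledSide * scaledSide;
    }
}
```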

Jules
1

A lot of good answers here, but I'll throw in my two cents anyway. :-)

Private is good for the same reason that global data is bad.

If a class declares data private, then you absolutely know that the only code messing with this data is the code in the class. When there's a bug, you don't have to search all over creation to find every place that might change this data. You know it's in the class. When you make a change to the code, and you change something about how this field is used, you don't have to track down all the potential places that might use this field and study whether your planned change will break them. You know the only places are inside the class.

Many, many times I have had to make changes to classes that are in a library and used by multiple apps, and I have had to tread very carefully to make sure I didn't break some app that I know nothing about. The more public and protected data there is, the more potential for trouble.

Jay
1

I think it's worth mentioning some dissenting opinions.

In theory, it's good to have controlled access level for all the reasons mentioned in other answers.

In practice, too often during code review I see people (who like to use private) changing an access level from private to protected, and less often from protected to public. Almost always, changing class properties involves modifying setters/getters. These changes have wasted much of my time (code review) and theirs (changing code).

It also annoys me that this means their classes are not closed for modification.

That was with internal code, where you can always change it if you need to. The situation is worse with third-party code, where it's not so easy to change the code.

So how many programmers think it's a hassle? Well, how many are using programming languages that don't have private? Of course, people are not just using those languages because they don't have private specifiers, but it helps to simplify the languages and simplicity is important.

IMO it's very similar to dynamic vs. static typing. In theory, static typing is very good. In practice, it only prevents something like 2% of errors (see "The Unreasonable Effectiveness of Dynamic Typing ..."). Using private probably prevents even fewer errors than that.

I think SOLID principles are good; I wish people cared about them more than they care about creating a class with public, protected and private members.

imel96
0

I'd also like to add another practical example of why protected is not enough. At my university the first years undertake a project where they have to develop a desktop version of a board game (for which an AI is later developed, and which is connected to other players over a network). Some partial code is provided to them to get them started, including a testing framework. Some of the properties of the main game class are exposed as protected so that the test classes that extend this class have access to them. But these fields aren't sensitive information.

As a TA for the unit I often see students simply making all their added code protected or public (perhaps because they saw the other protected and public stuff and assumed they should follow suit). I ask them why their protection level is inappropriate, and many don't know why. The answer is that the sensitive information they are exposing to subclasses means that another player can cheat by simply extending that class and accessing the highly sensitive information for the game (essentially the opponent's hidden location; I guess it is similar to being able to see your opponent's pieces on a battleships board by extending some class). That makes their code very dangerous in the context of the game.

Other than that, there are many other reasons to keep something private to even your subclasses. It might be to hide implementation details that could mess up the correct working of the class if changed by someone who doesn't necessarily know what they are doing (mostly thinking about other people using your code here).

J_mie6
0

Private methods/variables will generally be hidden from a subclass. That can be a good thing.

A private method can make assumptions about parameters and leave sanity checking to the caller.

A protected method should sanity check inputs.
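A small Java sketch of that split, with an invented Averager class: the protected method validates because it cannot know its callers, while the private helper trusts the range it is given.

```java
// Hypothetical example: Averager is illustrative only.
class CheckDemo {
    public static void main(String[] args) {
        Averager averager = new Averager(new int[] {2, 4, 6});
        System.out.println(averager.average());
    }
}

class Averager {
    private final int[] samples;

    Averager(int[] samples) {
        this.samples = samples.clone();
    }

    public double average() {
        return averageOf(0, samples.length);
    }

    // Protected: callable from subclasses we know nothing about, so validate.
    protected double averageOf(int from, int to) {
        if (from < 0 || to > samples.length || from >= to) {
            throw new IllegalArgumentException("bad range " + from + ".." + to);
        }
        return sum(from, to) / (double) (to - from);
    }

    // Private: every caller is in this class and has already checked the range,
    // so no sanity checks are repeated here.
    private long sum(int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += samples[i];
        }
        return total;
    }
}
```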

-1

One of the first things I do when I subclass a class is to change a bunch of private methods to protected

Some reasoning about private vs. protected methods:

private methods prevent code reuse. A subclass cannot use the code in the private method and may have to implement it again, or re-implement the method(s) which originally depend on the private method, and so on.

On the other hand, any method which is not private can be seen as an API provided by the class to "the outer world", in the sense that third-party subclasses are considered "outer world" too, as someone else suggested in his answer already.

Is that a bad thing? - I don't think so.

Of course, a (pseudo-)public API locks the original programmer up and hinders refactoring of those interfaces. But seen the other way around, why should a programmer not design his own "implementation details" in a way that's as clean and stable as his public API? Should he use private so that he can be sloppy about structuring his "private" code? Thinking maybe that he could clean it up later, because no one will notice? - No.

The programmer should put a little thought into his "private" code too, to structure it in a way that allows or even promotes reuse of as much of it as possible in the first place. Then the non-private parts may not become as much of a burden in the future as some fear.

A lot of (framework) code I see adopts an inconsistent use of private: protected, non-final methods which barely do anything more than delegate to a private method are commonly found. So are protected, non-final methods whose contract can only be fulfilled through direct access to private fields.

These methods cannot logically be overridden/enhanced, although technically there's nothing there to make that (compiler-)obvious.

Want extendability and inheritance? Don't make your methods private.

Don't want certain behavior of your class altered? Make your methods final.

Really cannot have your method called outside of a certain, well-defined context? Make your method private and/or think about how you can make the required well-defined context available for reuse through another protected wrapper method.

That's why I advocate using private sparingly, and not confusing private with final. If a method's implementation is vital to the general contract of the class and thus must not be replaced/overridden, make it final!
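To illustrate the three choices side by side, here is a minimal Java sketch with invented names (Report, SalesReport): final for what must not be altered, protected for intended extension points, and private only for what makes no sense outside its one context.

```java
// Hypothetical example: all class names here are illustrative only.
class ReportDemo {
    public static void main(String[] args) {
        System.out.println(new SalesReport().render());
    }
}

abstract class Report {
    // final: vital to the contract of the class, so subclasses cannot replace it.
    public final String render() {
        return header() + "\n" + body() + "\n" + footer();
    }

    // protected, non-final: an intended extension point that can be overridden.
    protected String header() { return "REPORT"; }

    // protected, abstract: every subclass must supply this part.
    protected abstract String body();

    // private: only meaningful inside render(), not offered for reuse.
    private String footer() { return "-- end --"; }
}

class SalesReport extends Report {
    @Override
    protected String header() { return "SALES " + super.header(); }

    @Override
    protected String body() { return "total: 42"; }
}
```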

For fields, private is not really bad, as long as the field(s) can be reasonably "used" via appropriate methods (and that doesn't mean getXX() or setXX()!).

JimmyB
-1

"private" means: Not intended to be changed or accessed by anyone except the class itself. Not intended to be changed or accessed by subclasses. Subclasses? What subclasses? You are not supposed to subclass this!

"protected" means: Only intended to be changed or accessed by classes or subclasses. Likely deduction that you are supposed to subclass, otherwise why "protected" and not "private"?

There's a clear difference here. If I make something private, you are supposed to keep your dirty fingers off it. Even if you are a subclass.

gnasher729
-1

Do you know about an important use case where private instead of protected is a good tool, or would two options "protected & public" be enough for OOP languages?

Private: when you have something that will never be useful for any subclass to call or override.

Protected: when you have something that has a subclass-specific implementation/constant.

An example:

public abstract class MercedesBenz extends Car {
  //Might be useful for subclasses to know about their customers
  protected Customer customer; 

  /* Each specific model has its own horn. 
     Therefore: protected, so that each subclass might implement it as they wish
  */
  protected abstract void honk();

  /* Taken from the car class. */
  @Override
  public void getTechSupport(){
     showMercedesBenzHQContactDetails(customer);
     automaticallyNotifyLocalDealer(customer);
  }

  /* 
     This isn't specific for any subclass.
     It is also not useful to call this from inside a subclass,
     because local dealers only want to be notified when a 
     customer wants tech support. 
   */
  private void automaticallyNotifyLocalDealer(Customer customer){
    ...
  }
}
-2

It was hard for me to understand this matter, so I'd like to share a piece of my experience:

  • What is a protected field? It's nothing more than a field that can't be accessed from outside the class, i.e. publicly, like this: $classInstance->field. And the trick is that that's all there is to it. Your class's children will have full access to it, because it's rightfully part of their internals.
  • What is a private field? It's "truly private" to your very own class and your very own implementation of this class. "Keep out of reach of children", just like on a medicine bottle. You have a guarantee that your class's derivatives cannot override it, so your methods, when called, will have exactly what you declared.

UPDATE: a practical example from a real task I solved. Here it is: you have a hardware token, USB or LPT (the latter was my case), and you have middleware. The token asks you for a PIN code and opens up if it's correct; you can then send it the encrypted data and the number of the key to decipher it with. The keys are stored in the token; you cannot read them, only use them. There were also temporary session keys, signed by a key in the token but stored in the middleware itself. The temporary key was not supposed to leak anywhere outside; it was to exist only at the driver level. So I used private fields to store this temporary key and some hardware-connection-related data. Derivatives were able to use not just the public interface but also some protected "handy" subroutines I'd made for the task, yet they were unable to open the strongbox holding the keys and the hardware interaction. Makes sense?