Java instanceof and Clean architecture

Question

It seems to me that there is a conflict between clean architecture and the recommendation not to use instanceof. Consider the following code:

class ParentEntity {
}

class AEntity extends ParentEntity {
    List<Integer> entityIds;
}

class SaveUseCase {
    IEntityRepository entityRepository;
void save(List&lt;ParentEntity&gt; list) {
    entityRepository.save(list);
}

}

class EntityRepository implements IEntityRepository {
    void save(List<ParentEntity> list) {
        list.forEach(e -> {
            if (e instanceOf AEntity)
                validate((AEntity) e)

            // Do save e in the database
            ...
        }
    }

    void validate(AEntity a) {
        List<ParentEntity> list = a.getEntityIds().stream().map(id -> get(id))
        // Do some checking based on the value of list
        ...
    }

    ParentEntity get(int id) {
        // Do get ParentEntity with identifier id from the database
        ...
    }

}

The code has a usecase which calls the save method in the repository. The save method first checks the object only if the object is of type AEntity and then saves the object.

The problem is the use of instanceof in the save method of EntityRepository. If we want to prevent using instanceof, one solution is to make validate a method of ParentEntity and do the validation inside AEntity by overriding it. However, according to the clean architecture we have separated the entities and repositories, so inside entities we do not have access to get method of the repository, which is required for being able to do the validation.

The workaround to this is to put a reference to IEntityRepository (or at least something like GetUseCase) inside the entity so it can do the validation itself. But, this doesn't seem a very good idea to me, especially if we assume that validation is a logic of the repository and is there only to check, e.g., what other layers give to it as parameters are valid.

So, using clean architecture biases us to using instanceof and using it is not bad in scenarios like the one I mentioned. Am I right or am I misunderstanding something?

Update: I quote some sentences from here, that I think are related to my point of view:

Some forms of validation are more efficient at the database layer, especially when referential integrity checks are needed (e.g. to ensure that a state code is in the list of 50 valid states).

Some forms of validation must occur in the context of a database transaction due to concurrency concerns, e.g. reserving a unique user name has to be atomic so some other user doesn't grab it while you are processing.

I have seen some developers try to codify all the validation rules in the business layer, and then have the other layers call it to extract the business rules and reconstruct the validation at a different layer. In theory this would be great because you end up with a single source of truth. But I have never, ever seen this approach do anything other than needlessly complicate the solution, and it often ends very badly.

amon · Answer 1 · 2021-08-03T11:01:01.277

If you want to add a method to a class hierarchy without actually adding the method, consider the Visitor Pattern. You could create a validation visitor, and let each entity select the appropriate method of the visitor.

First, your ParentEntity class hierarchy would need a bit of boilerplate to support visitors:

interface EntityVisitor<T> {
  T visitA(AEntity a);
  T visitB(BEntity b);
}
class ParentEntity {
  <T> T accept(EntityVisitor<T> v);
}
class EntityA extends ParenEntity {
  ...
  @Override <T> T accept(EntityVisitor<T> v) {
    return v.visitA(this);
  }
}

Next, we can implement and use a visitor that performs validation.

class Validation implements EntityVisitor<Void> {
  EntityRepository repository;
  ...
  @Override Void visitA(AEntity a) { ... }
  @Override Void visitB(BEntity b) { ... }
}
class EntityRepository ... {
  void save(List<ParentEntity> list) {
    list.ForEach(e -> {
      e.accept(new Validation(this));
      ...
    });
  }
}

The validation visitor can have access to both the entity and the repository (in order to make further queries), and will therefore be able to perform the full validation.

Using such a pattern has advantages and disadvantages compared to an instanceof check and compared to moving the validation logic into the entities.

An instanceof is a much simpler solution, especially if you only have very few entity types. However, this could silently fail if you add a new entity type. In contrast, the visitor pattern will fail to compile until the accept() method is implemented in the new entity. This safety can be valuable.
While this pattern ends up having the same behaviour as adding a validate() method to the entities, an important difference is where that behaviour is located and how our dependency graph looks. With a validate() method, we would have a dependency from the entities to the repository, and would have referential integrity checks intermixed with actual business logic. This defeats the point of an Onion Architecture. The visitor pattern lets us break this dependency and lets us keep the validation logic separate from other business logic. The cost of this clearer design structure is extra boilerplate in the form of the EntityVisitor interface and the accept() method that must be added to all entities in the relevant class hierarchy.

Whether these trade-offs are worth it is your call. You know your codebase best, and you have the best idea how it might evolve.

However, performing validation based on the result of multiple queries can lead to data integrity problems. The repository should either make sure to use database transactions (and offer an API that clearly communicates when modifications have been committed), or the relevant integrity checks should be done within the database, e.g. using constraints in an SQL database. In some cases, the validation checks can also be expressed as part of an insert or update query.

score 2 · Answer 2 · answered Aug 02 '21 at 18:11

I know what I am about to answer is not exactly good practice but if you want to avoid instanceof's and have a generic way to call the respective method for that subclass you could use reflection:

Method m = EntityRepository.class.getMethod("validate", e.getClass());
m.invoke(this, e);

Of course, this will have a negative effect on performance and in some ways maintainability (with the only upside being less code).

Regarding the performance overhead, you can somewhat mitigate it by loading all the methods at startup:

Map<Class<?>, Method> methods = new HashMap<>();
Reflections reflections = new Reflections(ParentEntity.class, new SubTypesScanner());
Set<Class<? extends Animal>> subclasses = reflections.getSubTypesOf(ParentEntity .class);
for (Class<?> c : subclasses) {
    methods.put(c, Solution.class.getMethod("makeTalk",c));
}

Where Reflections comes from the Reflections library

And then just call the method using:

methods.get(e.getClass()).invoke(this, e)

score 1 · Answer 3 · answered Aug 03 '21 at 06:36

If EntityRepository already knows about the whole inheritance hierarchy and is already implemented in a non-generic fashion (as you wrote in a comment), I see absolutely no point in avoiding instanceOf. In case EntityRepository contains different, individual validations methods validate(AEntity a), validate(BEntity b) and validate(CEntity c), you have to write a new validation method either when a new child class DEntity arrives.

Of course, one could try here to create a generic validate(ParentEntity p) method, applicable to all kind of child classes. But how this methods should look like is pretty hard to tell by seeing only one example for a child class of ParentEntity. I would usually look at least at three different child classes, look which parts the of validation can ge generalized, which parts are different, and refactor the different parts out into the child classes (reachable through virtual methods).

score 1 · Answer 4 · edited Jan 11 '23 at 08:58

An instanceof solution has its own downsides, detailed among others on Why not use instanceof operator in OOP design? too, with regards to SOLD principles¹, namely the O and L principles, thus its loftier readability affecting its maintainability that could be improved refactoring the validation concern to its own class and changing the constructors' pseudo code so it registers validators for classes to validate leveraging the validators' dynamic selection without using instanceof operator.

public class EntityRepository {
private ValidatorRepository validationRepository;

public EntityRepository() {
    registerValidators();
}

private void registerValidators() {
    Function&lt;Entity, Boolean&gt; validator = (entity) -&gt; {

        List&lt;Entity&gt; list = entity.getEntityIds().stream().map(id -&gt; get(id)).collect(Collectors.toList());
        boolean validationResult = false;

        // Do some checking based on the value of list
        // ...

        return validationResult;
    };
    validationRepository.registerValidator(Entity.class, validator);
}

public void save(List&lt;? extends Entity&gt; entities) {

        entities.stream()
                .filter(entity -&gt; validationRepository.getValidator(entity.getClass())
                                                      .orElseGet(() -&gt; (e) -&gt; true)
                                                      .apply(entity) )
                .forEach(entity -&gt; {

                     // Do save entity in the database
                     // ...

                 });
}

private Entity get(int id) {
    // Do get ParentEntity with identifier id from the database
    return null;
}

}
public class ValidatorRepository {
private Map&lt;Class&lt;? extends Entity&gt;, Function&lt;Entity, Boolean&gt;&gt; validators;

public ValidatorRepository() {
    this.validators = new HashMap&lt;Class&lt;? extends Entity&gt;, Function&lt;Entity, Boolean&gt;&gt;();
}

public void registerValidator(Class&lt;? extends Entity&gt; classToValidate, Function&lt;Entity, Boolean&gt; validator) {
    validators.put(classToValidate, validator);
}

public Optional&lt;Function&lt;Entity, Boolean&gt;&gt; getValidator(Class classToValidate) {
    return Optional.ofNullable(validators.get(classToValidate));
}

}

(1) I intentionally omitted. One might argue that the Interface Segregation falls keener under the definition of a rule than the definition of a principle, hence SOLD instead of SOLID principles.

JimmyJames supports Canada · Answer 5 · 2021-08-02T21:06:19.097

I was hoping to get some clarification from my comment on your question but here's a few options you might what to consider. I think the most straightforward and least disruptive (to your design) option is to simply to add validate to ParentEntity like this:

class ParentEntity {
  public void validate(EntityRepository repo) {
    return;
  }
}

I'm going to assume there's something else going on in ParentEntity that you haven't shown us. Otherwise I would question why it's a class and not an interface.

This might violate one of Uncle Bob's rules (I don't know and I don't really much care) by exposing the repository to the entity instances. Regardless of that, I don't love it. It seems a little loose but on the other hand, it adds a lot of flexibility.

If that's a concern for you, you could simply pass the related entities to the ParentEntity. In order to make that work without an instanceof you could do something like this:

class ParentEntity {
  public List<Integer> getEntityIds() {
    return Collections.emptyList();
  }
public void validate(List<ParentEntity> associated) {
    return;
  }
}

And override getEntityIds() in AEntity. This is maybe a little better but still seems a little off to me. What I would probably do is a deeper redesign. You don't show where you decide to create an AEntity instead of a ParentEntity but wherever that happens, you would (naïvely) instantiate AEntity with references to the related entities. I say naïvely here because there's an immediate problem. If you can't instantiate a AEntity without it's dependencies, you can easily run into a cycle if an AEntity depends on other AEntity instances. Even if there never cycles, it can be a little complicated to instantiate them in the right order. I can think of few ways out of that mess.

Lazy loaded entities. Each entity is initialized with it's id only. When details are required, then they are retrieved.
Instantiate entities with a list of function references, each of which retrieves the actual entity when it's needed during validation.
Instantiate the AEntity with a list of strategy pattern objects. This is similar to 1, except the entities are not lazy-loaded necessarily. The strategy object will get them after the entities are created.

If you are interested in any of these three options, let me know and I can elaborate but I'm not going to spend time on that unless you are expressly interested.

score 0 · Answer 6 · answered Aug 03 '21 at 11:44

There is a hierarchical entity class hierarchy; ParentEntity best named (Base/Abstract)Entity or such.

The repository class manages many entity classes. It probably would be better to have <E extends ParentEntity> because the API usages now causes a lot of casts at usage points: Stream maps, instanceof, the new instanceof (java 14):

        if (e instanceOf AEntity ae) {
            validate(ae);
        }

Such a moloch being handling such a diversity is not good code either: from the outside there is no strictness: is entity FooEntity validating or not? You would need to read maybe hundreds of lines.

When validation is in general use for entities, but you do not want an overridable validate in the entity, you should use a strategy pattern, to map the entity class to validation handlers.

score 0 · Answer 7 · edited Jan 13 '23 at 19:59

While there are multiple solutions addressing various programming paradigms - e.g. functional programming, OOP, and procedural programming - to avoid usage of instanceof operator the reasons to consider before implementing a solution should be about the concern to be implemented that for the validation concern are whether it should be compulsory - are there entities eligible for skipping validation? - and whether it should be a concern on its own - could the validation be mingled with the persistence concern? - or not.

Following solution is a controversial implementation of strategy design pattern for optional validation concern. Controversial in the sense that enforces the way of usage, for this specific example the right usage being call supports method then call validate method only if supports method returns true.

public class EntityRepository {
private ValidatorRepository validationRepository;

public EntityRepository() {
    registerValidators();
}

public &lt;T extends Entity&gt; void save(List&lt;T&gt; entities) {

    for (T entity : entities) {
        Validator&lt;T&gt; validator = validationRepository.getValidator(entity.getClass());
        if ( validator == null 
             || validator.validate(entity) ) {
            // Do save entity in the database
        }
    }
}

private void registerValidators() {

    Validator&lt;AEntity&gt; validator = new Validator&lt;AEntity&gt;() {

        @Override
        public boolean supports(Class classToValidate) {

            return AEntity.class.getName().equals(classToValidate.getName());
        }

        @Override
        public boolean validate(AEntity toValidate) {
            List&lt;AEntity&gt; entities = new ArrayList&lt;AEntity&gt;();

            for (int id : toValidate.getEntityIds()) {
                entities.add(get(id));
            }

            boolean validationResult = false;
            // Do some checking based on the value of list
            return validationResult;
        }
    };

    validationRepository.registerValidator(validator);
}

private AEntity get(int id) {
    // Do get ParentEntity with identifier id from the database
    return null;
}

}

A solution that introduces following validation specific class and interface.

public class ValidatorRepository {
private Set&lt;Validator&gt; validators;

public boolean registerValidator(Validator validator) {

    return validators.add(validator);
}

public Validator getValidator(Class forClass) {

    Validator toReturn = null;

    for ( Validator validator : this.validators) {
        if ( validator.supports(forClass)) {
            toReturn = validator;
            break;
        }
    }

    return toReturn;
}

}
public interface Validator<T> {
boolean supports(Class&lt;T&gt; classToValidate);

boolean validate(T toValidate);

}

score 0 · Answer 8 · answered Dec 24 '21 at 11:06

From my perspective you have a design problem here.

The problem you are facing could easily be circumvented if you take validation, which is from my POV a separate concern out of the repository layer - esp. out of the save() method which now does two things -namely validation and persistence- instead of one.

save() should just save. Whether or not the objects need to validate before should be handled earlier. There should be an implicit contract that ensures: everything which enters the persistence layer, should be dealt with as is, i.e. every object is in a good / valid state.

Where you deal with validation is up to you. For example you could use a Builder which ensures only valid instances are created.

By calling the validation logic before the actual object is created, you can be guaranteed that every Book instance created by the Builder has a valid state.

And your problem vanishes.

score 0 · Answer 9 · answered Jan 14 '23 at 12:08

You have conflated two separate paradigms.

Object Oriented Programming.

Principle. Methods go on the object with the data, Don't use reflection or instanceof to work out the type because we have provided inheritance and overridden methods to have various different functions run on different objects.

Its normal to have a list of objects and not know the type of each, call object.validate() on each and the object will run the correct function.
Clean Architecture

Principle. Methods go on use case objects, not the 'entity' so that we can better separate concerns. Don't have entities inherit from each other or a base class because we always need to know what type of entity we are dealing with in order to run the correct usecase.

Its not normal to have a list of entities of different types. They don't share a base class. If for some reason you do, its fine to use instance of, because you don't want to put the method on the entity and so cant use overrides.

The contradiction isnt between "dont use instanceof" and "clean architecture". its between "clean architecture" and "OOP"

Java instanceof and Clean architecture

9 Answers9

From my perspective you have a design problem here.