
I have been applying domain-driven design for about 8 years now, and even after all these years there is still one thing that bugs me: checking for a unique record in the data store against a domain object.

In September 2013, Martin Fowler wrote about the TellDontAsk principle, which should be applied to domain objects wherever possible; the object then signals how the operation went (in object-oriented design this is mostly done by throwing an exception when the operation fails).

My projects are usually divided into several modules, two of which are Domain (containing business rules and nothing else; the domain is completely persistence-ignorant) and Services, which know about the repository layer used to CRUD data.

Because the uniqueness of an attribute belonging to an object is a domain/business rule, it should belong to the domain module, so the rule lives exactly where it is supposed to be.

In order to check the uniqueness of a record, you need to query the current dataset, usually a database, to find out whether another record with, say, the same Name already exists.

Since the domain layer is persistence-ignorant and only knows how to operate on the data, not how to retrieve it, it cannot really touch the repositories itself.

The design I have been using looks like this:

class ProductRepository
{
    // throws Repository.RecordNotFoundException
    public Product GetBySKU(string sku);
}

class ProductCrudService
{
    private ProductRepository pr;

    public ProductCrudService(ProductRepository repository)
    {
        pr = repository;
    }

    public void SaveProduct(Domain.Product product)
    {
        try {
            pr.GetBySKU(product.SKU);

            throw new Service.ProductWithSKUAlreadyExistsException("msg");
        } catch (Repository.RecordNotFoundException e) {
            // suppress/log exception
        }

        pr.MarkFresh(product);
        pr.ProcessChanges();
    }
}

This leads to services defining domain rules instead of the domain layer itself, and to the rules being scattered across multiple parts of your code.

I mentioned the TellDontAsk principle because, as you can clearly see, the service offers an action (it either saves the Product or throws an exception), but inside the method you operate on objects in a procedural way.

The obvious solution is to create a Domain.ProductCollection class with an Add(Domain.Product) method throwing the ProductWithSKUAlreadyExistsException, but it performs poorly, because you would need to load all Products from the data store just to find out in code whether one of them already has the same SKU as the Product you are trying to add.

How do you solve this specific issue? It is not really a problem per se; I have had the service layer represent certain domain rules for years, and it usually also hosts the more complex domain operations. I am simply wondering whether you have stumbled upon a better, more centralized solution during your career.

blunova
Andy

2 Answers


Since the domain layer is persistence-ignorant and only knows how to operate on the data, not how to retrieve it, it cannot really touch the repositories itself.

I would disagree with this part. Especially the last sentence.

While it is true that the domain should be persistence-ignorant, it does know that there is a "collection of domain entities", and that there are domain rules that concern this collection as a whole, uniqueness being one of them. And because the implementation of the actual logic depends heavily on the specific persistence mechanism, there must be some kind of abstraction in the domain that expresses the need for this logic.

So it is as simple as creating an interface that can query whether a name already exists, which is then implemented in your data store and called by whoever needs to know if the name is unique.
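A minimal sketch of that arrangement (the interface and class names here are my own invention, not anything from the question): the domain declares the query it needs, and the persistence side implements it with whatever lookup the data store supports. An in-memory set stands in for the real database query.

```csharp
using System;
using System.Collections.Generic;

namespace Domain
{
    // The domain owns only the abstraction; it stays persistence-ignorant.
    public interface IProductUniquenessChecker
    {
        bool IsSkuTaken(string sku);
    }

    public class ProductWithSKUAlreadyExistsException : Exception
    {
        public ProductWithSKUAlreadyExistsException(string message) : base(message) { }
    }

    public class Product
    {
        public string SKU { get; }

        // The rule "SKU must be unique" lives here, in the domain.
        public Product(string sku, IProductUniquenessChecker uniqueness)
        {
            if (uniqueness.IsSkuTaken(sku))
                throw new ProductWithSKUAlreadyExistsException(
                    "A product with SKU '" + sku + "' already exists.");
            SKU = sku;
        }
    }
}

namespace Persistence
{
    // The data-store side implements the abstraction; here an in-memory
    // set stands in for an indexed database lookup.
    public class InMemoryUniquenessChecker : Domain.IProductUniquenessChecker
    {
        private readonly HashSet<string> knownSkus;

        public InMemoryUniquenessChecker(IEnumerable<string> skus)
        {
            knownSkus = new HashSet<string>(skus);
        }

        public bool IsSkuTaken(string sku) => knownSkus.Contains(sku);
    }
}
```

The domain rule stays in the Product constructor; only the implementation of the lookup lives outside the domain module.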

And I would like to stress that repositories are DOMAIN services. They are abstractions around persistence. It is the implementation of the repository which should be separate from the domain. There is absolutely nothing wrong with a domain entity calling a domain service. There is nothing wrong with one entity being able to use a repository to retrieve another entity, or to retrieve some specific information that cannot readily be kept in memory. This is a reason why Repository is a key concept in Evans' book.

Euphoric

You need to read Greg Young on set validation.

Short answer: before you go too far down the rat's nest, you need to make sure that you understand the value of the requirement from the business perspective. How expensive is it, really, to detect and mitigate the duplication, rather than preventing it?

The problem with “uniqueness” requirements is that, well, very often there’s a deeper underlying reason why people want them -- Yves Reynhout

Longer answer: I've seen a menu of possibilities, but they all have tradeoffs.

You can check for duplicates before sending the command to the domain. This can be done in the client, or in the service (your example shows the technique). If you aren't happy with the logic leaking out of the domain layer, you can achieve the same sort of result with a DomainService.

class Product {
    void register(SKU sku, DuplicationService skuLookup) {
        if (skuLookup.isKnownSku(sku)) {
            throw new ProductWithSKUAlreadyExistsException(...);
        }
        ...
    }
}

Of course, done this way the implementation of the DuplicationService is going to need to know something about how to look up the existing skus. So while it pushes some of the work back into the domain, you are still faced with the same basic problems (needing an answer for the set validation, problems with race conditions).

You can do the validation in your persistence layer itself. Relational databases are really good at set validation. Put a uniqueness constraint on the sku column of your product table, and you are good to go. The application just saves the product into the repository, and a constraint violation bubbles back up if there is a problem. So the application code looks good, and your race condition is eliminated, but you've got "domain" rules leaking out.
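As a hedged illustration of that flow, the sketch below uses an in-memory set as a stand-in for a table with a unique index, and a made-up UniqueConstraintViolationException standing in for whatever your provider actually raises (SQL Server, for instance, reports unique-index violations via SqlException). The service does no pre-check at all; it simply translates the storage-level violation into an application-level error.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the provider-specific exception a real database would throw
// when a UNIQUE constraint is violated.
public class UniqueConstraintViolationException : Exception { }

public class ProductRepository
{
    // HashSet.Add returning false models the database rejecting a duplicate
    // key; a real table would enforce this via CREATE UNIQUE INDEX.
    private readonly HashSet<string> store = new HashSet<string>();

    public void Save(string sku)
    {
        if (!store.Add(sku))
            throw new UniqueConstraintViolationException();
    }
}

public class ProductCrudService
{
    private readonly ProductRepository repository;

    public ProductCrudService(ProductRepository repository)
    {
        this.repository = repository;
    }

    public void SaveProduct(string sku)
    {
        try
        {
            repository.Save(sku); // no pre-check: the constraint IS the check
        }
        catch (UniqueConstraintViolationException)
        {
            throw new InvalidOperationException(
                "Product with SKU '" + sku + "' already exists.");
        }
    }
}
```

No read-before-write means no race window between the check and the insert; the trade-off, as noted above, is that the uniqueness rule now lives in the schema rather than the domain.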

You can create a separate aggregate in your domain that represents the set of known skus. I can think of two variations here.

One is something like a ProductCatalog; products exist somewhere else, but the relationship between products and skus is maintained by a catalog that guarantees sku uniqueness. Note that this implies that products don't have skus; skus are assigned by a ProductCatalog (if you need skus to be unique, you achieve this by having only a single ProductCatalog aggregate). Review the ubiquitous language with your domain experts -- if such a thing exists, this could well be the right approach.

An alternative is something more like a sku reservation service. The basic mechanism is the same: an aggregate knows about all of the skus, so can prevent the introduction of duplicates. But the mechanism is slightly different: you acquire a lease on a sku before assigning it to a product; when creating the product, you pass it the lease to the sku. There's still a race condition in play (different aggregates, therefore distinct transactions), but it's got a different flavor to it. The real downside here is that you are projecting into the domain model a leasing service without really having a justification in the domain language.

You can pull all product entities into a single aggregate -- i.e., the product catalog described above. You absolutely get uniqueness of the skus when you do this, but the cost is additional contention: modifying any product really means modifying the entire catalog.

I don't like the need to pull all product SKUs out of the database to do the operation in-memory.

Maybe you don't need to. If you test your sku with a Bloom filter, you can discover many unique skus without loading the set at all.

If your use case allows you to be arbitrary about which skus you reject, you could punt away all of the false positives (not a big deal if you allow the clients to test the skus they propose before submitting the command). That would allow you to avoid loading the set into memory.

(If you wanted to be more accepting, you could consider lazy loading the skus in the event of a match in the bloom filter; you still risk loading all the skus into memory sometimes, but it shouldn't be the common case if you allow the client code to check the command for errors before sending).
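For concreteness, here is a deliberately small Bloom filter sketch (the bit-array size and hash derivation are arbitrary choices of mine, not tuned for any workload). The useful property is the asymmetry: a false answer from MightContain is definitive, so a fresh sku can be accepted without loading the stored set, while a true answer merely forces a real lookup.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public class SkuBloomFilter
{
    private readonly BitArray bits;
    private readonly int hashCount;

    public SkuBloomFilter(int sizeInBits = 1 << 16, int hashCount = 4)
    {
        bits = new BitArray(sizeInBits);
        this.hashCount = hashCount;
    }

    public void Add(string sku)
    {
        foreach (int i in Indexes(sku))
            bits[i] = true;
    }

    // false: sku is definitely not in the set (no storage lookup needed).
    // true:  sku *might* be in the set (false positives are possible).
    public bool MightContain(string sku)
    {
        foreach (int i in Indexes(sku))
            if (!bits[i]) return false;
        return true;
    }

    private IEnumerable<int> Indexes(string sku)
    {
        // Double hashing: derive hashCount bit positions from two base hashes.
        int h1 = StringComparer.Ordinal.GetHashCode(sku);
        int h2 = StringComparer.OrdinalIgnoreCase.GetHashCode(sku) | 1;
        for (int k = 0; k < hashCount; k++)
            yield return (int)((uint)(h1 + k * h2) % (uint)bits.Length);
    }
}
```

A filter like this would be populated from the existing skus at startup (or maintained as a projection); only the "maybe" path ever touches the database.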

VoiceOfUnreason