I've been reading on event sourcing lately and really like the ideas behind it but am stuck with the following problem.

Let's say you have N concurrent processes which receive commands (e.g. web servers), generate events as a result and store them in a centralised store. Let's also assume that all transient application state is maintained in the memory of the individual processes by sequentially applying events from the store.

Now, let's say we have the following business rule: each distinct user must have a unique username.

If two processes receive a user registration command for the same username X, they both check that X isn't in their list of usernames, the rule validates for both processes and they both store a "new user with username X" event in the store.

We have now entered an inconsistent global state because the business rule is violated (there are two distinct users with the same username).
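For concreteness, here is a rough sketch of the check-then-append sequence each process runs (the names are made up for illustration); because the in-memory check and the append to the store are not atomic across processes, both processes can pass the check before either event reaches the store:

    // Illustration only (made-up names): the in-memory check and the append to
    // the shared store are two separate steps, so two processes can both pass
    // the check for "X" before either event reaches the store.
    object NaiveRegistrationHandler {
      private var knownUsernames: Set[String] = Set.empty    // rebuilt by replaying the event store

      def handle(username: String, appendToStore: String => Unit): Unit =
        if (!knownUsernames.contains(username)) {             // both processes pass this check
          appendToStore(s"UserRegistered($username)")         // ...so both store the event
          knownUsernames += username
        }
    }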

In a traditional N server <-> 1 RDBMS style system, the database is used as a central point of synchronisation which helps prevent such inconsistencies.

My question is: how do event sourced systems typically approach this problem? Do they simply process every command sequentially (e.g. by limiting the number of processes that can write to the store to 1)?

2 Answers

In a traditional N server <-> 1 RDBMS style system, the database is used as a central point of synchronisation which helps prevent such inconsistencies.

In event sourced systems, the "event store" serves the same role. For an event sourced object, your write is an append of your new events to a particular version of the event stream. So, just as with concurrent programming, you could acquire a lock on that history while processing the command. It's more common, though, for event sourced systems to take an optimistic approach: load the previous history, calculate the new history, then compare-and-swap. If some other command has also written to that stream, your compare-and-swap fails. From there, you either rerun the command, abandon it, or perhaps even merge your results into the history.

Contention becomes a major problem if all N servers with their M commands are trying to write into a single stream. The usual answer here is to allocate a history to each event sourced entity in your model. So User(Bob) would have a distinct history from User(Alice), and writes to one won't block writes to the other.
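A minimal in-memory sketch of that optimistic scheme (all names below are hypothetical, not any particular event store's API): each stream carries a version, an append only succeeds if the caller's expected version still matches, and User(Alice) and User(Bob) live under separate stream keys so their writes never contend:

    // Minimal in-memory event store (hypothetical API, illustration only):
    // optimistic concurrency via an expected-version check, one stream per entity.
    import java.util.concurrent.ConcurrentHashMap

    final case class StreamState(version: Long, events: Vector[String])

    class InMemoryEventStore {
      private val streams = new ConcurrentHashMap[String, StreamState]()

      /** Append only if the stream is still at `expectedVersion`; returns false
        * when another writer committed first (the compare-and-swap failure). */
      def append(streamId: String, expectedVersion: Long, newEvents: Vector[String]): Boolean = {
        val current = streams.getOrDefault(streamId, StreamState(0L, Vector.empty))
        if (current.version != expectedVersion) false
        else {
          val next = StreamState(current.version + newEvents.size, current.events ++ newEvents)
          if (!streams.containsKey(streamId)) streams.putIfAbsent(streamId, next) == null
          else streams.replace(streamId, current, next)
        }
      }

      def read(streamId: String): StreamState =
        streams.getOrDefault(streamId, StreamState(0L, Vector.empty))
    }

    // Usage: User(Alice) and User(Bob) live in separate streams, so these two
    // appends never conflict; two writers racing on the *same* stream would see
    // one append return false and would reload and retry (or abandon, or merge).
    // val store = new InMemoryEventStore()
    // store.append("user-alice", 0L, Vector("UserRegistered(alice)"))
    // store.append("user-bob",   0L, Vector("UserRegistered(bob)"))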

My question is: how do event sourced systems typically approach this problem? Do they simply process every command sequentially?

Greg Young on Set Validation

Is there an elegant way to check unique constraints on domain object attributes without moving business logic into service layer?

Short answer: in many cases, investigating that requirement more deeply reveals that either (a) it's a poorly understood proxy for some other requirement, or (b) violations of the "rule" are acceptable if they can be detected (exception report), mitigated within some time window, or are low frequency (e.g. clients can check whether a name is available before dispatching a command to use it).

In some cases, where your event store is good at set validation (i.e. a relational database), you can implement the requirement by writing to a "unique names" table in the same transaction that persists the events.
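A hedged sketch of what that could look like over JDBC, assuming hypothetical tables events(stream_id, version, payload) and unique_usernames(username), the latter carrying a UNIQUE constraint on the username column:

    // Claiming the name and appending the event happen in one transaction,
    // so a duplicate username rolls both back. Table and column names are
    // assumptions for the example, not a standard schema.
    import java.sql.{Connection, SQLException}

    object UniqueNameRegistration {
      def register(conn: Connection, streamId: String, expectedVersion: Long, username: String): Boolean = {
        conn.setAutoCommit(false)
        try {
          val claim = conn.prepareStatement("INSERT INTO unique_usernames (username) VALUES (?)")
          claim.setString(1, username)
          claim.executeUpdate()                     // violates the UNIQUE index if the name is taken

          val append = conn.prepareStatement(
            "INSERT INTO events (stream_id, version, payload) VALUES (?, ?, ?)")
          append.setString(1, streamId)
          append.setLong(2, expectedVersion + 1)
          append.setString(3, s"""{"type":"UserRegistered","username":"$username"}""")
          append.executeUpdate()

          conn.commit()
          true
        } catch {
          case _: SQLException =>
            conn.rollback()                         // duplicate username: nothing is persisted
            false
        }
      }
    }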

In some cases, you can only enforce the requirement by having all of the user names published into the same stream (which allows you to evaluate the set of names in memory, as part of your domain model). In that case, two processes will attempt to update "the" stream history, but one of the compare-and-swap operations will fail, and the retry of that command will be able to detect the conflict.
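Continuing the earlier in-memory sketch (same hypothetical InMemoryEventStore), the single-stream variant might look roughly like this: the handler rebuilds the set of taken names from the one "usernames" stream, and a failed append means another writer committed first, so the retry re-checks against the refreshed history:

    // Single-stream set validation (hypothetical names, reusing the
    // InMemoryEventStore sketch above): the set of taken names is rebuilt from
    // one stream, and a failed compare-and-swap append triggers reload-and-retry.
    object SingleStreamRegistration {
      def register(store: InMemoryEventStore, username: String): Boolean = {
        val history = store.read("usernames")
        val event   = s"UserRegistered($username)"
        if (history.events.contains(event)) false                            // name already claimed
        else if (store.append("usernames", history.version, Vector(event))) true
        else register(store, username)                                       // append lost the race: re-check
      }
    }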

VoiceOfUnreason

It sounds like you could implement a business process (a saga, in the context of Domain-Driven Design) for the user registration, where the user is treated like a CRDT.
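As a rough illustration of the state-based (CvRDT) side of that idea, here is a sketch of a "claim register" for usernames (all names invented for the example): each replica records its claim, and the merge function deterministically keeps one winner, so replicas converge on the same owner without coordination, and the losing registration can then be compensated by the saga:

    // State-based CRDT (CvRDT) sketch: merge keeps, per username, the claim that
    // sorts lowest (earliest timestamp, ties broken by replicaId), so every
    // replica converges on the same owner. Names are illustrative only.
    final case class Claim(timestamp: Long, replicaId: String, userId: String)

    final case class UsernameClaims(claims: Map[String, Claim]) {
      def claim(username: String, c: Claim): UsernameClaims =
        merge(UsernameClaims(Map(username -> c)))

      // Commutative, associative and idempotent, as a CvRDT merge must be.
      def merge(other: UsernameClaims): UsernameClaims =
        UsernameClaims((claims.keySet ++ other.claims.keySet).map { name =>
          val winner = (claims.get(name).toList ++ other.claims.get(name).toList)
            .minBy(c => (c.timestamp, c.replicaId))
          name -> winner
        }.toMap)

      def ownerOf(username: String): Option[String] = claims.get(username).map(_.userId)
    }

    // Usage: both merge orders agree on who owns "X"; the losing registration is
    // then rejected or compensated by the saga.
    // val a = UsernameClaims(Map.empty).claim("X", Claim(10L, "node-A", "alice"))
    // val b = UsernameClaims(Map.empty).claim("X", Claim(12L, "node-B", "bob"))
    // assert(a.merge(b) == b.merge(a))   // owner of "X" is "alice" in both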

Resources

  1. Akka Distributed Data documentation: https://doc.akka.io/docs/akka/current/distributed-data.html (archived copy: http://archive.is/t0QIx)

  2. "CRDTs with Akka Distributed Data" https://www.slideshare.net/markusjura/crdts-with-akka-distributed-data to learn about

    • CmRDTs - operation-based CRDTs
    • CvRDTs - state-based CRDTs
  3. Code examples in Scala: https://github.com/akka/akka-samples/tree/master/akka-sample-distributed-data-scala. Maybe the "shopping cart" sample is the most suitable.

  4. Tour of Akka Cluster – Akka Distributed Data https://manuel.bernhardt.io/2018/01/03/tour-akka-cluster-akka-distributed-data/