
How can I uniformly and deterministically distribute a set of GUIDs to N buckets?

  1. N can be as small as 2.
  2. Need to make sure the same GUID is always mapped to the same bucket.
  3. Can't use any additional memory.
  4. The input GUID set is not known in advance; the GUIDs will be generated using the common functions available in the standard libraries of C#, Java, etc.

BucketId = GUID random part % N would satisfy the consistency requirement, but I am not sure whether it would be uniform.

1 Answer


(Note: you probably mean "to N buckets". Or "to a group of buckets of size N".)

"GUID random part % N" is the most uniform you can ever hope for.

Lack of uniformity will only be evident in a small data set, in which performance does not matter anyway. In a large data set, where performance matters, it will be quite uniform.
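As a rough illustration of both points, here is a minimal Python sketch. It assumes version 4 (random) GUIDs, for which the low 62 bits of the 128-bit value are random. The names `bucket_for`, `relative_spread` and `RANDOM_LOW_BITS_MASK` are made up for this example and are not part of any library.

```python
import uuid
from collections import Counter

# Assumption: version 4 (random) GUIDs. Their low 62 bits are random;
# the two bits directly above them hold the fixed variant marker.
RANDOM_LOW_BITS_MASK = (1 << 62) - 1


def bucket_for(guid: uuid.UUID, n: int) -> int:
    """Map a GUID to a bucket in [0, n); the same GUID always gives the same bucket."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (guid.int & RANDOM_LOW_BITS_MASK) % n


def relative_spread(num_guids: int, n: int) -> float:
    """Generate num_guids random GUIDs and return (max load - min load) / mean load."""
    counts = Counter(bucket_for(uuid.uuid4(), n) for _ in range(num_guids))
    loads = [counts[b] for b in range(n)]  # Counter returns 0 for empty buckets
    return (max(loads) - min(loads)) / (num_guids / n)


if __name__ == "__main__":
    # The exact numbers differ on every run because the GUIDs are random.
    for size in (100, 10_000, 100_000):
        print(size, round(relative_spread(size, 4), 4))
```

`bucket_for` keeps no state, so it needs no additional memory; the `Counter` exists only to measure the spread. On a typical run the relative spread is large for 100 GUIDs and small for 100,000.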

Of course, these are random numbers that we are talking about, so absolute uniformity (all buckets having the exact same load) is practically impossible.
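To put a rough number on this, assume (as an idealization of the random part) that each GUID lands in each of the N buckets independently with probability 1/N. Writing M for the number of GUIDs, a symbol introduced only for this estimate, the load X of a single bucket is then binomially distributed:

$$
X \sim \mathrm{Binomial}\left(M, \tfrac{1}{N}\right), \qquad
E[X] = \frac{M}{N}, \qquad
\sigma_X = \sqrt{M \cdot \frac{1}{N}\left(1 - \frac{1}{N}\right)}, \qquad
\frac{\sigma_X}{E[X]} = \sqrt{\frac{N-1}{M}}
$$

For N = 10, this gives a typical relative deviation of about 30% with M = 100 GUIDs, but only about 0.3% with M = 1,000,000.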

So, when you speak of uniformity, and knowing that you cannot have absolute uniformity, you must be willing to accept some "good enough" uniformity. Which in turn means that you must have some particular uniformity requirements in mind. Please tell us your requirements, and why you arrived at them.

What, you don't have any such requirements? You are just worried before you have an actual problem in your hands? Well then, I would recommend that you stop worrying about the uniformity offered by "GUID random part % N". In all likelihood it will be much better than what you would ever need.

(And, in any case, unbeatable.)

Mike Nakis