21

For storing a 128 bits UUID there are multiple storage options:

  1. a byte[16] column
  2. two bigint/long(64 bits) columns
  3. a CHAR(36) column - 32 hex digits + 4 dashes.
  4. a UUID database specific column, if db supports it

From an indexing point of view which of those are the most efficient? If the db doesn't support a dedicated uuid type which of 1, 2, 3 are the best candidates?

Vlad Mihalcea
  • 917
  • 3
  • 9
  • 23

4 Answers4

22

A dedicated uuid type is your best bet for PostgreSQL. Hard to say with other DBs - it's not impossible for someone to impliment a uuid type that's stored less efficiently than a simple byte type.

Again in PostgreSQL, bytea would be a reasonable way to store UUIDs if you didn't have the uuid type. For other DBs it depends on how they store binary data.

Where possible I'd strongly avoid using hex-with-dashes. It's way less efficient to compare, sort, and store.

So really, "not (2) or (3)". Ever. Use (4) where supported, (1) otherwise.

Craig Ringer
  • 57,821
  • 6
  • 162
  • 193
4

In preference order: 4,1,2,3 Do not use UUIDs as the clustering key if using SQL server as, not only will it fragment badly, the clustering key is used in all non clustered indexes and you'd add those bytes to each index row. Fragmentation can be mitigated by using NEWSEQUENTIALID but usually prefer a bingint identity for your Clustering Key over a GUID to prevent bloat in other indexes.

The difference between choosing 1 over 2 will depend how more efficient the database handles two columns of basic types over a single column fixed array. It should be easy enough to test with dummy data. Look at the speed of your queries as well as the size of indexes and data. Small+fast is the best!

GilesDMiddleton
  • 236
  • 1
  • 2
  • 8
1

One would have to suppose that any natively-supported data type would be better optimised in the product than anything that could be put together as a client of that product. After that, whatever has the smallest byte count so you get the maximum rows per page.

Michael Green
  • 25,255
  • 13
  • 54
  • 100
0

Very late, I know, Vlad. UUIDv7 is excellent for database inserts as it supports decentralized generation as well as efficient clustered index insertion.

Andrew G
  • 61
  • 3