9

I am trying to design the DB schema of my new project and I need some suggestions. I am thinking to have one extra column for some tables which is a UUID. I wanted mainly for tables which contain data which are accessible through REST, so I don't want to expose the primary key. My concern is that when I want to update a table, I need to execute one more query just to take the original primary key.

What do you think?

pik4
  • 385

5 Answers5

12

The biggest myth when designing applications is that you are only allowed to have one key.

Have multiple keys, go ahead and do it. Your application is allowed to have a different primary key than your database; I know this may sound strange at first, but you get the best of both worlds going this route.

UUIDs have many advantages, but make generally horrible clustered indexes. So, you can just use a int as a PK/clustered index in your DB, but also have another unique index on your external id column. This will may require an extra join for many queries, but that's okay, because in relational DBs, inner joins on auto-incremented int columns are blazing fast.

Another advantage to this dual key approach is the ability to easily geographically distribute your DB. Since UUIDs are globally unique, two geographically separated DBs will not have to slow down from coordinating their keys, as long as you make sure the DB int PK never leaves the DB and is not used by the rest service.

Here's another point, your external Id doesn't even have to be a UUID, there can be other options too.

The external can be an int as well, or a string, or the natural key of the entity; anything. This would make your urls less ugly than a UUID

Don't be afraid to have multiple keys, hiding your DB key from your rest api consumers is a good thing.


The most keys I have ever used has been two, but in the right situation, I can imagine up to four being justified (DB key, application key, business key, customer key)

8

There is no point having two primary keys.

  1. UUIDs are slow to index.

No they are not. Stop worrying about arcane numbers like pages and fragmentation and do some tests.

https://www.mssqltips.com/sqlservertip/5105/sql-server-performance-comparison-int-versus-guid/

http://byterot.blogspot.com/2013/02/performance-series-guids-vs-identity-rdbms-sql.html

https://www.cybertec-postgresql.com/en/int4-vs-int8-vs-uuid-vs-numeric-performance-on-bigger-joins/

In any case, If you have both and index the UUID then you have the same problem

  1. UUIDs take up more space. well yes, as long as you can get away with those low max number ints.

In any case if you have both its definitely bigger

  1. UUIDs are ugly. Well you wouldn't want to type one, but do you want to type 13243444444431 or Gdih£$%d1 ? if you need a pretty url for SEO or some other reason, then there are better solutions than adding more id columns to your table.

Sequential ints have some major issues which are solved with UUIDs one of which is guessing the next id or how many orders you have taken.

These are real security and commercial concerns and should, on their own be enough to justify not using auto_inc ints at all.

Don't try to patch over the problems with hacky solutions, just replace them with the industry standard.

Ewan
  • 83,178
1

If your only goal is to hide the primary key from your REST interface, then I'd advice against it. Instead, encrypting your keys would be a good solution for this.

1

This is a common problem that I've come across myself many times. I ended up going down the "just use a number for the id" route, meaning it's not quite as ugly to the clients to see those UUIDs.

The only other way, as you already eluded to, is to hide that ID behind some kind of query, which you'll have to feed parameters to in order to retrieve the correct UUID, however, that means you may as well be using those parameters as the primary key for table!

I'd say there's no way to do REST (i.e. state transfer of an model) without being able to identify the instance you're RESTing, so you're going to need a key, whatever form that key takes.

If your concern is due to security, I see the problem as is in your server message validation code, not that the client sees Ids. If a hacker wants to load some data from your table, they won't need to ids to do so. And you ought be validating the requests too, so no dodgy stuff comes through to your business layers anyway, i.e. Authentication and Authorisation of messages.

To come back to your question:

"accessible through REST, so I don't want to expose the primary key"

I think that it's the opposite, you NEED to expose the primary key in order to do REST.

EDIT: As @Murph pointed out, it actually DOES give attackers a better chance if you give Ids to the client - I did not know this. It may be that other answers here (I'm looking at the shortest one saying to encrypt the key) may be more correct to your situation. Edit 2: Seems UUIDs are no safer than ints...

Dan Rayson
  • 182
  • 7
1

The problem with using primary keys as the public identifier for your entity is that youre leaking details of your data storage solution to your clients (and consequently, all the way through your application). This can have security implications, aswell as meaning that changes to your applications backend can be more complicated. UUIDs go some way to solving this problem by providing an alternative identifier that is entirely independent of your database technology.

UUIDs are just one solution to this problem, with the inherent advantage that for all practical purposes they are guaranteed to be unique. However, if you have gone to the effort of using something other than the primary key as your identifier, you might find that using an arbitrary string or just another numerical identifier provides nicer urls.

One other consideration is that if you're using GUIDs/UUIDs as the public identifier for your entity, you will likely need some sort of index for them. Given that they take up substantially more space than integers, your index will be much larger.

richzilla
  • 1,153