2

We are designing a new system to handle customers at our firm and me and another developer are split on wether we should use the customers personal numbers as a primary key. My preference would be to auto increment a unique id (int) for each new customer.

Currently all customers we add do have a personal number but are there any good examples of why this may be a good or bad implementation?

One argument is that using an incremental id would simply add an extra field which is unnecessary and take up more space.

The personal number is stored as a varchar and the customer table holds close to a million customers. The personal number or id would be added as a foreign key for orders and customer support requests.

How do I know whether these are good or bad implementations? How would I argue either side?

Matkey
  • 21
  • 3

3 Answers3

3

Some good criteria for choosing or designing keys are: Simplicity, Stability and Familiarity but first you need to understand the fundamental business requirement your key is supposed to address. Ask yourself how you intend to identify customers in your business processes. Assuming you have a requirement to identify customers and Customer Number is the means your business will use to do that then it makes perfect sense as a key in your customer table - not necessarily the only key, but certainly a key of that table.

nvogel
  • 3,807
  • 21
  • 25
2

What we do in the BI world is keep all Natural Keys from source systems while concurrently using a DB generated ID (Surrogate Key). Your person number would be the NK in this terminology. Keep it to allow trace-ability and change tracking when, or if, this number ever changes. You retain all historic/source system IDs, while enabling a new and more efficient/controlled PK in your solution.

Use the incrementing integer for performance gains in joins, simplicity of querying, and DB consistency. When you control the ID you can guarantee uniqueness and when/how to assign this. With an externally provided ID you have no control over potential changes or clashes.

I wouldn't think you could have a space concern. Even a rough estimate of the space needed to store 1 million 8 byte integers in a standard disk size, say terabytes, is inconsequential.

Dave
  • 2,399
  • 1
  • 12
  • 20
1

What you are describing is generally bad practice. You mention dashes in the number but don't mention what the number is. I'm guessing that it's something along the lines of a social security number or a phone number.

If it's a social security number then there may be privacy concerns with using this number. You won't be able to mask the number for security purposes without destroying any foreign key constraints you have in place.

If it's a phone number, what is going to be done when someone's phone number changes or they have multiple phone numbers? (I.e., they gave you their work number but now want to use their cell).

Both of these also have concerns as the number can be determined through external databases (phone books, tax records, etc.).

I would suggest that you use an auto-incrementing ID for the primary key of these entities. If you use this as the key for the clustered index then that will have the added benefit of low fragmentation as new entities are added at the end of the table rather than needing to be inserted at random.

Jonathan Fite
  • 9,414
  • 1
  • 25
  • 30