
I have already seen some posts comparing varchar with numeric fields, but most only compare equality of values and do not talk about performance as primary keys.

I have a user registration table where each user needs a unique code, and auto-incrementing sequential numbers (1, 2, 3, ...) are not a good idea.

So I'm using PHP's uniqid() function to generate the user codes. Something like this: it5f29b715cff42, containing on average 13 to 15 characters, stored in a varchar field.
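Roughly, the generation step looks like this (the 'it' prefix here is just an assumption to match the example above):

    $code = uniqid('it');     // 'it' prefix is an assumption to match the example
    var_dump(strlen($code));  // int(15): 2-character prefix + 13 hex digits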

I use this code as the primary key because it is the main identifier of the user's registration, and the table will always be searched by this code.

Some people have suggested switching to a varbinary field to gain performance, but I still don't know whether that is the best idea.

So I thought of turning this code into a number using PHP's hexdec() function, leaving the code more or less like this: 10001674124319326018, containing on average 16 to 20 digits, and storing that code in a bigint field.
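A minimal sketch of that conversion (a plain 13-digit uniqid() value is only 52 bits, so on a 64-bit PHP build the result fits in a signed bigint):

    $hex = uniqid();          // 13 hex digits, e.g. '5f29b715cff42'
    $dec = hexdec($hex);      // an integer below 2^52, within BIGINT range
    var_dump($dec);           // int(...) on 64-bit PHP, ready for a bigint column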

I know that a bigint field always takes 8 bytes, but how many bytes would varchar and varbinary occupy at that number of characters? What is the best option for the primary key of a table with around 2 billion records?

Fernando VR

2 Answers


Let's itemize the purposes/goals of a PRIMARY KEY:

  • Uniquely identify each row of a table. This is a requirement for InnoDB.
  • It provides the disk layout order for the data. This sometimes plays into performance issues. (Regular UUIDs are bad for performance because the "next" UUID is nowhere near the "last" UUID; that is, the scattering of the data is detrimental.)
  • It "should be" small, since it is used a lot -- for joining, for locating the data when using a secondary index, etc.
  • If there is a single place for assigning the PK, simply use AUTO_INCREMENT. It is fast and well optimized.
  • If you must create the PK from multiple clients, explain further; there are other tricks.

On the other hand, I suggest that fewer than 5% of tables are so big or so busy that this much effort should be spent on designing the PK.

The size and datatype make some difference in performance. (Others will say it makes a big difference.) As for your byte question: VARCHAR and VARBINARY store the data plus a 1-byte length prefix (for columns under 256 bytes), so a 15-character code in a single-byte character set takes up to 16 bytes, versus BIGINT's fixed 8.

Hex is more compact than decimal, as you found. But you may as well go more compact still by converting the hex string to raw bytes with hex2bin(). Unfortunately, uniqid() returns 13 hex digits, an odd length, so tacking on a leading '0' would let you get a BINARY(7) from the number. Messy.
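A minimal sketch of that padding trick with PHP's hex2bin():

    $hex = uniqid();                  // 13 hex digits -- an odd length
    if (strlen($hex) % 2 !== 0) {
        $hex = '0' . $hex;            // hex2bin() requires an even number of digits
    }
    $bin = hex2bin($hex);             // 7 raw bytes
    var_dump(strlen($bin));           // int(7) -- fits a BINARY(7) column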

Bottom line: For a table that is, or will grow to be, over a billion rows, simply use BIGINT AUTO_INCREMENT. The rest of the stuff is not worth the hassle.
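For illustration, a minimal sketch of that bottom line (table and column names are hypothetical; assumes a PDO connection to the target MySQL server):

    // Hypothetical names; keep the public code as a secondary UNIQUE key if still needed.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'password');
    $pdo->exec("
        CREATE TABLE users (
            id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
            code VARCHAR(15) NOT NULL,
            PRIMARY KEY (id),
            UNIQUE KEY (code)
        ) ENGINE = InnoDB
    ");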

Then shift your focus to other things, such as normalization, partitioning, sharding, etc.

Rick James

Bigint is much faster than varchar or varbinary, at any time.

And with BIGINT UNSIGNED you can have up to 18446744073709551615 rows.

You would use UUIDs when you enter data into different databases and need to merge them later.

But if you only need unique IDs, use bigint.

nbk