2

I need a tool/command which will check a table then say something like that:

  • columnA use tinyint instead of int
  • columnB use enum instead of varchar

I've seen something like this while ago so I do know it exists, but search gives nothing or I'm asking it in a wrong way.

Putnik
  • 295
  • 2
  • 4
  • 15

2 Answers2

3

This is such a quick and dirty question.

This deserves a quick and dirty answer.

Good News : PROCEDURE ANALYSE()

Bad News : It is Deprecated in 5.7.18 and will not be available in MySQL 8.0

I have discussed PROCEDURE ANALYSE() over the years: See my old posts.

In your case, you would just do this:

SELECT columnA,columnB FROM yourtable PROCEDURE ANALYSE();

and the output will tell you the min value, max value, avg value, and recommended datatype.

RolandoMySQLDBA
  • 185,223
  • 33
  • 326
  • 536
1

For huge tables, the main part of performance is I/O. The smaller the datatypes are, the less I/O is needed.

When laying out a schema, think about

  • Use the smallest ...INT that can safely hold the possible values. Most people blindly use the 4-byte INT, even for true/false flags. Some products blindly use the 8-byte BIGINT. BIGINT is not justified 98% of the time.
  • Use UNSIGNED when appropriate (which is most of the time).
  • Do not blindly say VARCHAR(255) for strings.
  • There is a 3-way battle between ENUM, TINYINT UNSIGNED, and VARCHAR; each has advantages and disadvantages. The first two are only 1 byte.
  • Indexes take space, too, and have copies of the columns. But that is another large discussion. Indexes are worth having, but don't blindly "index every column".
  • Use the appropriate CHARACTER SET.
  • Use CHAR only for truly fixed-length strings. Such are almost always CHARACTER SET ascii -- think country_code, postal_code, etc.
  • Never use TINYTEXT. (The other sizes of TEXT are useful.)

It is good to get into the habit of using smaller datatypes, even for non-huge tables.

For tables that I write, I "know" what size the ints are / will be. Without PROCEDURE ANALYSE, I simply say SELECT MIN(a), MAX(a) ... for numbers; SELECT MAX(LENGTH(s)) ... for VARCHAR and TEXT, but I add a fudge factor for future strings.

Rick James
  • 80,479
  • 5
  • 52
  • 119