3

The data from one tables is dumped to file and inserted into calls_old table with load data infile it shows Avg_row_length 7366.

Later I dumped table with mysqldump, droped table, then created again and imported data. Now Avg_row_length is 471.

| Name       | Engine | Rows     | Avg_row_length | Data_length | Index_length | Data_free  |
+------------------------------+--------+---------+------------+----------+----------------+-

          before
| calls_old  | InnoDB |  1185666 |           7366 |  8733835264 |   4875419648 | 5329911808 |     

         after

| calls_old  | InnoDB |  1277654 |            471 |   601882624 |    464175104 |    4194304 

What causes this behavior?

RolandoMySQLDBA
  • 185,223
  • 33
  • 326
  • 536
gilbertasm
  • 31
  • 3

1 Answers1

1

There are two distinct conditions you need to be aware of

CONDITION #1 : Fragmentation

It is obvious the table has experienced lots of INSERTs, UPDATEs, and DELETEs (or non-SELECT DML). Even loading data sorted by PRIMARY KEY order will make index pages lopsided, at least 45% sparse.

I have posts about this phenomenon before

Thus, data pages shrinking from 8733835264 (8.13G) to 601882624 (574M) and shrinking index pages from 4875419648 (4.54G) to (~443M) is not alarming. This will improve read performance.

CONDITION #2 : InnoDB Row Count

InnoDB is notorious for INFORMATION_SCHEMA.TABLES having spurious counts. MyISAM tables give accurate row counts in the table_rows column of INFORMATION_SCHEMA.TABLES.

InnoDB likes to do dives into the index pages in order to guess the number the rows in the table.

I have discussed this feature before as well

If you want an accurate count of an InnoDB table, you must bite the bullet and run

SELECT COUNT(1) FROM calls_old; 

which will produce the same count every time.

Running this:

SELECT table_rows FROM information_schema.tables
WHERE table_name = 'calls_old';

never gives you the same value each time.

In fact, if you run

SHOW INDEXES FROM calls_old;

look at the cardinality column. It will produce different numbers each time. You would have to lock it down by setting innodb_stats_on_metadata to 0. Even if you did that, the stats would become stale over time through normal non-SELECT DML just like a MyISAM table. Running ANALYZE TABLE calls_old with innodb_stats_on_metadata set to 0 will fix the index statistics.

Having innodb_stats_on_metadata remain at 1 will make ANALYZE TABLE calls_old completely useless. (See my post From where does the MySQL Query Optimizer read index statistics?).

Since the table is called calls_old, I can surmise that you do not need to adjust any other options since it may be an archive table as the name suggests.

EPILOGUE

The value of Avg_row_length is a good indicator that you should defragment the table. When you see an InnoDB table growing that much, you could just run

ALTER TABLE calls_old ENGINE=InnoDB;

to shrink that table.

Thus, the behavior you are seeing is driven by the two conditions I just discussed.

SUGGESTION

If you want to verify the contents, you should these two queries:

  • SELECT COUNT(1) FROM calls_old;
  • CHECKSUM TABLE calls_old;

before and after dumping/importing.

RolandoMySQLDBA
  • 185,223
  • 33
  • 326
  • 536