0

I'm doing some benchmarks and I found a strange thing - real table size is in 7 times bigger than estimated size

I created a table with 1 int column (4 bytes)

CREATE TABLE `t_int` (
  `c_int` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

And inserted 57M records with value 1. I expected to get table 228Mb (4 bytes * 57M records), but got 1.8Gb

           Name: t_int
         Engine: InnoDB
        Version: 10
     Row_format: Dynamic
           Rows: 57391085
 Avg_row_length: 36
    Data_length: 2089795584
Max_data_length: 0
   Index_length: 0
      Data_free: 5242880
 Auto_increment: NULL
    Create_time: 2017-01-12 22:42:23
    Update_time: NULL
     Check_time: NULL
      Collation: utf8_general_ci
       Checksum: NULL
 Create_options: 
        Comment: 

optimize tables has not changed the results

Why is that?

MySQL 5.7.11

Alexey
  • 288
  • 4
  • 14

3 Answers3

1

For each row InnoDB use additional disk space, such as:

If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.

Records in the clustered index contain fields for all user-defined columns. In addition, there is a 6-byte transaction ID field and a 7-byte roll pointer field.

so, when You test it as described - it will use 30+- bytes per row, and this is will be similar with result

additional source of information:

a_vlad
  • 3,705
  • 2
  • 13
  • 17
1

I will give this a try :

So as per the below snapshot taken from this link : http://dev.mysql.com/doc/refman/5.7/en/show-table-status.html

For InnoDB Data_length = Clustered Index size in pages x InnoDb page size

enter image description here

Now according to below screen shot taken from this link : http://dev.mysql.com/doc/refman/5.7/en/innodb-restrictions.html the default InnoDb page size should be equal to 16 kbenter image description here

Now the individual index size can be found using the below query.

Source 1,2

select database_name, table_name, index_name, stat_value*@@innodb_page_size
from mysql.innodb_index_stats where stat_name='size';

So taking only the index size values for your table from above output and considering your InnoDb page size to be equal to 16 kb you can calculate and see the actual size or data_length.

Note: I don't know why or how but product of you expected output 228MB(233472 Bytes as calculated per the inputs 4 bytes * 57M records) with maximum row length for default innodb page size of 16 Kb which is 8000 bytes gives 1867776000 bytes which when converted to Gb is 1.867776 Gb which is almost equal to size you said you are getting ie. 1.8 Gb .

Note 2: Please also read http://dev.mysql.com/doc/refman/5.7/en/innodb-physical-record.html for learning further about how row length can vary depending on the Row Formats,and if no primary key is defined as already indicated by @Rick James in his answer.

Shivam Kumar
  • 364
  • 1
  • 2
  • 8
1

There is a lot of overhead, and the overhead dominates when all you have is on 4-byte column:

Note: The following refers only to InnoDB tables.

  • Length of the column - probably 1 byte in this case.
  • Overhead for each row - I've seen 29 bytes, but perhaps it is less.
  • NULL - one byte for each 8 nullable columns (I think).
  • BTree 'waste' - At least 1/16th of each block, 42% when randomly inserting rows. (OPTIMIZE TABLE would temporarily clean up the 42%.)
  • PRIMARY KEY - Since you did not specify an explicit PK, one was generated for you. It is 6 bytes per row.
  • Allocation units - Once the table gets more than tiny, space is allocated in chunks of 8(?) megabytes.
  • Non-leaf nodes for the PK - on the order of 1%.
  • Deleted rows may or may not recover space.
  • MVCC and transactions lead to multiple copies of rows; this may or may not be recovered from.
  • Large TEXT and BLOB columns are stored off-page; this gets quite complex to predict.
  • Maybe more.

So, I am surprised that you got only 7x. But it is a very unrealistic table. I have measured many InnoDB tables. Most are between 2x and 3x the expected size.

The overhead is there for a variety of good reasons, usually related to performance.

Rick James
  • 80,479
  • 5
  • 52
  • 119