Is it better to use (for perfomance tuning) AUTOCOMMIT = 0 before bulk inserts of data in MySQL, if the insert query looks like
INSERT INTO SomeTable (column1, column2) VALUES (val1,val2),(val3,val4),... ?
Is it better to use (for perfomance tuning) AUTOCOMMIT = 0 before bulk inserts of data in MySQL, if the insert query looks like
INSERT INTO SomeTable (column1, column2) VALUES (val1,val2),(val3,val4),... ?
According to the documentation, AUTOCOMMIT should be turned off in InnoDB.
When importing data into InnoDB, turn off autocommit mode, because it performs a log flush to disk for every insert.
When doing bulk inserts into tables with auto-increment columns, set
innodb_autoinc_lock_modeto 2 instead of the default value 1
You have a trade-off you need to be aware of.
Granted, it is true that a log flush happens with each INSERT involved with autocommit=1. Nevertheless, are there any consequences of setting autocommit=0 ?
Think about the redo logs (ib_logfile0,ib_logfile1) and the undo tablespace (inside ibdata1). Change information must be stored somewhere in case the INSERTs need to be rolled back or recovery is initiated after a crash.
Either way, there will be some disk I/O to contend with.
Additionally, consider innodb_log_buffer_size
These are facts. Let the buyer beware.
Turning off definitely helps if innodb_flush_log_at_trx_commit is set to 1.
Table:
CREATE TABLE tab1
(col1 INT, col2 VARCHAR(20), col3 INT, KEY idx_tab1_col1(col1))ENGINE = INNODB;
Test:
Results:
If you are inserting 40K rows per INSERT statement, then commit it right away. (This implies, for simplicity, having autocommit=ON.)
If instead, you have thousands of such bulk inserts and use autocommit=OFF or BEGIN, then you are building up a huge transaction. This will overflow the logs (needed to undo the inserts in case of a crash). InnoDB can handle the overflow, but it slows things down.
On the other hand, if you need the entire load to be 'atomic', then go ahead and make it one big transaction.
Similarly, innodb_flush_log_at_trx_commit=2 and innodb_autoinc_lock_mode=2 help with speed (at the expense of safety). However, this is 'global', thereby impacting all connections.
If your task is to load a bunch of data, especially if this is an initial load, then you could take this approach: Run the task with the fastest settings; in the unlikely event of a crash, drop the tables and start over.
If you are loading into a Master, then presenting smaller chunks to the Slaves helps with the replication flow (especially in Statement Based Replication).