Imagine the following situation:
You have a MySQL server S and two clients, A and B.
Both clients connect to S (e.g. using the CLI).
A starts a transaction and then locks a table with WRITE permissions and inserts a row into the table.
Now B tries to select all rows from the table and - of course - has to wait until A has committed AND unlocked the table.
Imagine that A's connection is interrupted before it is able to commit or rollback. In my special use-case, the connection might be interrupted at any point in time.
I tested this scenario using three virtual machines which are connected to a common virtual bridge. Using brctl I can simply remove individual VMs in order to simulate connection interruptions.
The result
Client B is stuck, because it waits for the table to get unlocked.
Typing SHOW PROCESSLIST in the CLI of the server shows me this:
+----+------+-----------------+----------+---------+------+---------------------------------+----------------------------+----------+
| Id | User | Host | db | Command | Time | State | Info | Progress |
+----+------+-----------------+----------+---------+------+---------------------------------+----------------------------+----------+
| 8 | root | B:38129 | foo | Query | 1676 | Waiting for table metadata lock | select blockname from root | 0.000 |
| 9 | root | A:54806 | foo | Sleep | 1679 | | NULL | 0.000 |
| 12 | root | localhost | NULL | Query | 0 | init | show processlist | 0.000 |
+----+------+-----------------+----------+---------+------+---------------------------------+----------------------------+----------+
So the server thinks that the connection from A is still alive, while it already was lost minutes ago. Having a look at netstat also tells me that the connection is still considered to be ESTABLISHED and still alive.
I also observed the bridge: Not a single packet from S to A.
Client B is stuck forever. I have to restart the whole server to unlock the table.
What I expected
I want my server to detect (after some predefined time) that the connection to A is lost even if A did not send any more queries. Then I want my server to rollback any uncommitted transactions and finally to unlock the table.
How can I realise this?
Why aren't there any TCP keepalives from S to A?
Is it possible to define any kind of timeouts?
I played around with the variables wait_timeout, innodb_lock_wait_timeout, and interactive_timeout.
interactive_timeoutis not interesting because my productive system will not use the CLI.innodb_lock_wait_timeoutseems not to be respected in any way.wait_timeoutindeed improves my situation: idling sessions are killed after the timeout. But I only want them to be killed in case of connection interruption. I want to keep the connection alive to improve performance (save tcp handshakes, SQL logins, db selection...)