4

Postgres documentation on Serializable isolation level counterposes Serializable with explicit locking, and makes it sound as though explicit locking should not be needed when using Serializable. I'm looking to understand

  1. Am I misunderstanding the guarantees of Serializable;
  2. Do I need to use explicit locks in combination with Serializable level, or should I just pick one;
  3. Why does Serializable behave the way it does in my case.

Relevant excerpts from documentation that confuse me:

This level emulates serial transaction execution for all committed transactions; as if transactions had been executed one after another, serially, rather than concurrently.

To guarantee true serializability PostgreSQL uses predicate locking, which means that it keeps locks which allow it to determine when a write would have had an impact on the result of a previous read from a concurrent transaction, had it run first. In PostgreSQL these locks do not cause any blocking and therefore can not play any part in causing a deadlock. They are used to identify and flag dependencies among concurrent Serializable transactions which in certain combinations can lead to serialization anomalies. In contrast, a Read Committed or Repeatable Read transaction which wants to ensure data consistency may need to take out a lock on an entire table, which could block other users attempting to use that table, or it may use SELECT FOR UPDATE or SELECT FOR SHARE which not only can block other transactions but cause disk access

Sounds like an implication that FOR SHARE / FOR UPDATE should only be needed in Read Committed / Repeatable Read.

The monitoring of read/write dependencies has a cost, as does the restart of transactions which are terminated with a serialization failure, but balanced against the cost and blocking involved in use of explicit locks and SELECT FOR UPDATE or SELECT FOR SHARE, Serializable transactions are the best performance choice for some environments.

Compares Serializable and explicit locks as though they are interchangeable alternatives with their own pros and cons.

Eliminate explicit locks, SELECT FOR UPDATE, and SELECT FOR SHARE where no longer needed due to the protections automatically provided by Serializable transactions

Recommends to remove explicit locks from Serializable transactions.

Scenario that I encountered

The data model is as follows.

books

Column name Data type
id UUID PRIMARY KEY
text TEXT
is_published BOOL

favorites

Column name Data type
id INT PRIMARY KEY
user_id UUID FOREIGN KEY
book_id UUID FOREIGN KEY

I need to ensure that a book can only be added to a user's favorites if it is_published. If a book is unpublished while it is being added to favourites, I want the transaction to fail. So I run the following:

-- session 1
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT is_published FROM book WHERE id = $1;
-- session 2
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
UPDATE book SET is_published = false WHERE id = $1;
COMMIT;
-- session 1
-- application checks the value of is_published.
-- if it is false, the transaction rolls back.
INSERT INTO favorites (user_id, book_id) VALUES ($2, $1);
COMMIT;

For some reason, session 1 is still able to commit successfully, which in my mind conflicts with the definition of Serializable: if the transactions were to execute in a different order it would have produced a different result.

However, if I make data consistency checks "bidirectional", it for some reason works as needed:

-- session 1
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT is_published FROM book WHERE id = $1;
-- session 2
START TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM favorites WHERE book_id = $1;
-- application checks if the book is added to favorites by someone
-- if it is, the transaction rolls back
UPDATE book SET is_published = false WHERE id = $1;
COMMIT;
-- session 1
INSERT INTO favorites (user_id, book_id) VALUES ($2, $1);
-- transaction fails with error: "Canceled on identification as a pivot, during write"

So my questions are:

  1. When documentation says as if transactions had been executed one after another, serially, does this only apply to transaction bodies varbatim, i.e. any conditional application logic that happens outside of SQL breaks this guarantee?
  2. Why on Earth does it work correctly in the second case?
  3. Do I ever need to use explicit locks if I correctly use transaction isolation levels throughout the app?
mustaccio
  • 28,207
  • 24
  • 60
  • 76
Kit Isaev
  • 141
  • 4

0 Answers0