guaranteed to remain the same "in the transaction"
That has two meanings, and that's the problem. You have what's happening outside of the transaction and what's happening inside with the "snapshot". Two things can happen to the result of count(*):
- The count can change
Transaction level READ COMMITTED (and READ UNCOMMITTED†).
- The count cannot change
Transaction levels REPEATABLE READ and SERIALIZABLE.
Changing and not changing isn't everything you need to know. REPEATABLE READ and SERIALIZABLE work on snapshots. That is to say, the counts(*) won't change, but that doesn't mean anything with regard to what may already changed in the database.
Let's simplify and review some things and call the initialization code above REINIT. We will play with only these two statements.
SELECT count(*) FROM foo WHERE x IS NULL;
UPDATE foo SET x = 1 WHERE x IS NULL;
Now, let's say we run two sessions
REINIT
1# BEGIN; SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
1# SELECT count(*) FROM foo WHERE x IS NULL;
2# UPDATE foo SET x = 1 WHERE x IS NULL;
1# SELECT count(*) FROM foo WHERE x IS NULL;
Now what does count(*) show in 1#'s transaction? And after both transactions commit, #1 and then #2? Spoiler alert:
In the transaction, it will show the same number. Outside of the transaction, it will show 0 because the UPDATE has already committed.
Now, in a lower transaction level then the REPEATABLE READ, like READ COMMITTED the second SELECT by #1 see the committed rows. And, in a higher transaction level. The first SELECT count(*) will obtain a predicate lock on just the rows where x IS NULL. So then what happens when we move up a level to SERIALIZABLE and run the same sequence, now with predicate lock?
REINIT
1# BEGIN; SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
1# SELECT count(*) FROM foo WHERE x IS NULL;
2# UPDATE foo SET x = 1 WHERE x IS NULL;
1# SELECT count(*) FROM foo WHERE x IS NULL;
Spoiler alert:
Nothing. Same thing.
Why? The concurrency model doesn't care about UPDATEs to things unless something also tries to modify those rows and both intend to commit. Both are SELECTing and not modifying their snapshots. So by extension,
if (SELECT count(*)
FROM foo
WHERE x IS NULL
) < arg THEN
RETURN 0 ;
end if ;
- Then in the default transaction level,
READ COMMITTED it may see a number that is not the same as the rest of the transaction -- a concurrent transaction could have committed between the second SELECT. That makes it useless WITHOUT REPEATABLE READ or SERIALIZABLE.
- But in the other transaction modes you could either face a lock issue in
REPEATABLE READ, or a failure upon commit in SERIALIZABLE if anything else touches the table from outside of the transaction. So again, there is limited point. If the count(*) query return rows, and, then, for example, other code may attempt to update those rows just to find that they've already been modified or that they're no longer there when the working snapshot goes to commit.
So don't conditionally do stuff in transactions. Do stuff, and then handle it.
† The SQL standard provides for a level READ UNCOMMITTED. In PostgreSQL that level aliases that to the tighter isolation level READ COMMITTED.