I have a PostgreSQL database with a lot of tables (currently 93k) and I'm seeing issues with transaction id age threatening wraparound, because autovacuum is not getting triggered for clean tables that haven't been written to in a long time.
For example, I have a table that I can tell hasn't been written to since '2018-09-19 10:30:43.625069+00' thanks to the only inserts to this table always setting an updated_at column (current datetime is '2018-10-04 23:58:25.545881+00'), has n_dead_tup = 0, and has age(relfrozenxid) > 10000000.
I have autovacuum_freeze_max_age = 10000000 so this should mean this table will get autovacuumed, but instead I see it get past this point and then autovacuum doesn't get triggered for the table (it's currently at age(relfrozenxid) > 70000000 and autovacuum still hasn't triggered.
I currently have autovacuum set to be maximally aggressive and yet this is still happening. Here are the autovacuum settings:
track_counts = on
autovacuum = on
autovacuum_vacuum_cost_delay = 0
autovacuum_vacuum_cost_limit = 10000
autovacuum_vacuum_scale_factor = 0
autovacuum_vacuum_threshold = 1
autovacuum_freeze_max_age = 10000000
vacuum_freeze_min_age = 1000000
(yes, I realize autovacuum_vacuum_cost_limit doesn't matter because autovacuum_vacuum_cost_delay is 0)
Currently I'm forced to periodically manually trigger vacuums for these tables to keep things from getting out of hand. This feels like a hack, though.
Is there a way to get autovacuum to vacuum these tables, or equivalently a reason why autovacuum is ignoring these tables?
ETA
In case anyone comes here wanting to see my script for dealing with this, here it is:
#!/bin/bash -e
while true; do
sleep 60 #just in case there's nothing sleep so we don't run this too hard; at the start to prevent rapid process bouncing when postgres is offline
psql -d warehouse -t -c "select nspname || '.' || relname from pg_class join pg_namespace on pg_namespace.oid = pg_class.relnamespace where age(relfrozenxid) > 100000000 and relfrozenxid != 0 order by age(relfrozenxid) desc;" | sed '/^$/d' | parallel -j10 -q bash -c 'echo -n "autovacuum {} to prevent wraparound: " ; psql -d warehouse -c "vacuum freeze {}"'
echo -n "vacuum postgres database: "
psql -d postgres -c "vacuum freeze" #just to be safe
done
You'll probably want to up superuser_reserved_connections by 10 or whatever number you specify to parallel using the -j flag to make sure you can still access the database if regular connections are exhausted without shutting this down.