I have a large table
=> \d balances_snapshots;
Table "public.balances_snapshots"
Column | Type | Collation | Nullable | Default
---------------+-----------------------------+-----------+----------+------------------------------------------------
user_id | integer | | |
asset_id | text | | |
timestamp | timestamp without time zone | | | now()
total_balance | numeric | | not null |
id | integer | | not null | nextval('balances_snapshots_id_seq'::regclass)
Indexes:
"balances_snapshots_pkey" PRIMARY KEY, btree (id)
"balances_snapshots_asset_id_total_balance_positive_idx" btree (asset_id) WHERE total_balance > 0::numeric
"balances_snapshots_timestamp_idx" btree ("timestamp")
"balances_snapshots_total_balance_asset_id_idx" btree (total_balance, asset_id DESC)
"balances_snapshots_user_id_idx" btree (user_id)
Foreign-key constraints:
"balances_snapshots_asset_id_fkey" FOREIGN KEY (asset_id) REFERENCES assets(id) ON UPDATE CASCADE ON DELETE CASCADE
"balances_snapshots_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id) ON DELETE CASCADE
And a simple query:
explain analyze select 'test' where
'test' IN (
SELECT asset_id FROM balances_snapshots WHERE total_balance > 0 GROUP BY asset_id
);
which takes 30 seconds to execute on Google Cloud SQL 8 vCPU, 18 RAM despite it uses Index Only Scan with zero Heap Fetches:
Result (cost=5088867.27..5088867.28 rows=1 width=32) (actual time=31661.912..31663.750 rows=0 loops=1)
One-Time Filter: (hashed SubPlan 1)
SubPlan 1
-> Group (cost=1000.60..5088861.49 rows=2310 width=9) (actual time=5.433..31652.313 rows=3455 loops=1)
Group Key: balances_snapshots.asset_id
-> Gather Merge (cost=1000.60..5088849.94 rows=4620 width=9) (actual time=5.431..31644.213 rows=9447 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Group (cost=0.57..5087316.65 rows=2310 width=9) (actual time=0.035..23003.921 rows=3149 loops=3)
Group Key: balances_snapshots.asset_id
-> Parallel Index Only Scan using balances_snapshots_asset_id_total_balance_positive_idx on balances_snapshots (cost=0.57..4698961.54 rows=155342044 width=9) (actual time=0.033..12475.044 rows=125137693 loops=3)
Heap Fetches: 0
Planning Time: 0.252 ms
Execution Time: 31663.830 ms
(14 rows)
Is it normal for postgres to take 12 seconds on that and if yes, the only way to boost this is to migrate to partitioned table, right?
I understand that this query can be more efficient with where clause but the query above is extracted from bigger one just to identify the cause of problem. Here is the bigger query:
explain analyze SELECT * FROM assets
WHERE id IN (
SELECT id FROM assets a WHERE (
a.last_updated_timestamp IS NULL OR
a.last_updated_timestamp < (NOW() - INTERVAL '1 DAY')
)
) AND (
id IN (
SELECT asset_id FROM balances_snapshots WHERE total_balance > 0 GROUP BY asset_id
)
);
Nested Loop (cost=1005.03..5108401.69 rows=727 width=112) (actual time=214.597..32211.567 rows=1001 loops=1)
-> Nested Loop (cost=1003.18..5104135.80 rows=727 width=20) (actual time=214.586..32197.553 rows=1001 loops=1)
-> Group (cost=1000.60..5088861.49 rows=2310 width=9) (actual time=5.807..32080.773 rows=3455 loops=1)
Group Key: balances_snapshots.asset_id
-> Gather Merge (cost=1000.60..5088849.94 rows=4620 width=9) (actual time=5.805..32072.921 rows=9440 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Group (cost=0.57..5087316.65 rows=2310 width=9) (actual time=0.032..23878.886 rows=3147 loops=3)
Group Key: balances_snapshots.asset_id
-> Parallel Index Only Scan using balances_snapshots_asset_id_total_balance_positive_idx on balances_snapshots (cost=0.57..4698961.54 rows=155342044 width=9) (actual time=0.031..12832.019 rows=125137693 loops=3)
Heap Fetches: 0
-> Bitmap Heap Scan on assets a (cost=2.58..6.60 rows=1 width=11) (actual time=0.023..0.023 rows=0 loops=3455)
Recheck Cond: (id = balances_snapshots.asset_id)
Filter: ((last_updated_timestamp IS NULL) OR (last_updated_timestamp < (now() - '1 day'::interval)))
Rows Removed by Filter: 1
Heap Blocks: exact=3455
-> Bitmap Index Scan on assets_pkey (cost=0.00..2.58 rows=1 width=0) (actual time=0.013..0.013 rows=2 loops=3455)
Index Cond: (id = balances_snapshots.asset_id)
-> Bitmap Heap Scan on assets (cost=1.86..5.87 rows=1 width=112) (actual time=0.009..0.009 rows=1 loops=1001)
Recheck Cond: (id = a.id)
Heap Blocks: exact=1001
-> Bitmap Index Scan on assets_pkey (cost=0.00..1.86 rows=1 width=0) (actual time=0.007..0.007 rows=2 loops=1001)
Index Cond: (id = a.id)
Planning Time: 0.889 ms
Execution Time: 32212.740 ms
(25 rows)
PostgreSQL 13.4