EXPLAIN shows that PostgreSQL chooses a bad plan for hash join. it should have first iterate by tags_name_value_height2 in the table tags and use it to find corresponding transactions.id, but instead Postgres does a stupid reverse: it first iterates using transactions_id on transactions.
It's despite I do have all needed (composite) indexes.
How to make it work the right way?
explain
select *
from "transactions"
join tags on transactions.id = tags.tx_id
where "tags"."name" = 'ZGF0YWJhc2U'
and "tags"."value" in ('c29sYXJ3ZWF2ZS10ZXN0bmV0LWFyY2hpdmU')
limit 2;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Limit (cost=246323.53..246710.80 rows=2 width=1964)
-> Hash Join (cost=246323.53..12515067.96 rows=63359 width=1964)
Hash Cond: ((transactions.id)::text = (tags.tx_id)::text)
-> Index Scan using transactions_id on transactions (cost=0.56..10146177.49 rows=3822241 width=1803)
-> Hash (cost=244045.99..244045.99 rows=63359 width=161)
-> Index Scan using tags_name_value_height2 on tags (cost=0.94..244045.99 rows=63359 width=161)
Index Cond: ((name = 'ZGF0YWJhc2U'::text) AND (value = 'c29sYXJ3ZWF2ZS10ZXN0bmV0LWFyY2hpdmU'::text))
JIT:
Functions: 13
Options: Inlining false, Optimization false, Expressions true, Deforming true
(10 rows)
Postgre 13. Maybe I should downngrade? Which version has good query hints?
arweave=# \d transactions
Table "arweave.transactions"
Column | Type | Collation | Nullable | Default
---------------+--------------------------+-----------+----------+-------------------------------------------
id | character varying(64) | | not null |
owner | text | | |
tags | jsonb | | |
target | character varying(64) | | |
quantity | text | | |
reward | text | | |
signature | text | | |
last_tx | text | | |
data_size | bigint | | |
content_type | character varying(255) | | |
format | integer | | |
height | integer | | |
owner_address | character varying(255) | | |
data_root | character varying(64) | | |
parent | character varying(64) | | |
created_at | timestamp with time zone | | | CURRENT_TIMESTAMP
app | character varying(64) | | |
domain | character varying(64) | | |
namespace | character varying(64) | | |
seq | integer | | not null | nextval('transactions_seq_seq'::regclass)
Indexes:
"transactions_pkey" PRIMARY KEY, btree (seq)
"id_key" UNIQUE CONSTRAINT, btree (id)
"index_app_transactions" btree (app)
"index_domain_transactions" btree (domain)
"index_namespace_transactions" btree (namespace)
"owner1" btree (seq, owner)
"owner2" btree (seq DESC, owner)
"transactions_height2" btree (height DESC, id DESC)
"transactions_id" btree (id)
"transactions_id_index" UNIQUE CONSTRAINT, btree (height, id)
Referenced by:
TABLE "tags" CONSTRAINT "tag_tx_id" FOREIGN KEY (height, tx_id) REFERENCES transactions(height, id)
TABLE "tags" CONSTRAINT "tags_tx_id_fkey" FOREIGN KEY (tx_id) REFERENCES transactions(id)
arweave=# \d tags
Table "arweave.tags"
Column | Type | Collation | Nullable | Default
---------------+--------------------------+-----------+----------+-----------------------------------
tx_id | character varying(64) | | not null |
owner_address | text | | |
height | integer | | |
index | integer | | not null |
name | text | | |
value | text | | |
created_at | timestamp with time zone | | | CURRENT_TIMESTAMP
seq | integer | | not null | nextval('tags_seq_seq'::regclass)
Indexes:
"tags_pkey" PRIMARY KEY, btree (seq)
"tags_created_at" btree (created_at)
"tags_height1" btree (height, tx_id)
"tags_height2" btree (height DESC, tx_id DESC)
"tags_name_height1" btree (name, height, tx_id)
"tags_name_height2" btree (name, height DESC, tx_id DESC)
"tags_name_value_height2" btree (name, value, height DESC, tx_id DESC)
"tags_tx_id" btree (tx_id)
Foreign-key constraints:
"tag_tx_id" FOREIGN KEY (height, tx_id) REFERENCES transactions(height, id)
"tags_tx_id_fkey" FOREIGN KEY (tx_id) REFERENCES transactions(id)
(Note that the above are not yet very well constructed, but this should not affect this question.)