1

EXPLAIN shows that PostgreSQL chooses a bad plan for hash join. it should have first iterate by tags_name_value_height2 in the table tags and use it to find corresponding transactions.id, but instead Postgres does a stupid reverse: it first iterates using transactions_id on transactions.

It's despite I do have all needed (composite) indexes.

How to make it work the right way?

explain
select *
from "transactions" 
join tags on transactions.id = tags.tx_id
where "tags"."name" = 'ZGF0YWJhc2U'
and "tags"."value" in ('c29sYXJ3ZWF2ZS10ZXN0bmV0LWFyY2hpdmU')
limit 2;
                                                        QUERY PLAN                                                        
--------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=246323.53..246710.80 rows=2 width=1964)
   ->  Hash Join  (cost=246323.53..12515067.96 rows=63359 width=1964)
         Hash Cond: ((transactions.id)::text = (tags.tx_id)::text)
         ->  Index Scan using transactions_id on transactions  (cost=0.56..10146177.49 rows=3822241 width=1803)
         ->  Hash  (cost=244045.99..244045.99 rows=63359 width=161)
               ->  Index Scan using tags_name_value_height2 on tags  (cost=0.94..244045.99 rows=63359 width=161)
                     Index Cond: ((name = 'ZGF0YWJhc2U'::text) AND (value = 'c29sYXJ3ZWF2ZS10ZXN0bmV0LWFyY2hpdmU'::text))
 JIT:
   Functions: 13
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(10 rows)

Postgre 13. Maybe I should downngrade? Which version has good query hints?

arweave=# \d transactions
                                        Table "arweave.transactions"
    Column     |           Type           | Collation | Nullable |                  Default                  
---------------+--------------------------+-----------+----------+-------------------------------------------
 id            | character varying(64)    |           | not null | 
 owner         | text                     |           |          | 
 tags          | jsonb                    |           |          | 
 target        | character varying(64)    |           |          | 
 quantity      | text                     |           |          | 
 reward        | text                     |           |          | 
 signature     | text                     |           |          | 
 last_tx       | text                     |           |          | 
 data_size     | bigint                   |           |          | 
 content_type  | character varying(255)   |           |          | 
 format        | integer                  |           |          | 
 height        | integer                  |           |          | 
 owner_address | character varying(255)   |           |          | 
 data_root     | character varying(64)    |           |          | 
 parent        | character varying(64)    |           |          | 
 created_at    | timestamp with time zone |           |          | CURRENT_TIMESTAMP
 app           | character varying(64)    |           |          | 
 domain        | character varying(64)    |           |          | 
 namespace     | character varying(64)    |           |          | 
 seq           | integer                  |           | not null | nextval('transactions_seq_seq'::regclass)
Indexes:
    "transactions_pkey" PRIMARY KEY, btree (seq)
    "id_key" UNIQUE CONSTRAINT, btree (id)
    "index_app_transactions" btree (app)
    "index_domain_transactions" btree (domain)
    "index_namespace_transactions" btree (namespace)
    "owner1" btree (seq, owner)
    "owner2" btree (seq DESC, owner)
    "transactions_height2" btree (height DESC, id DESC)
    "transactions_id" btree (id)
    "transactions_id_index" UNIQUE CONSTRAINT, btree (height, id)
Referenced by:
    TABLE "tags" CONSTRAINT "tag_tx_id" FOREIGN KEY (height, tx_id) REFERENCES transactions(height, id)
    TABLE "tags" CONSTRAINT "tags_tx_id_fkey" FOREIGN KEY (tx_id) REFERENCES transactions(id)

arweave=# \d tags Table "arweave.tags" Column | Type | Collation | Nullable | Default
---------------+--------------------------+-----------+----------+----------------------------------- tx_id | character varying(64) | | not null | owner_address | text | | | height | integer | | | index | integer | | not null | name | text | | | value | text | | | created_at | timestamp with time zone | | | CURRENT_TIMESTAMP seq | integer | | not null | nextval('tags_seq_seq'::regclass) Indexes: "tags_pkey" PRIMARY KEY, btree (seq) "tags_created_at" btree (created_at) "tags_height1" btree (height, tx_id) "tags_height2" btree (height DESC, tx_id DESC) "tags_name_height1" btree (name, height, tx_id) "tags_name_height2" btree (name, height DESC, tx_id DESC) "tags_name_value_height2" btree (name, value, height DESC, tx_id DESC) "tags_tx_id" btree (tx_id) Foreign-key constraints: "tag_tx_id" FOREIGN KEY (height, tx_id) REFERENCES transactions(height, id) "tags_tx_id_fkey" FOREIGN KEY (tx_id) REFERENCES transactions(id)

(Note that the above are not yet very well constructed, but this should not affect this question.)

porton
  • 745
  • 3
  • 11
  • 28

1 Answers1

1

You are misreading the EXPLAIN output. Top nodes are executed last. In fact, the index tags_name_value_height2 on tags is read first.

Imagine a tree. The more each node is indented, the earlier it is executed. Details in the manual in the chapter Using EXPLAIN.

Your query might be optimized, but we'd need proper definitions to say more.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633