Given this table posts_lists:
Table "public.posts_lists"
Column | Type | Collation | Nullable | Default
------------+------------------------+-----------+----------+---------
id | character varying(20) | | not null |
user_id | character varying(20) | | |
tags | jsonb | | |
score | integer | | |
created_at | integer | | |
Indexes:
"tmp_posts_lists_pkey1" PRIMARY KEY, btree (id)
"tmp_posts_lists_idx_create_at1532588309" btree (created_at)
"tmp_posts_lists_idx_score_desc1532588309" btree (score_rank(score, id::text) DESC)
"tmp_posts_lists_idx_tags1532588309" gin (jsonb_array_lower(tags))
"tmp_posts_lists_idx_user_id1532588309" btree (user_id)
Getting a list by tag is fast:
EXPLAIN ANALYSE
SELECT * FROM posts_lists
WHERE jsonb_array_lower(tags) ? lower('Qui');
Bitmap Heap Scan on posts_lists (cost=1397.50..33991.24 rows=10000 width=56) (actual time=0.110..0.132 rows=2 loops=1)
Recheck Cond: (jsonb_array_lower(tags) ? 'qui'::text)
Heap Blocks: exact=2
-> Bitmap Index Scan on tmp_posts_lists_idx_tags1532588309 (cost=0.00..1395.00 rows=10000 width=0) (actual time=0.010..0.010 rows=2 loops=1)
Index Cond: (jsonb_array_lower(tags) ? 'qui'::text)
Planning time: 0.297 ms
Execution time: 0.157 ms
Getting a list ordered by score, limit 100 - also fast:
EXPLAIN ANALYSE
SELECT *
FROM posts_lists
ORDER BY score_rank(score, id) DESC
LIMIT 100;
Limit (cost=0.56..12.03 rows=100 width=88) (actual time=0.074..0.559 rows=100 loops=1) -> Index Scan using tmp_posts_lists_idx_score_desc1532588309 on posts_lists (cost=0.56..1146999.15 rows=10000473 width=88) (actual time=0.072..0.535 rows=100 loops=1) Planning time: 0.586 ms Execution time: 0.714 ms
But combining the above two queries is very slow:
EXPLAIN ANALYSE
SELECT * FROM posts_lists
WHERE jsonb_array_lower(tags) ? lower('Qui')
ORDER BY score_rank(score, id) DESC
LIMIT 100;
Limit (cost=0.56..33724.60 rows=100 width=88) (actual time=2696.965..493476.142 rows=2 loops=1)
-> Index Scan using tmp_posts_lists_idx_score_desc1532588309 on posts_lists (cost=0.56..3372404.39 rows=10000 width=88) (actual time=2696.964..493476.139 rows=2 loops=1)
Filter: (jsonb_array_lower(tags) ? 'qui'::text)
Rows Removed by Filter: 9999998
Planning time: 0.426 ms
Execution time: 493476.190 ms
Why? How to improve the efficiency of the query?
Definition of the two functions used above:
create or replace function score_rank(score integer, id text)
returns text as $$
select case when score < 0
then '0' || lpad((100000000 + score) :: text, 8, '0') || id
else '1' || lpad(score :: text, 8, '0') || id
end
$$
language sql immutable;
create or replace function jsonb_array_lower(arr jsonb)
returns jsonb as $$
SELECT jsonb_agg(lower(elem))
FROM jsonb_array_elements_text(arr) elem
$$
language sql immutable;