In PostgreSQL, how can I make Bitmap Index Scan parallelized?

Question

I want to improve the performance of an SQL statement.

I am on version 13. Here are the sample codes and the query I am interested in.

drop table ords;
CREATE TABLE ords (
ORD_ID INT NOT NULL,
CUST_ID VARCHAR(10) NOT NULL,
ORD_DATE DATE NOT NULL,
ETC_CONTENT VARCHAR(100));
ALTER TABLE ords ADD CONSTRAINT ORDS_PK PRIMARY KEY(ORD_ID);
CREATE INDEX ORDS_X01 ON ORDS (CUST_ID);
INSERT INTO ORDS
SELECT i
      ,lpad(mod(i,1000)::text,10,'cust')
      ,date '2021-06-07'+mod(i,624)
      ,rpad('x',100,'x')
  FROM generate_series(1,1000000) a(i);
drop table delivery;
CREATE TABLE delivery (
ORD_ID INT NOT NULL,
VEHICLE_ID VARCHAR(10) NOT NULL,
START_DATE DATE NOT NULL,
END_DATE   DATE NOT NULL,
ETC_REMARKS VARCHAR(100));
INSERT INTO DELIVERY
SELECT i
     , MOD(i,1000)
     , date '2021-01-01' + mod(i,1000)
     , date '2021-01-05' + mod(i,1000)
     , rpad('x',100,'x')
  FROM generate_series(1,1000000) a(i);
ALTER TABLE DELIVERY ADD CONSTRAINT DELIVERY_PK primary key (ORD_ID);
CREATE INDEX DELIVERY_X01 ON DELIVERY(END_DATE, START_DATE);
CREATE INDEX DELIVERY_X02 ON DELIVERY(VEHICLE_ID);
select pg_relation_size('ords'), pg_relation_size('delivery');
analyze ords;
analyze delivery;
EXPLAIN(ANALYZE, BUFFERS, COSTS OFF)
SELECT A., B.
  FROM ORDS A LEFT JOIN DELIVERY B
    ON (A.ORD_ID = B.ORD_ID
        AND (B.START_DATE <= DATE '2021-07-12' AND B.END_DATE >= DATE '2021-07-10'
             OR (B.VEHICLE_ID > '990')
             )
        )
 WHERE A.ORD_DATE BETWEEN DATE '2021-06-01' AND DATE '2021-07-10'
;

Below is the execution plan.

 Gather (actual time=86.645..101.685 rows=54501 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=13615 read=23995, temp read=1196 written=1272
   ->  Parallel Hash Left Join (actual time=83.360..87.135 rows=18167 loops=3)
         Hash Cond: (a.ord_id = b.ord_id)
         Buffers: shared hit=13614 read=23995, temp read=1196 written=1272
         ->  Parallel Seq Scan on ords a (actual time=0.047..34.335 rows=18167 loops=3)
               Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))
               Rows Removed by Filter: 315166
               Buffers: shared hit=4968 read=14263
         ->  Parallel Hash (actual time=42.999..42.999 rows=5333 loops=3)
               Buckets: 32768  Batches: 8  Memory Usage: 608kB
               Buffers: shared hit=8450 read=9732, temp written=280
               ->  Parallel Seq Scan on delivery b (actual time=0.069..40.615 rows=5333 loops=3)
                     Filter: (((start_date <= '2021-07-12'::date) AND (end_date >= '2021-07-10'::date)) OR ((vehicle_id)::text > '990'::text))
                     Rows Removed by Filter: 328000
                     Buffers: shared hit=8450 read=9732
 Planning:
   Buffers: shared hit=20
 Planning Time: 0.357 ms
 Execution Time: 103.282 ms

I had expected that two Bitmap Index Scans using the delivery_x01 and delivery_x02 would appear followed by the BitmapOr operation when fetching rows from the DELIVERY table. Unlike what I thought, the planner chose to do a table scan with the parallelism.

To compare the execution plan I expected with the plan PostgreSQL chose, I set the parameter max_parallel_workers_per_gather to 0 and re-ran the SQL statement.

set max_parallel_workers_per_gather = 0;

--I re-ran the query and here is the resulting execution plan.

Hash Right Join (actual time=100.080..119.375 rows=54501 loops=1)
   Hash Cond: (b.ord_id = a.ord_id)
   Buffers: shared hit=3304 read=18847, temp read=903 written=903
   ->  Bitmap Heap Scan on delivery b (actual time=1.374..4.277 rows=16000 loops=1)
         Recheck Cond: (((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date)) OR ((ve
hicle_id)::text > '990'::text))
         Heap Blocks: exact=2182
         Buffers: shared hit=2919
         ->  BitmapOr (actual time=1.108..1.109 rows=0 loops=1)
               Buffers: shared hit=737
               ->  Bitmap Index Scan on delivery_x01 (actual time=0.809..0.810 rows=7000 loops=1)
                     Index Cond: ((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date)
)
                     Buffers: shared hit=726
               ->  Bitmap Index Scan on delivery_x02 (actual time=0.298..0.298 rows=9000 loops=1)
                     Index Cond: ((vehicle_id)::text > '990'::text)
                     Buffers: shared hit=11
   ->  Hash (actual time=98.373..98.374 rows=54501 loops=1)
         Buckets: 32768  Batches: 4  Memory Usage: 2331kB
         Buffers: shared hit=384 read=18847, temp written=697
         ->  Seq Scan on ords a (actual time=0.122..85.072 rows=54501 loops=1)
               Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))
               Rows Removed by Filter: 945499
               Buffers: shared hit=384 read=18847
 Planning:
   Buffers: shared hit=12
 Planning Time: 0.232 ms
 Execution Time: 120.843 ms

By using Bitmap Index Scan and BitmapOr operations I could drop the number of block I/Os, but the execution time increased from 103 ms to 120 ms. It seems that the parallelism is the main factor of the execution time gap. So I infer that if parallelism kicks in in the Bitmap Index Scan operation, the query would become faster.

Finally, My question is: How can I make the Bitmap Index Scan operation parallelized?

The following is the execution plan I want to get.

Gather
  Workers Planned: 2
  Workers Launched: 2
  Parallel Hash Right Join
  ->  Hash Cond: (b.ord_id = a.ord_id)
    -> Parallel Bitmap Heap Scan on delivery
       -> BitmapOr
          -> Parallel Bitmap Index Scan on delivery_x01
          -> Parallel Bitmap INdex Scan on delivery_x02
    -> Parallel Hash
       -> Parallel Seq Scan on ords

jjanes · Answer 1 · 2021-06-06T16:03:50.537

As far as I know, there is no such thing as a Parallel Bitmap Index Scan. So in order to get it to use one of those, the first thing you would need to do is implement it. But you would need some kind of aggregator node sitting on top of it. It is not immediately clear to me how to implement that in a correct and performant way.

Now if you did implement this, it would probably not make much difference. I think the fundamental problem lies with you representing ranges as separate start and end points, rather than as actual ranges. PostgreSQL supports range types as first-class entities, and using them solves many problems.

score 0 · Answer 2 · answered Jun 06 '21 at 04:46

If I run, SET enable_seqscan=0 to force it and rerun your SELECT, I get a parallel bitmap scan.

                                                                     QUERY PLAN                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Gather (actual time=105.482..119.050 rows=54501 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=3002 read=26091 written=6, temp read=1194 written=1264
   ->  Parallel Hash Left Join (actual time=99.531..102.736 rows=18167 loops=3)
         Hash Cond: (a.ord_id = b.ord_id)
         Buffers: shared hit=3002 read=26091 written=6, temp read=1194 written=1264
         ->  Parallel Index Scan using ords_pk on ords a (actual time=0.097..65.549 rows=18167 loops=3)
               Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))
               Rows Removed by Filter: 315166
               Buffers: shared hit=2911 read=21661 written=6
         ->  Parallel Hash (actual time=29.389..29.391 rows=5333 loops=3)
               Buckets: 32768  Batches: 8  Memory Usage: 576kB
               Buffers: shared hit=7 read=4430, temp written=280
               ->  Parallel Bitmap Heap Scan on delivery b (actual time=23.930..27.802 rows=5333 loops=3)
                     Recheck Cond: (((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date)) OR ((vehicle_id)::text > '990'::text))
                     Heap Blocks: exact=973
                     Buffers: shared hit=7 read=4430
                     ->  BitmapOr (actual time=28.203..28.205 rows=0 loops=1)
                           Buffers: shared hit=3 read=2252
                           ->  Bitmap Index Scan on delivery_x01 (actual time=26.972..26.972 rows=7000 loops=1)
                                 Index Cond: ((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date))
                                 Buffers: shared read=2227
                           ->  Bitmap Index Scan on delivery_x02 (actual time=1.229..1.229 rows=9000 loops=1)
                                 Index Cond: ((vehicle_id)::text > '990'::text)
                                 Buffers: shared hit=3 read=25
 Planning Time: 0.748 ms
 Execution Time: 120.634 ms
(28 rows)

The big problem there, is it's still faster with the seqscan, albiet not by much

                                                                  QUERY PLAN                                                                   
-----------------------------------------------------------------------------------------------------------------------------------------------
 Gather (actual time=96.129..109.602 rows=54501 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   Buffers: shared hit=14282 read=23295, temp read=1199 written=1248
   ->  Parallel Hash Left Join (actual time=91.822..95.166 rows=18167 loops=3)
         Hash Cond: (a.ord_id = b.ord_id)
         Buffers: shared hit=14282 read=23295, temp read=1199 written=1248
         ->  Parallel Seq Scan on ords a (actual time=0.042..31.783 rows=18167 loops=3)
               Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))
               Rows Removed by Filter: 315166
               Buffers: shared hit=14126 read=5105
         ->  Parallel Hash (actual time=56.069..56.070 rows=5333 loops=3)
               Buckets: 32768  Batches: 8  Memory Usage: 608kB
               Buffers: shared hit=6 read=18176, temp written=256
               ->  Parallel Seq Scan on delivery b (actual time=0.098..54.431 rows=5333 loops=3)
                     Filter: (((start_date <= '2021-07-12'::date) AND (end_date >= '2021-07-10'::date)) OR ((vehicle_id)::text > '990'::text))
                     Rows Removed by Filter: 328000
                     Buffers: shared hit=6 read=18176
 Planning Time: 0.764 ms
 Execution Time: 111.118 ms
(20 rows)

So the real question is why do you want this? And given a scenario where the parallel bitmap scan was faster, would PostgreSQL not naturally choose it?

In PostgreSQL, how can I make Bitmap Index Scan parallelized?

2 Answers2