I want to improve the performance of an SQL statement.
I am on version 13. Here are the sample codes and the query I am interested in.
drop table ords;
CREATE TABLE ords (
ORD_ID INT NOT NULL,
CUST_ID VARCHAR(10) NOT NULL,
ORD_DATE DATE NOT NULL,
ETC_CONTENT VARCHAR(100));
ALTER TABLE ords ADD CONSTRAINT ORDS_PK PRIMARY KEY(ORD_ID);
CREATE INDEX ORDS_X01 ON ORDS (CUST_ID);
INSERT INTO ORDS
SELECT i
,lpad(mod(i,1000)::text,10,'cust')
,date '2021-06-07'+mod(i,624)
,rpad('x',100,'x')
FROM generate_series(1,1000000) a(i);
drop table delivery;
CREATE TABLE delivery (
ORD_ID INT NOT NULL,
VEHICLE_ID VARCHAR(10) NOT NULL,
START_DATE DATE NOT NULL,
END_DATE DATE NOT NULL,
ETC_REMARKS VARCHAR(100));
INSERT INTO DELIVERY
SELECT i
, MOD(i,1000)
, date '2021-01-01' + mod(i,1000)
, date '2021-01-05' + mod(i,1000)
, rpad('x',100,'x')
FROM generate_series(1,1000000) a(i);
ALTER TABLE DELIVERY ADD CONSTRAINT DELIVERY_PK primary key (ORD_ID);
CREATE INDEX DELIVERY_X01 ON DELIVERY(END_DATE, START_DATE);
CREATE INDEX DELIVERY_X02 ON DELIVERY(VEHICLE_ID);
select pg_relation_size('ords'), pg_relation_size('delivery');
analyze ords;
analyze delivery;
EXPLAIN(ANALYZE, BUFFERS, COSTS OFF)
SELECT A., B.
FROM ORDS A LEFT JOIN DELIVERY B
ON (A.ORD_ID = B.ORD_ID
AND (B.START_DATE <= DATE '2021-07-12' AND B.END_DATE >= DATE '2021-07-10'
OR (B.VEHICLE_ID > '990')
)
)
WHERE A.ORD_DATE BETWEEN DATE '2021-06-01' AND DATE '2021-07-10'
;
Below is the execution plan.
Gather (actual time=86.645..101.685 rows=54501 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=13615 read=23995, temp read=1196 written=1272
-> Parallel Hash Left Join (actual time=83.360..87.135 rows=18167 loops=3)
Hash Cond: (a.ord_id = b.ord_id)
Buffers: shared hit=13614 read=23995, temp read=1196 written=1272
-> Parallel Seq Scan on ords a (actual time=0.047..34.335 rows=18167 loops=3)
Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))
Rows Removed by Filter: 315166
Buffers: shared hit=4968 read=14263
-> Parallel Hash (actual time=42.999..42.999 rows=5333 loops=3)
Buckets: 32768 Batches: 8 Memory Usage: 608kB
Buffers: shared hit=8450 read=9732, temp written=280
-> Parallel Seq Scan on delivery b (actual time=0.069..40.615 rows=5333 loops=3)
Filter: (((start_date <= '2021-07-12'::date) AND (end_date >= '2021-07-10'::date)) OR ((vehicle_id)::text > '990'::text))
Rows Removed by Filter: 328000
Buffers: shared hit=8450 read=9732
Planning:
Buffers: shared hit=20
Planning Time: 0.357 ms
Execution Time: 103.282 ms
I had expected that two Bitmap Index Scans using the delivery_x01 and delivery_x02 would appear followed by the BitmapOr operation when fetching rows from the DELIVERY table. Unlike what I thought, the planner chose to do a table scan with the parallelism.
To compare the execution plan I expected with the plan PostgreSQL chose, I set the parameter max_parallel_workers_per_gather to 0 and re-ran the SQL statement.
set max_parallel_workers_per_gather = 0;
--I re-ran the query and here is the resulting execution plan.
Hash Right Join (actual time=100.080..119.375 rows=54501 loops=1)
Hash Cond: (b.ord_id = a.ord_id)
Buffers: shared hit=3304 read=18847, temp read=903 written=903
-> Bitmap Heap Scan on delivery b (actual time=1.374..4.277 rows=16000 loops=1)
Recheck Cond: (((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date)) OR ((ve
hicle_id)::text > '990'::text))
Heap Blocks: exact=2182
Buffers: shared hit=2919
-> BitmapOr (actual time=1.108..1.109 rows=0 loops=1)
Buffers: shared hit=737
-> Bitmap Index Scan on delivery_x01 (actual time=0.809..0.810 rows=7000 loops=1)
Index Cond: ((end_date >= '2021-07-10'::date) AND (start_date <= '2021-07-12'::date)
)
Buffers: shared hit=726
-> Bitmap Index Scan on delivery_x02 (actual time=0.298..0.298 rows=9000 loops=1)
Index Cond: ((vehicle_id)::text > '990'::text)
Buffers: shared hit=11
-> Hash (actual time=98.373..98.374 rows=54501 loops=1)
Buckets: 32768 Batches: 4 Memory Usage: 2331kB
Buffers: shared hit=384 read=18847, temp written=697
-> Seq Scan on ords a (actual time=0.122..85.072 rows=54501 loops=1)
Filter: ((ord_date >= '2021-06-01'::date) AND (ord_date <= '2021-07-10'::date))
Rows Removed by Filter: 945499
Buffers: shared hit=384 read=18847
Planning:
Buffers: shared hit=12
Planning Time: 0.232 ms
Execution Time: 120.843 ms
By using Bitmap Index Scan and BitmapOr operations I could drop the number of block I/Os, but the execution time increased from 103 ms to 120 ms. It seems that the parallelism is the main factor of the execution time gap. So I infer that if parallelism kicks in in the Bitmap Index Scan operation, the query would become faster.
Finally, My question is: How can I make the Bitmap Index Scan operation parallelized?
The following is the execution plan I want to get.
Gather
Workers Planned: 2
Workers Launched: 2
Parallel Hash Right Join
-> Hash Cond: (b.ord_id = a.ord_id)
-> Parallel Bitmap Heap Scan on delivery
-> BitmapOr
-> Parallel Bitmap Index Scan on delivery_x01
-> Parallel Bitmap INdex Scan on delivery_x02
-> Parallel Hash
-> Parallel Seq Scan on ords