I'm using PostgreSQL 11. I have a table with ~500к records and its size is around 2GB. If I'm running this query:
SELECT DISTINCT point.id
FROM "buildings_point" as point
LEFT OUTER JOIN "equipment_equipment" as equipment ON (point.equipment_id = equipment.id)
WHERE (
equipment.id = 4
OR point.id = 4
);
And I do have both indexes in the buildings_point table:
"buildings_point_pkey" PRIMARY KEY, btree (id)
"buildings_point_997b9956" btree (equipment_id)
For the equipment_equipment table I do have index by id:
"equipment_equipment_pkey" PRIMARY KEY, btree (id)
But the scheduler ran:
Parallel Seq Scan on buildings_point point
which took about 18 sec to walk through all 500к records. If I remove any of equipment.id = 4 or point.id = 4 query will take around 5 ms. Why does the scheduler not use both indexes?
Explain analyze output:
-> Sort (cost=175133.54..175133.59 rows=19 width=4) (actual time=19163.682..19181.748 rows=3 loops=1)
Sort Key: point.id
Sort Method: quicksort Memory: 25kB
-> Gather (cost=2307.62..175133.14 rows=19 width=4) (actual time=3958.040..19181.729 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Hash Left Join (cost=1307.62..174131.24 rows=8 width=4) (actual time=8205.581..19155.693 rows=1 loops=3)
Hash Cond: (point.equipment_id = equipment.id)
Filter: ((equipment.id = 4) OR (point.id = 4))
Rows Removed by Filter: 218243
-> Parallel Seq Scan on buildings_point point (cost=0.00..172111.33 rows=271333 width=8) (actual time=4.661..18838.772 rows=218244 loops=3)
-> Parallel Hash (cost=1116.56..1116.56 rows=15285 width=4) (actual time=52.361..52.364 rows=12229 loops=3)
Buckets: 65536 Batches: 1 Memory Usage: 1984kB
-> Parallel Index Only Scan using buildings_equipment_pkey on equipment_equipment equipment (cost=0.29..1116.56 rows=15285 width=4) (actual time=0.848..34.102 rows=12229 loops=3)
Heap Fetches: 176
Planning Time: 0.275 ms
Execution Time: 19182.162 ms