Fastest way to find (rare) rows with a null value among millions of rows?

Question

If I have an addresses table with ~ 4 million rows in a Postgres database. Is there a way to optimize the following query so that I can find the few rows where address_id is NULL? Sometimes I need to write these queries to evaluate the state of the database. I'm pretty new to optimizing SQL, I'm used to working with much smaller databases.

SELECT address_id FROM addresses WHERE address_id IS NULL;

score 2 · Accepted Answer · edited Apr 13 '17 at 12:43

There is a very fast way, create a partial index:

CREATE INDEX foo_idx ON addresses (address_id)  -- arbitrary pick
WHERE address_id IS NULL;

The indexed column is largely irrelevant. Use any small column(s) that might be relevant to your queries. The important point is that only the few rows with address_id IS NULL end up in the index.

The index incurs a (very small) cost on write performance, obviously.

Very large btree index with few rows (openstreetmap gis data)

Fastest way to find (rare) rows with a null value among millions of rows?

1 Answers1