
I have a table named "db_log" in PostgreSQL 13.7, about 11 GB in size with approximately 10 million rows. The issue is that a simple query is not using an index scan on the filter columns and takes around 15 seconds to complete. Here are the results of running the query with EXPLAIN ANALYZE:

EXPLAIN ANALYZE SELECT id, entity_id, request_json FROM db_log dl WHERE status = 'FAILED' AND db_type = 'SOME_VALUE' ORDER BY id DESC LIMIT 1;
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..1323.96 rows=1 width=673) (actual time=16343.062..16343.062 rows=0 loops=1)
   ->  Index Scan Backward using db_log_pkey on db_log dl  (cost=0.43..1915136.79 rows=1447 width=673) (actual time=16343.061..16343.061 rows=0 loops=1)
         Filter: (((status)::text = 'FAILED'::text) AND ((db_type)::text = 'SOME_VALUE'::text))
         Rows Removed by Filter: 9471684
 Planning Time: 0.429 ms
 Execution Time: 16343.079 ms
(6 rows)

I have already created an index using the following query:

CREATE INDEX idx_combine_status_dbtype_id ON public.db_log USING btree (status, db_type, id);

DDL for that table:

CREATE TABLE public.db_log (
    id bigserial NOT NULL,
    db_type varchar(30) NULL DEFAULT 'NULL::character varying'::character varying,
    entity_id varchar(200) NULL DEFAULT NULL::character varying,
    request_json text NULL,
    response_json text NULL,
    time_submitted timestamp NULL,
    time_request timestamp NULL,
    time_response timestamp NULL,
    created_by int4 NULL,
    created_date timestamp NULL,
    updated_by int4 NULL,
    updated_date timestamp NULL,
    status varchar(20) NULL DEFAULT 'NULL::character varying'::character varying,
    url_db varchar(200) NULL,
    CONSTRAINT db_log_pkey PRIMARY KEY (id)
);
CREATE INDEX combine_status_dbtype ON public.db_log USING btree (status, db_type);
CREATE INDEX combine_dbtype_entityid ON public.db_log USING btree (db_type, entity_id);
CREATE INDEX idx_combine_status_dbtype_id ON public.db_log USING btree (status, db_type, id);
CREATE INDEX idx_db_log_entity_id ON public.db_log USING btree (entity_id);
CREATE INDEX idx_db_log_status ON public.db_log USING btree (status);
CREATE INDEX idx_db_log_time_request ON public.db_log USING btree (time_request);
CREATE INDEX idx_db_log_time_response ON public.db_log USING btree (time_response);
CREATE INDEX idx_db_log_time_submitted ON public.db_log USING btree (time_submitted);
CREATE INDEX idx_db_log_db_type ON public.db_log USING btree (db_type);

What is the appropriate solution to address this issue?


2 Answers


Basic problem

Postgres walks the index on the primary key column (id) backwards, expecting to find one (!) qualifying row soon enough. Turns out, it has to skip close to 10 million rows without ever finding one:

Rows Removed by Filter: 9471684

Basic solutions

There are various ways to improve performance. Maybe column statistics are just outdated and a plain

ANALYZE db_log;

fixes the problem. You may want more aggressive autovacuum settings (for this table only) to keep it that way. Maybe increasing the statistics target for the filter columns helps:

ALTER TABLE db_log
  ALTER COLUMN status SET STATISTICS 2000  -- or similar
, ALTER COLUMN db_type SET STATISTICS 2000;  -- or similar

(And then run ANALYZE.)
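For the "more aggressive autovacuum settings (for this table only)" mentioned above, a minimal sketch using per-table storage parameters; the thresholds are illustrative assumptions, not recommendations:

ALTER TABLE db_log SET (
    autovacuum_analyze_scale_factor = 0.02  -- re-analyze after ~2 % of rows changed (default 0.1)
  , autovacuum_vacuum_scale_factor  = 0.05  -- vacuum after ~5 % dead rows (default 0.2)
);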

Maybe (also) some cost settings are inadequate ...
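Which cost settings exactly is beyond the scope of this answer. As one hedged example (an assumption about your setup, not part of the original advice): on SSD storage, random_page_cost is often lowered so the planner does not over-penalize index scans:

-- illustrative only; try per session before changing it globally
SET random_page_cost = 1.1;  -- default is 4.0, tuned for rotating disks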

Basically, the index you created on (status, db_type, id) is applicable to your query. But status and db_type are varchar(20) and varchar(30) (you might have those cheaper to begin with ...), so that index is much bigger than the one on just (id). If Postgres expects to find a qualifying row soon enough, it may still choose the plan we see.
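To illustrate the "cheaper to begin with" aside: one hypothetical option (the enum labels below are invented, not taken from your data) is to store status as an enum, which is a fixed 4 bytes instead of a variable-length string:

-- hypothetical sketch with invented labels
CREATE TYPE db_log_status AS ENUM ('PENDING', 'SUCCESS', 'FAILED');
ALTER TABLE db_log
    ALTER COLUMN status TYPE db_log_status USING status::db_log_status;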

Suspected specific problem

I suspect the core problem is this: You have some types of databases (db_type) that fail often, and others that rarely do. Postgres by default only gathers per-column statistics. It finds that status = 'FAILED' is rather common overall, and it finds that db_type = 'SOME_VALUE' is also rather common. It combines the selectivity of both predicates and reckons that walking the index on (id) - which matches your ORDER BY - will find a qualifying row soon enough to be the best plan.

What Postgres does not realize is that the combination is not common at all. The given type of DB actually never (or hardly ever) fails. So Postgres ends up walking (almost) the whole index, which turns out to be a worst-case plan.

Solution: "multivariate statistics"

Postgres 10 added "multivariate statistics", and Postgres 12 added multivariate MCV (most common values) lists. The latter is what you need here:

CREATE STATISTICS db_log_status_db_type (mcv)
ON status, db_type FROM db_log;

The next ANALYZE will gather the most common combinations of status and db_type (and their frequencies). This should enable Postgres to realize that it won't find a matching row for the given combination any time soon, and prefer a different query plan.
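To see what was gathered, you can inspect the multivariate MCV list after ANALYZE (this query follows the pattern from the Postgres documentation; the statistics name matches the CREATE STATISTICS above):

ANALYZE db_log;

SELECT m.*
FROM   pg_statistic_ext s
JOIN   pg_statistic_ext_data d ON d.stxoid = s.oid
     , pg_mcv_list_items(d.stxdmcv) m
WHERE  s.stxname = 'db_log_status_db_type';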

Partial index

If you run this kind of query often, and one or more of the filters are selective and immutable, consider a partial index like:

CREATE INDEX ON db_log (id) WHERE status = 'FAILED';

For a much smaller index, less maintenance cost, and faster query time.
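If db_type = 'SOME_VALUE' is also fixed in these queries (an assumption on my part), the index predicate could cover both filters and shrink the index even further:

CREATE INDEX db_log_failed_some_value_idx ON db_log (id)
WHERE  status = 'FAILED' AND db_type = 'SOME_VALUE';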

Brute force

If all else fails (which should not be the case), one brute force method is to "disable" the index on (id) by obfuscating the ORDER BY:

ORDER BY id + 0 DESC

The no-op id + 0 hides from Postgres that the index on (id) would be applicable, and it will take the next best option ...
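Applied to the query from the question:

SELECT id, entity_id, request_json
FROM   db_log
WHERE  status = 'FAILED'
AND    db_type = 'SOME_VALUE'
ORDER  BY id + 0 DESC  -- no-op expression keeps the planner off db_log_pkey
LIMIT  1;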


Erwin Brandstetter

When you have INDEX(a,b), you don't also need INDEX(a). So get rid of some of your indexes. This may let the Optimizer pick the 'best' index.
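Applied to the DDL in the question, these would be candidates (a sketch; verify against your actual workload before dropping anything):

-- (status) and (status, db_type) are left prefixes of (status, db_type, id)
DROP INDEX idx_db_log_status;
DROP INDEX combine_status_dbtype;
-- (db_type) is a left prefix of (db_type, entity_id)
DROP INDEX idx_db_log_db_type;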

Rick James