1

I pinpointed performance issues in queries to this exact bad estimate:

EXPLAIN ANALYSE
SELECT * FROM form_instances WHERE form_instances.made_at >= '2024-01-01 00:00:00' AND form_id = 607;
Index Scan using form_instances_made_at_form_id_index on form_instances  (cost=0.42..15.79 rows=1 width=78) (actual time=0.026..0.056 rows=86 loops=1)
  Index Cond: ((made_at >= '2024-01-01 00:00:00+00'::timestamp with time zone) AND (form_id = 607))
Planning Time: 1.579 ms
Execution Time: 0.086 ms

I tried increasing the statistics target and adding extended statistics to no success.

CREATE STATISTICS form_instances_made_at_form_id ON made_at, form_id FROM public.form_instances;
ALTER STATISTICS form_instances_made_at_form_id SET STATISTICS 10000;
ALTER TABLE form_instances ALTER COLUMN form_id SET statistics 10000;
ALTER TABLE form_instances ALTER COLUMN made_at SET statistics 10000;
ANALYSE form_instances;

It would appear that the issue is that nearly all the rows have have their made before 2024-01-01 00:00:00. What can I do to fix these estimates ?

form_instances schema:

create table form_instances
(
    uuid         uuid                     default uuid_generate_v4() not null
        primary key,
    form_id      integer                                             not null
        references forms,
    made_at      timestamp with time zone,
    created_at   timestamp with time zone default now()              not null,
    created_by   integer                                             not null
        references "Users",
    deleted_at   timestamp with time zone,
    company_id   integer                                             not null
        constraint "form_instances_ClientCompanies_id_fk"
            references "Companies",
    progression  smallint                 default 0                  not null,
    started_at   timestamp with time zone,
    linked_event uuid
        references "Events" (uuid)
);

form_instances_made_at_form_id_index definition:

create index form_instances_made_at_form_id_index
    on form_instances (made_at desc, form_id asc);

output of SELECT attnames, kinds, n_distinct, dependencies FROM pg_stats_ext WHERE statistics_name = 'form_instances_made_at_form_id';

+-----------------+-------+---------------+----------------------------------------+
|attnames         |kinds  |n_distinct     |dependencies                            |
+-----------------+-------+---------------+----------------------------------------+
|{form_id,made_at}|{d,f,m}|{"2, 3": 98487}|{"2 => 3": 0.000242, "3 => 2": 0.990116}|
+-----------------+-------+---------------+----------------------------------------+

output of SELECT attname, null_frac, n_distinct, correlation FROM pg_stats WHERE tablename = 'form_instances';

+------------+----------+--------------+------------+
|attname     |null_frac |n_distinct    |correlation |
+------------+----------+--------------+------------+
|uuid        |0         |-1            |-0.004722572|
|form_id     |0         |472           |0.881484    |
|made_at     |0         |-0.989652     |0.1547967   |
|created_at  |0         |1213          |0.89003974  |
|created_by  |0         |337           |0.12023027  |
|deleted_at  |0.99643624|78            |0.94211787  |
|company_id  |0         |70            |0.36659256  |
|progression |0         |42            |0.991062    |
|started_at  |0.99340755|-0.0065924525 |0.9549464   |
|linked_event|0.9998082 |-0.00019180775|0.05263158  |
+------------+----------+--------------+------------+

0 Answers0