
Introduction

I have a PostgreSQL table setup as a queue/event-source.

I would very much like to keep the "order" of the events (even after the queue item has been processed) as a source for e2e testing.

I am starting to run into query performance slow-downs (probably because of table bloat), and I don't know how to effectively query a table on a changing key.

Initial Setup

Postgres: v15

Table DDL

CREATE TABLE eventsource.events (
    id serial4 NOT NULL,
    message jsonb NOT NULL,
    status varchar(50) NOT NULL,
    createdOn timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT events_pkey PRIMARY KEY (id)
);
CREATE INDEX ON eventsource.events (createdOn);

Scrape Query (Pseudo Code)

BEGIN;  -- Start transaction

-- Get the OLDEST 10 events that are pending.
-- I found that having a batch of work items was more performant than taking 1 at a time.
SELECT message, status
FROM   eventsource.events ee
WHERE  status = 'PENDING'
ORDER  BY ee.createdOn ASC
LIMIT  10
FOR    UPDATE SKIP LOCKED;

-- The application then uses the entries as tickets for doing work,
-- as in "I am working on these 10 items, no one else touch them."
...
UPDATE ONLY eventsource.events SET status = 'DONE' WHERE id = $id_1;
UPDATE ONLY eventsource.events SET status = 'DONE' WHERE id = $id_2;
UPDATE ONLY eventsource.events SET status = 'FAIL' WHERE id = $id_3;
UPDATE ONLY eventsource.events SET status = 'DONE' WHERE id = $id_n;
...

END;  -- Finish transaction

Rough Worker outline

Multiple workers take batches of work items from the queue, then action them and report their statuses. I want to have as little overlap as possible.

(rough worker-to-queue sketch)

Assessment

Looking at the execution plan, it appears the query has to traverse the entire table to find the records that are in 'PENDING' status.

I thought this might be because of the ORDER BY ee.createdOn ASC at first. But after reviewing the execution plan I saw that the query was traversing the entire table searching for the status, and only THEN ordering it.
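For reference, a plan like this can be captured with EXPLAIN (this is my sketch of how to inspect it; the query shape is the scrape query from above):

```sql
-- Inspect the actual plan, including buffer usage, for the scrape query.
-- ANALYZE executes the query, so the row locks are really taken; wrapping it
-- in a transaction that is rolled back releases them again immediately.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
SELECT message, status
FROM   eventsource.events ee
WHERE  status = 'PENDING'
ORDER  BY ee.createdOn ASC
LIMIT  10
FOR    UPDATE SKIP LOCKED;
ROLLBACK;
```

A "Seq Scan on events" node followed by a Sort node in the output confirms the full-table traversal described above.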

Attempt

I saw partial indexes and hoped it could reduce the search space of the queries.

CREATE INDEX ON eventsource.events (status)
WHERE status = 'PENDING'

But I think I made it worse ...

Records are being inserted with the 'PENDING' status and then almost immediately changed to 'DONE' (or 'FAIL') as the application is consuming the queue. I think this might be destroying the index every time and then recreating it from scratch after the update to the status field (probably very expensive).

Question

What is the effect of updating a partial index's key / predicate column, and (if significant) how do I effectively filter a big table on a changing key?

Index Approach

Is my index approach sound?

My first thought was Indexes but maybe partitions would be better suited here?
What happens if the partition key gets changed?
Is it just as destructive as destroying the index?

Index type

I know the default index type is a B-Tree, would a HASH index (or other) be better in this situation?

Under the hood, would changing the index key of a HASH index, result in destroying/recreating the index table the same way it does with a B-Tree?

Index creation

I am unsure what the effect is of the partial index's key vs predicate. What is the effective difference in indexing between:

CREATE INDEX ON eventsource.events (status)
WHERE status = 'PENDING'

and

CREATE INDEX ON eventsource.events (createdOn)
WHERE status = 'PENDING'

Here I am using createdOn because it is in my scrape query but I think id would work too.

Would moving the index key to a different field affect the index creation/recreation? In this instance I moved it from the status field (which will change) to the createdOn field, which won't. I don't quite understand what this SO answer implies.
And the Postgres docs are a little unclear to me about this type of partial index.

WesAtWork

3 Answers


Don't use timestamp (without time zone)

Your whole setup is prone to failure:

CREATE TABLE eventsource.events (
    ...
    createdOn timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP -- !
    ...

CURRENT_TIMESTAMP (a.k.a. now()) returns timestamptz, not timestamp.

If by chance, accident or malicious intent any session ever sets a different timezone and then inserts a row relying on the column default, you get a different (wrong) local time, breaking sort order. And you'd have a hard time finding out why. Don't do this. Especially not with such a column default. (LOCALTIMESTAMP runs into the same problem: also depends on the current timezone setting.)

Related:

Better table definition

CREATE TABLE eventsource.event (
  event_id    integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY
, message     jsonb NOT NULL
, status      text NOT NULL CHECK (status = ANY ('{PENDING,DONE,FAIL}'::text[]))  -- more?
, created_on  timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP  -- !!!
);

Use legal, lower-case identifiers if at all possible. See:

Use text and add a CHECK constraint to enforce legal states.

IDENTITY is preferable over serial in modern Postgres. See:

Most importantly, use timestamptz as explained at the top. All other points are mere recommendations.

Better index

Use a partial index, as already suggested by Charlieface:

CREATE INDEX ON eventsource.event (created_on)
WHERE status = 'PENDING';

It's radically smaller for your use case and provides sorted rows. The small index is also cheaper to maintain. However, there will be a lot of churn, so the index will bloat quickly. See:

Consider aggressive autovacuum settings for the table. Like:

ALTER TABLE eventsource.event SET (autovacuum_vacuum_scale_factor = 0.03);

The global default for autovacuum_vacuum_scale_factor is 0.2. Meaning, autovacuum is triggered after 20 % of the table rows + autovacuum_vacuum_threshold (50 by default) have been changed. If the table is big, that may be too lazy for your purpose. Find your balance between increased maintenance cost and improved query performance.

You may or may not need a full index on (created_on) additionally for other purposes.

Better approach

Assuming:

  • Current Postgres 15.
  • There can be concurrent writes (and/or concurrent locks).
  • You want to process the oldest row that has not been processed, yet. (And is not being processed concurrently by another session.)
  • The application process succeeds in the majority of cases.
BEGIN;  -- !!!

UPDATE eventsource.event
SET    status = 'DONE'
WHERE  event_id = (
        SELECT event_id
        FROM   eventsource.event
        WHERE  status = 'PENDING'
        ORDER  BY created_on
        LIMIT  1
        FOR    UPDATE SKIP LOCKED  -- !!!
        )
RETURNING *;  -- or just what you need!

-- The application then processes the entries returned by the query

-- ONLY in case of a failure !!!
-- Else just skip this:
UPDATE eventsource.event
SET    status = 'FAIL'
WHERE  event_id = $id_3;  -- your failed ID

COMMIT;

This approach works reliably under concurrent write load, and never blocks. It only locks a single row per session, minimizing chances for complications. It locks and updates the row immediately, which is faster than locking and updating later. In the rare case of a failure, you need a second update. But that's cheap in comparison.

If you need to (or just want to) lock and process multiple rows, that works in similar fashion. See:
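A multi-row variant following the same pattern might look like this (my sketch; the batch size of 10 matches the question):

```sql
-- Claim and mark a batch of up to 10 of the oldest pending events in one statement.
-- Rows already locked by concurrent workers are skipped, so batches don't overlap.
UPDATE eventsource.event e
SET    status = 'DONE'
FROM  (
   SELECT event_id
   FROM   eventsource.event
   WHERE  status = 'PENDING'
   ORDER  BY created_on
   LIMIT  10
   FOR    UPDATE SKIP LOCKED
   ) sub
WHERE  e.event_id = sub.event_id
RETURNING e.*;
```

As with the single-row form, a second UPDATE setting status = 'FAIL' is only needed for the (rare) failed IDs.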

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
  • Your attempted index
    CREATE INDEX ON eventsource.events (status)
    WHERE status = 'PENDING'
    
    does not have createdon in it, so it can't help the query very much. Even if the index is used, the server would still need to sort all the PENDING rows in order of createdon.

  • My first thought was Indexes but maybe partitions would be better suited here?

    Partitioning is irrelevant here. The right index is what you want.


  • I know the default index type is a B-Tree, would a HASH index (or other) be better in this situation?

    No, you need the results sorted by createdon to get this query to perform. Hash indexes do not give results in order.


  • The final attempt
    CREATE INDEX ON eventsource.events (createdOn)
    WHERE status = 'PENDING'
    
    is a good start. But if you want it to cover the query, you need to add INCLUDE columns. Given the width of the table, it may be worth just forgoing that and relying on a bitmap scan. You may want to limit the columns in the SELECT to avoid this.
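A covering version of that index might look like this (a sketch; INCLUDE requires Postgres 11 or later, and as noted above, the wide jsonb message column may make the bitmap-scan route preferable):

```sql
-- Partial index that also covers the SELECT list of the scrape query,
-- enabling an index-only scan when the visibility map is current.
CREATE INDEX ON eventsource.events (createdOn)
INCLUDE (message, status)
WHERE status = 'PENDING';
```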

Having said all that, it's unclear what "processing" your application does, and whether the whole thing could be done in a single UPDATE statement anyway.

Charlieface
  • Add

    CREATE INDEX ON eventsource.events (status, createdOn)
    
  • If workers can dispatch the tasks faster than the queueing mechanism can handle the query, abandon the queue. "Don't queue it, just do it."

  • When you assign tasks in a batch, or if the batch is 1 and the task takes a long time:

  1. Have a transaction (with locking) to grab the task(s) and assign them to the worker. (This means adding a "worker_id" column.) Do not perform the task while in this transaction.
  2. When finishing one task, release the task from the worker. (Eg, set worker_id = NULL.)
  3. Have a job periodically checking for orphaned tasks. This can happen if a worker dies without releasing the task.
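The claim/release/reap steps above might be sketched like this (my sketch; the worker_id and claimed_on columns and the 10-minute orphan timeout are assumptions, not part of the original schema):

```sql
-- 1. Claim a batch for this worker (short transaction; do NOT do the task work here).
UPDATE eventsource.events e
SET    worker_id = $my_worker_id, claimed_on = now()
FROM  (
   SELECT id
   FROM   eventsource.events
   WHERE  status = 'PENDING' AND worker_id IS NULL
   ORDER  BY createdOn
   LIMIT  10
   FOR    UPDATE SKIP LOCKED
   ) sub
WHERE  e.id = sub.id
RETURNING e.id, e.message;

-- 2. Release a task from the worker when it finishes.
UPDATE eventsource.events
SET    status = 'DONE', worker_id = NULL
WHERE  id = $id;

-- 3. Periodic reaper: return orphaned tasks (worker died mid-task) to the pool.
UPDATE eventsource.events
SET    worker_id = NULL
WHERE  status = 'PENDING'
  AND  worker_id IS NOT NULL
  AND  claimed_on < now() - interval '10 minutes';
```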
Rick James