3

I am modelling an application where events for factories can be registered. There are usually a small amount of factories (around 500), but each factory can register a large set of events (millions). It's impossible that two events overlap for the same factory.

My idea was to introduce an exclusion constraint based on a GIST index:

CREATE TABLE event
(
   identifier SERIAL NOT NULL PRIMARY KEY
   factory_identifier INTEGER NOT NULL REFERENCES factory,
   period TSTZRANGE NOT NULL,

EXCLUDE USING gist ( factory_identifier WITH =, period WITH && ) );

In the documentation I read:

A GiST index will be relatively ineffective if its first column has only a few distinct values, even if there are many distinct values in additional columns.

I am not sure how to interpret this. Given that there are only a small amount of factories but a large number of events, should I define the exclusions constraint as following instead?

EXCLUDE USING gist
(
   period             WITH &&
   factory_identifier WITH =,
)

Now, the period is the first column, for which there are a lot of distinct values (almost all values are unique, actually), in contrast to factory_identifier, for which there only a few distinct values.

Is this better now? In short, I have difficulties to grasp the quoted documentation above.


additional

In the slides on a presentation of exclusion constraints on page 16 and a blog post, both coming from the original author of exclusion constraints, I found the following example:

EXCLUDE USING GIST(room WITH =, during WITH &&)

Given the (reasonable) assumption that there much more distinct reservation periods than distinct rooms, I wonder if this should have been (during WITH &&, room WITH =) instead, given the quote regarding GIST column ordering above.

This makes me believe that I am not really understanding the quoted documented above.

ItIsJustMe
  • 33
  • 3

0 Answers0