12

I have a table which is storing reservation data using the columns starts_at & ends_at Whenever I'm querying the table to find overlapping reservations, I have an option of using one of the following queries:

SELECT * FROM reservations
WHERE starts_at < '2014-01-03 00:00:00'
AND   ends_at   >='2014-01-01 00:00:00';

Or

SELECT * FROM reservations
WHERE tsrange(starts_at, ends_at) && ('2014-01-01 00:00:00', '2014-01-03 00:00:00')

I have regular B-Tree indices on the starts_at and ends_at columns, therefore the first query is always using them. However, unless I define a functional GiST index on the tsrange, the second query does a complete scan.

create index tsrange_idx on reservations using gist(tsrange(starts_at, ends_at)); 

My question is, as the table grows, which index is going to be faster? Probably the answer is obvious looking at the query execution plan, but I'm not well-versed with reading EXPLAIN ANALYZE output.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
Saurabh Nanda
  • 333
  • 1
  • 4
  • 16

1 Answers1

20

Timestamps with B-tree index

I suggest a third option: as long as your table holds two timestamp columns (which seem to be defined NOT NULL) I would use a single multicolumn B-tree index with opposed sort order (if no other considerations apply):

CREATE INDEX reservations_range_idx ON reservations (starts_at, ends_at DESC);

More in these related answer:

As for the query, have a look at the SQL-standard operator OVERLAPS:

SELECT * FROM reservations
WHERE (starts_at, ends_at) OVERLAPS ('2014-01-01 00:00:00', '2014-01-03 00:00:00');

More in this related question on SO:

Should be faster than two B-tree indices. Less disk space and cheaper to maintain. The burden on write operations is small.

Range type with GiST index

With big tables, a GiST index on a range type is most probably faster, because it scales better. Storage on disk is considerably bigger and index maintenance a bit more expensive, though.

If you go that route, it would be more efficient to store your timestamps as range (tsrange or tstzrange) to begin with. A plain GiST index without the functional aspect is a bit faster.

CREATE TABLE reservation (
  reservation_id serial PRIMARY KEY
, span tsrange
, ...
);

CREATE INDEX reservation_span_gist_idx on reservations USING GiST (span);

With the && "overlap" operator you already displayed in the question:

SELECT *
FROM   reservation 
WHERE  span && ('2014-01-01 00:00:00', '2014-01-03 00:00:00');

You might be interested in an exclusion constraint to rule out overlaps by design, which implements a GiST index like the one above automatically. There is a code example in the manual. This related answer on SO has more details:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633