0

I have a postgresql measurement table which is partitioned based on logdate column by month. Suppose I have city_id integer column and want to select latest row with city_id = 125.

When I run explain select * from measurement where city_id = 125 limit 1; I get

Limit  (cost=0.00..3.39 rows=1 width=12)
  ->  Append  (cost=0.00..308.25 rows=91 width=12)
     ->  Seq Scan on measurement  (cost=0.00..0.00 rows=1 width=12)
           Filter: (city_id = 125)
     ->  Seq Scan on measurement_y2006m01  (cost=0.00..34.25 rows=10 width=12)
           Filter: (city_id = 125)
     ->  Seq Scan on measurement_y2006m02  (cost=0.00..34.25 rows=10 width=12)
           Filter: (city_id = 125)
     ->  Seq Scan on measurement_y2006m03  (cost=0.00..34.25 rows=10 width=12)
           Filter: (city_id = 125)

What is the default order in which those Seq Scan actions are performed on inherited tables? And can I explicitly set it so that the latest tables are scanned first as I am expecting to select recently added rows more often?

lompy
  • 1

1 Answers1

1

I ... want to select latest row with city_id = 125.

Bold emphasis mine. Your query does not match the declared objective. Should be something like:

SELECT * FROM measurement WHERE city_id = 125 ORDER BY logdate DESC LIMIT  1;

Which should make Postgres scan the latest partition first.

Assuming logdate is defined NOT NULL. Else use ORDER BY logdate DESC NULLS LAST:

If that doesn't work for some reason (your setup and Postgres version are undisclosed), you can also spell it out:

(SELECT * FROM  measurement_y2006m03 WHERE city_id = 125 ORDER BY logdate DESC)
UNION ALL
(SELECT * FROM  measurement_y2006m02 WHERE city_id = 125 ORDER BY logdate DESC)
UNION ALL
(SELECT * FROM  measurement_y2006m01 WHERE city_id = 125 ORDER BY logdate DESC)
LIMIT  1;

Parentheses are required to make the ORDER BY apply to each SELECT instead of the whole query. Maybe you get an even better query plan by adding LIMIT 1 to each SELECT additionally. You'll have to test.

Postgres will stop scanning as soon as it has found the first row. Related answers with more explanation:

Index

For best performance you would have composite indices on (city_id, logdate DESC) on all child tables (or at least the latest ones).

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633