2

I have a table with a primary key, timestamp, value, foreign key. I would like to find the nearest value to a timestamp for each of the foreign keys in a list.

Unfortunately, the gap from the given timestamp to the first value timestamp for one foreign key may be many times this gap for another FK. I've had a go doing something like this, but it only gives me one row, rather than one row per fk.

SELECT *
FROM   data_item
WHERE  fk_fc_id IN (35246,35247) 
AND    di_timestamp > '2013-11-01 00:00:00'
ORDER  BY di_timestamp ASC
LIMIT  1; 

It's like I need to take the limit on a per FK basis, but can't work out how to do that.

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633
James
  • 521
  • 1
  • 7
  • 15

2 Answers2

2

You can use window functions to achieve your goal. lag() and lead() are ones which can help you in a query like

SELECT lag(di_timestamp) OVER ordering, lead(di_timestamp) OVER ordering
  FROM data_item
 WHERE fk_fc_id IN (35246,35247)
WINDOW ordering AS (ORDER BY di_timestamp 
                    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);

This will return the previous and the next timestamps, if there is any.

András Váczi
  • 31,778
  • 13
  • 102
  • 151
1

To get one row for each fk_fc_id listed that comes after the given timestamp and is closest to it .. best use DISTINCT ON, a Postgres-specific extension of the standard SQL DISTINCT:

SELECT DISTINCT ON (fk_fc_id)
       *
FROM   data_item
WHERE  fk_fc_id IN (35246,35247)
AND    di_timestamp > '2013-11-01 00:00:00' 
ORDER  BY fk_fc_id, di_timestamp;

If you need the result sorted by di_timestamp, wrap it in a subquery:

SELECT *
FROM  (
   SELECT DISTINCT ON (fk_fc_id)
          *
   FROM   data_item
   WHERE  fk_fc_id IN (35246,35247)
   AND    di_timestamp > '2013-11-01 00:00:00' 
   ORDER  BY fk_fc_id, di_timestamp
   ) sub
ORDER  BY di_timestamp;

Or you could use the window function row_number() (probably slower, standard SQL):

SELECT *
FROM  (
   SELECT *, row_number() OVER (PARTITION BY fk_fc_id
                                ORDER BY di_timestamp) AS rn
   FROM   data_item
   WHERE  fk_fc_id IN (35246,35247)
   AND    di_timestamp > '2013-11-01 00:00:00' 
   ORDER  BY fk_fc_id, di_timestamp;
  ) sub
WHERE  rn = 1
ORDER  BY di_timestamp;

There are more ways. More details in this related answer:
How do I efficiently get "the most recent corresponding row"?

Even more under this related question on SO:
Select first row in each GROUP BY group?

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633