Nearest value for each foreign key

Question

I have a table with a primary key, timestamp, value, foreign key. I would like to find the nearest value to a timestamp for each of the foreign keys in a list.

Unfortunately, the gap from the given timestamp to the first value timestamp for one foreign key may be many times this gap for another FK. I've had a go doing something like this, but it only gives me one row, rather than one row per fk.

SELECT *
FROM   data_item
WHERE  fk_fc_id IN (35246,35247) 
AND    di_timestamp > '2013-11-01 00:00:00'
ORDER  BY di_timestamp ASC
LIMIT  1;

It's like I need to take the limit on a per FK basis, but can't work out how to do that.

András Váczi · Answer 1 · 2017-12-20T13:44:59.083

You can use window functions to achieve your goal. lag() and lead() are ones which can help you in a query like

SELECT lag(di_timestamp) OVER ordering, lead(di_timestamp) OVER ordering
  FROM data_item
 WHERE fk_fc_id IN (35246,35247)
WINDOW ordering AS (ORDER BY di_timestamp 
                    RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);

This will return the previous and the next timestamps, if there is any.

score 1 · Answer 2 · edited May 23 '17 at 12:40

To get one row for each fk_fc_id listed that comes after the given timestamp and is closest to it .. best use DISTINCT ON, a Postgres-specific extension of the standard SQL DISTINCT:

SELECT DISTINCT ON (fk_fc_id)
       *
FROM   data_item
WHERE  fk_fc_id IN (35246,35247)
AND    di_timestamp > '2013-11-01 00:00:00' 
ORDER  BY fk_fc_id, di_timestamp;

If you need the result sorted by di_timestamp, wrap it in a subquery:

SELECT *
FROM  (
   SELECT DISTINCT ON (fk_fc_id)
          *
   FROM   data_item
   WHERE  fk_fc_id IN (35246,35247)
   AND    di_timestamp > '2013-11-01 00:00:00' 
   ORDER  BY fk_fc_id, di_timestamp
   ) sub
ORDER  BY di_timestamp;

Or you could use the window function row_number() (probably slower, standard SQL):

SELECT *
FROM  (
   SELECT *, row_number() OVER (PARTITION BY fk_fc_id
                                ORDER BY di_timestamp) AS rn
   FROM   data_item
   WHERE  fk_fc_id IN (35246,35247)
   AND    di_timestamp > '2013-11-01 00:00:00' 
   ORDER  BY fk_fc_id, di_timestamp;
  ) sub
WHERE  rn = 1
ORDER  BY di_timestamp;

There are more ways. More details in this related answer:
How do I efficiently get "the most recent corresponding row"?

Even more under this related question on SO:
Select first row in each GROUP BY group?

Nearest value for each foreign key

2 Answers2