1

Getting the second highest value from a table has been solved many times, but I'm looking for the second highest value in each group.

Given this table:

+----+-----+
| A  |  10 |
| A  |  20 |
| A  |  35 |  <-- This record
| A  |  42 |
| B  |  12 |
| B  |  21 |  <-- This record
| B  |  33 |
| C  |  14 |
| C  |  23 |
| C  |  38 |
| C  |  41 |  <-- This record
| C  |  55 |
+----+-----+

I'd like to get the marked rows.
Pseudo-code:

select col_a, penultimate(col_b)
from foo
group by col_a;
RonJohn
  • 694
  • 2
  • 12
  • 31

2 Answers2

6

You can use window functions for this.

select col_a, col_b
from (
  select col_a, 
         col_b, 
         dense_rank() over (partition by col_a order by col_b desc) as rnk
  from the_table
) t
where rnk = 2
4

Assuming distinct values per group. So we need not break ties.
Assuming at least 2 rows per group - or the following query breaks. (You'd need to do more, starting by defining the "2nd highest value" for those cases.)

With more than a few rows per group, (and while that feature is not implemented directly, yet, as of pg 14) an emulated index skip scan will be (much) faster. Slightly tricky for taking the second highest value:

WITH RECURSIVE cte AS (
   (
   SELECT col_a, col_b
   FROM   tbl
   ORDER  BY col_a, col_b DESC
   OFFSET 1
   LIMIT  1
   )
   UNION ALL
   (
   SELECT t.col_a, t.col_b
   FROM   cte c
   JOIN   tbl t ON t.col_a > c.col_a
   ORDER  BY t.col_a, t.col_b DESC
   OFFSET 1
   LIMIT  1
   )
   )
TABLE cte;

db<>fiddle here

See:

Requires an applicable index to be fast. Like:

CREATE UNIQUE INDEX ON tbl (col_a, col_b DESC);

Postgres can scan an index backwards at practically full speed. But the combined sort order of columns cannot disagree with the query. See:

Erwin Brandstetter
  • 185,527
  • 28
  • 463
  • 633