18

Consider a table that records visits

create table visits (
  person varchar(10),
  ts timestamp, 
  somevalue varchar(10) 
)

Consider this example data (timestamp simplified as counter)

ts| person    |  somevalue
-------------------------
1 |  bob      |null
2 |  bob      |null
3 |  jim      |null
4 |  bob      |  A
5 |  bob      | null
6 |  bob      |  B
7 |  jim      |  X
8 |  jim      |  Y
9 |  jim      |  null

I'm trying to carry forward the last non-null somevalue of the person to all his future visits until that value changes (ie becomes the next non-null) value.

Expected result set looks like this:

ts|  person   | somevalue | carry-forward 
-----------------------------------------------
1 |  bob      |null       |   null
2 |  bob      |null       |   null
3 |  jim      |null       |   null
4 |  bob      |  A        |    A
5 |  bob      | null      |    A
6 |  bob      |  B        |    B
7 |  jim      |  X        |    X
8 |  jim      |  Y        |    Y
9 |  jim      |  null     |    Y

My attempt looks like this:

 select *, 
  first_value(somevalue) over (partition by person order by (somevalue is null), ts rows between UNBOUNDED PRECEDING AND current row  ) as carry_forward

 from visits  
 order by ts

Note: the (somevalue is null) evaluates to 1 or 0 for the purposes of sorting so I can get the first non-null value in the partition.

The above doesn't give me the result I'm after.

maxTrialfire
  • 1,194
  • 4
  • 11
  • 23

2 Answers2

21

The following query achieves the desired result:

select *, first_value(somevalue) over w as carryforward_somevalue
from (
  select *, sum(case when somevalue is null then 0 else 1 end) over (partition by person order by id ) as value_partition
  from test1

) as q
window w as (partition by person, value_partition order by id);

Note the null case statement - if IGNORE_NULL was supported by postgres window functions this wouldnt be needed (as mentioned by @ypercubeᵀᴹ)

maxTrialfire
  • 1,194
  • 4
  • 11
  • 23
7

The problem is in the gaps-and-islands category of problems. It's a pity that Postgres has not yet implemented IGNORE NULL in window functions like FIRST_VALUE(), otherwise it would be trivial, with a simple change in your query.

There are probably many ways for this to be solved using window functions or recursive CTEs.

Not sure if it is the most efficient way but a recursive CTE does solve the problem:

with recursive 
    cf as
    (
      ( select distinct on (person) 
            v.*, v.somevalue as carry_forward
        from visits as v
        order by person, ts
      ) 
      union all
        select 
            v.*, coalesce(v.somevalue, cf.carry_forward)
        from cf
          join lateral  
            ( select v.*
              from visits as v
              where v.person = cf.person
                and v.ts > cf.ts
              order by ts
              limit 1
            ) as v
            on true
    )
select cf.*
from cf 
order by ts ;
Michael Green
  • 25,255
  • 13
  • 54
  • 100
ypercubeᵀᴹ
  • 99,450
  • 13
  • 217
  • 306