How to count repeated rows taking into account other rows in the middle

Question

For SQL server 2008 R2 It seems challenging to come up with a single query to the following: Example given columns a & b:

a |   b
-------------
0 |  2000
1 |  2001
1 |  2002
1 |  2003
2 |  2004
3 |  2005
1 |  2006
1 |  2007
4 |  2008
1 |  2009

Goal: Mark rows with repeated column a and give them unique number taking into account other values in between. Result should be in column c. Note the most difficult part here is to populate column c with 2 & 5 & 7.

a |  b   |  c
-------------
0 |  2000 | 1
1 |  2001 | 2
1 |  2002 | 2
1 |  2003 | 2
2 |  2004 | 3
3 |  2005 | 4
1 |  2006 | 5
1 |  2007 | 5
4 |  2008 | 6
1 |  2009 | 7

ypercubeᵀᴹ · Answer 1 · 2016-11-23T16:56:14.317

This is a gaps-and-islands problem. One (of the many) ways to solve it (this requires 2012+ versions):

WITH 
  t AS
    ( SELECT a, b, x = CASE WHEN a = LAG(a) OVER (ORDER BY b) 
                           THEN NULL ELSE 1 
                       END
      FROM table_name
    )
SELECT a, b, c = COUNT(x) OVER (ORDER BY b) 
FROM t 
ORDER BY b ;

This should work in 2005 and above:

WITH 
  t AS
    ( SELECT a, b, dx = ROW_NUMBER() OVER (ORDER BY b) 
                        - ROW_NUMBER() OVER (PARTITION BY a ORDER BY b) 
      FROM table_name
    ),
  tt AS
    ( SELECT a, b, mb = MIN(b) OVER (PARTITION BY a, dx)
      FROM t 
    )
SELECT a, b, c = DENSE_RANK() OVER (ORDER BY mb)
FROM tt 
ORDER BY b ;

score 1 · Answer 2 · answered Nov 23 '16 at 13:20

Well, a not so elegant way could be (assuming your table is ordered by b and b is unique) :

with CTE
as
    (
    select t1.a, t1.b, isnull(MAX(t2.b), -1) as max_b
    from my_table t1
        left join my_table t2 on t1.b > t2.b and t1.a <> t2.a
    group by t1.a, t1.b
    ),
CTE_2
as
    (
    select max_b, ROW_NUMBER() over(order by max_b) as c
    from (
        select distinct max_b
        from CTE
        ) t
    )


update t
set c = c2.c
from my_table t 
join CTE c1 on t.b = c1.b
left join CTE_2 c2 on c1.max_b = c2.max_b

How to count repeated rows taking into account other rows in the middle

2 Answers2

Linked