4

I have a table with some accounts, and their subscription's start and end dates. However, these subscriptions overlap sometimes and I need the start and end dates of every connected subscription period. Like in the example image.

I tried to merge the subscription periods with a date reference table and marked the dates if there was a subscription. However, the code was getting quite complex. I guess there must be a simpler solution.

enter image description here

IF OBJECT_ID('tempdb..#Subscriptions') IS NOT NULL DROP TABLE #Subscriptions
CREATE TABLE #Subscriptions (
    account_id varchar(1)
    ,start_date date
    ,end_date date
)

INSERT INTO #Subscriptions (account_id, start_date, end_date) values ('A','2019-06-20','2019-06-29'), ('A','2019-06-25','2019-07-25'), ('A','2019-07-20','2019-08-26'), ('A','2019-12-25','2020-01-25'), ('A','2021-04-27','2021-07-27'), ('A','2021-06-25','2021-07-14'), ('A','2021-07-10','2021-08-14'), ('A','2021-09-10','2021-11-12'), ('B','2019-07-13','2020-07-14'), ('B','2019-06-25','2019-08-26')

Paul White
  • 94,921
  • 30
  • 437
  • 687
9001_db
  • 43
  • 1
  • 6

2 Answers2

8

Directly:

WITH
cte1 AS (
    SELECT account_id, start_date the_date, 1 weight
    FROM Subscriptions
    UNION ALL
    SELECT account_id, end_date, -1
    FROM Subscriptions
),
cte2 AS (
    SELECT account_id, 
           the_date, 
           SUM(weight) OVER (PARTITION BY account_id 
                             ORDER BY the_date, weight DESC) weight
    FROM cte1
),
cte3 AS (
    SELECT account_id, 
           the_date,
           SUM(CASE WHEN weight = 0 
                    THEN 1
                    ELSE 0
                    END) OVER (PARTITION BY account_id 
                               ORDER BY the_date DESC) group_no
    FROM cte2
)
SELECT account_id, 
       MIN(the_date) start_date,
       MAX(the_date) end_date
FROM cte3
GROUP BY account_id, group_no
ORDER BY 1,2

https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=82f72d57e6c527c3ce59d166c675e48c

Akina
  • 20,750
  • 2
  • 20
  • 22
-2

Just assign value 1 to each row where the start is greater than the previous end (ie. there is a gap in dates) then you can sum the values in order to give each range a group id and get min and max of each range. Simple!

with date_ranges as
(
   select 'one' as partition_columns, cast('2023-01-01' as date) as date_start, cast('2023-01-30' as date) as date_end
   union all
   select 'one' as partition_columns, cast('2023-01-15' as date) as date_start, cast('2023-02-20' as date) as date_end
   union all
   select 'two' as partition_columns, cast('2023-02-15' as date) as date_start, cast('2023-02-28' as date) as date_end
   union all
   select 'two' as partition_columns, cast('2023-03-01' as date) as date_start, cast('2023-03-10' as date) as date_end
   union all
   select 'two' as partition_columns, cast('2023-03-10' as date) as date_start, cast('2023-03-25' as date) as date_end
)

select partition_columns, min(date_start) as date_start, max(date_end) as date_end from ( select partition_columns, date_start, date_end, sum(start_flag) over (partition by partition_columns order by date_start) range_group from ( select partition_columns, date_start, date_end, case when lag(date_end,1) over (partition by partition_columns order by date_start) >= date_start then 0 else 1 end as start_flag from date_ranges ) c ) g group by partition_columns, range_group order by 1, 2, 3

Glenn
  • 1