3

I have a data set that appears like this:

next_generation_date procedure interval frequency
2021-01-17 00:00:00.000 Clean Restroom 1 day
2021-01-17 00:00:00.000 Vacuum 2 week
2021-02-01 00:00:00.000 Inspect Fire Extinguisher 3 month
2021-10-01 00:00:00.000 Test Generator 4 year

My goal is to generate multiple date rows for each procedure by going off of the next_generation_date, interval and frequency columns until a specified end date. This specified end date would be the same date throughout the entire table. In this example, let's make the specified end date 2025-12-31. My end goal is for the table to appear similar to the below. Please note since this is an example, I didn't include every row as to avoid having hundreds of rows listed below.

next_generation_date procedure interval frequency
2021-01-17 00:00:00.000 Clean Restroom 1 day
2021-01-18 00:00:00.000 Clean Restroom 1 day
2021-01-19 00:00:00.000 Clean Restroom 1 day
2021-01-20 00:00:00.000 Clean Restroom 1 day
2021-01-17 00:00:00.000 Vacuum 2 week
2021-01-31 00:00:00.000 Vacuum 2 week
2021-02-14 00:00:00.000 Vacuum 2 week
2021-02-28 00:00:00.000 Vacuum 2 week
2021-02-01 00:00:00.000 Inspect Fire Extinguisher 3 month
2021-05-01 00:00:00.000 Inspect Fire Extinguisher 3 month
2021-08-01 00:00:00.000 Inspect Fire Extinguisher 3 month
2021-11-01 00:00:00.000 Inspect Fire Extinguisher 3 month
2021-10-01 00:00:00.000 Test Generator 4 year
2025-10-01 00:00:00.000 Test Generator 4 year

To summarize the above table, future "Clean Restroom" dates are each day, "Vacuum" dates are every two weeks, "Inspect Fire Extinguisher" dates are every three months, and "Test Generator" dates are every four years.

Below are two of my most recent attempts in Microsoft SQL Server Management Studio. The first attempt creates the loop but my procedures only increase by an interval of 1, not the unique interval found within the interval column. My second attempt generates the next next_generation_date with the correct interval but doesn't loop.

CREATE TABLE ##many_integers (idx INT);
WITH units(units) AS (
SELECT 0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION 
SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9
)
,tens             AS(SELECT units *        10 AS tens             FROM units       )
,hundreds         AS(SELECT tens  *        10 AS hundreds         FROM tens        )
,thousands        AS(SELECT hundreds *     10 AS thousands        FROM hundreds    )
,tenthousands     AS(SELECT thousands *    10 AS tenthousands     FROM thousands   )
,hundredthousands AS(SELECT tenthousands * 10 AS hundredthousands FROM tenthousands)                                                                                                                                        
INSERT                                                                                                                                                                                                                      
INTO ##many_integers
SELECT  hundredthousands +tenthousands +thousands +hundreds +tens +units 
FROM       units 
CROSS JOIN tens 
CROSS JOIN hundreds 
CROSS JOIN thousands 
CROSS JOIN tenthousands 
CROSS JOIN hundredthousands ;

SELECT [procedure], [frequency], [interval], CASE [frequency] WHEN 'day' THEN DATEADD( day , idx , [next_generation_date]) WHEN 'week' THEN DATEADD( week , idx , [next_generation_date]) WHEN 'month' THEN DATEADD( month , idx , [next_generation_date]) WHEN 'year' THEN DATEADD( year , idx , [next_generation_date]) ELSE NULL END AS [next_generation_date] FROM [data].[fact] CROSS JOIN ##many_integers WHERE idx < 5000 AND CASE [frequency] WHEN 'day' THEN DATEADD( day , idx , [next_generation_date]) WHEN 'week' THEN DATEADD( week , idx , [next_generation_date]) WHEN 'month' THEN DATEADD( month , idx , [next_generation_date]) WHEN 'year' THEN DATEADD( year , idx , [next_generation_date]) ELSE NULL END <= '2023-12-31 00:00:00' ORDER BY 1 ;

SELECT [procedure], [frequency], [interval], [next_generation_date] FROM [data].[fact]
UNION ALL
SELECT [procedure], [frequency], [interval],
CASE
       WHEN [frequency] = 'day' THEN DATEADD(day, [interval], [next_generation_date])
       WHEN [frequency] = 'week' THEN DATEADD(week, [interval], [next_generation_date])
       WHEN [frequency] = 'month' THEN DATEADD(month, [interval], [next_generation_date])
       WHEN [frequency] = 'year' THEN DATEADD(year, [interval], [next_generation_date])
       ELSE NULL
END AS [next_generation_date]
FROM [data].[fact]
ORDER BY 1;

Any and all suggestions are greatly appreciated as I'm new to SQL. Thank you.

Gulfhawk
  • 31
  • 1
  • 2

1 Answers1

5

I recommend using a Calendar table. Aaron Bertand has a blog series about them

For this code sample, I'll be using my calendar table. It's quite a bit of code so this is a link to my GitHub

For the solution, I'll narrow the range from start to end. To calculate the interval I'll use the helper offset columns.

Current_offset - starting_offset % interval = 0

That's a logic shared for all frequency types. But some frequencies require additional conditions. For example with the WEEK frequency - the offset applies to a whole week, but we want to limit it to the same weekday (e.g. Sunday with Sunday). Similar for the other frequencies.

Complete solution (relies on the existence of the Calendar table)

DROP TABLE IF EXISTS #RegularTask
CREATE TABLE #RegularTask
(
    StartDate date
    , EndDate date 
    , RegularProcedure varchar(50)
    , interval tinyint
    , frequency varchar(10)
)

INSERT INTO #RegularTask (StartDate, EndDate, RegularProcedure, interval, frequency) VALUES (DATEFROMPARTS(2021,10,01), DATEFROMPARTS(2025,12,31), 'Test Generator', 4, 'YEAR') , (DATEFROMPARTS(2021,02,01), DATEFROMPARTS(2025,12,31), 'Inspect Fire Extinguisher', 3, 'MONTH') , (DATEFROMPARTS(2021,01,17), DATEFROMPARTS(2025,12,31), 'Vacuum', 2, 'WEEK') , (DATEFROMPARTS(2021,01,17), DATEFROMPARTS(2025,12,31), 'Clean Restroom', 1, 'DAY')

;WITH startPoint AS ( SELECT rt.StartDate , rt.EndDate , rt.RegularProcedure , rt.interval , rt.frequency , c.YearNum , c.MonthNum , c.DayOfMonthNum , c.DayOfWeekNum , c.DayOffset , c.WeekOffset , c.MonthOffset FROM #RegularTask AS rt JOIN dbo.Calendar AS c ON rt.StartDate = c.DateVal ) SELECT c.DateVal , sp.RegularProcedure , sp.interval , sp.frequency FROM dbo.Calendar AS c JOIN startPoint AS sp ON c.DateVal >= sp.StartDate AND c.DateVal <= sp.EndDate AND ( ( sp.frequency = 'DAY' AND (c.DayOffset - sp.DayOffset) % sp.interval = 0 ) OR ( sp.frequency = 'WEEK' AND (c.WeekOffset - sp.WeekOffset) % sp.interval = 0 AND c.DayOfWeekNum = sp.DayOfWeekNum ) OR ( sp.frequency = 'MONTH' AND (c.MonthOffset - sp.MonthOffset) % sp.interval = 0 AND c.DayOfMonthNum = sp.DayOfMonthNum

    )
    OR 
    (
        sp.frequency = 'YEAR'
        AND (c.YearNum - sp.YearNum) % sp.interval = 0
        AND c.MonthNum = sp.MonthNum
        AND c.DayOfMonthNum = sp.DayOfMonthNum
    )
)

Result sample

Zikato
  • 5,619
  • 1
  • 17
  • 34