0

I'm working with data that has a start datetime and end datetime column for each event which can be a short period of time (Few hours) up to a number of days in duration.

I'm trying to establish a way to check for where the following is true:

  1. the duration between the start and end datetimes is greater than 6 hours

And

  1. this duration takes place between the hours of 09:00 and 17:00 (on any of the dates within the period if it was a long duration e.g. 36hrs)

If the total duration of 6hrs is met but any of those 6hrs falls outwith these hours, then I don't want to count it.

E.g.

01/05/23 09:00 to 01/05/23 16:30 - true

01/05/23 08:30 to 01/05/23 14:30 - false

01/05/23 10:00 to 01/05/23 17:30 - true

01/05/23 11:00 to 02/05/23 01:00 - true

01/05/23 12:00 to 02/05/23 02:30 - false

I have tried setting up a basic case when to check where datediff >= 6 and check that the end time > start time (incase of any data quality issues with dates recorded) but struggling to come up with a way to check for the datediff >= 6 being met within the 09:00 and 17:00 period and tie the two conditions together.

I'm looking to add this to a large data set so looking for something that hopefully won't affect performance too much as I remember doing something before where I listed rows by the hour within a duration but it had a knock on effect on performance due to the vast amount of rows created to do this.

I was wondering if anyone has any ideas on the best way I can achieve this? Thanks in advance.

DynamoRanger
  • 13
  • 1
  • 3

2 Answers2

0

I think the solution below should get you what you need, or pretty close. The easiest way to do this is by joining onto a numbers table, which I've simulated below with a CTE. This allows you to create a row for each 'hour' between the two dates. Once you have that, it's relatively straightforward to count only the rows you're interested in.

CREATE TABLE Events (EventID int, StartDate datetime, EndDate datetime)
GO
INSERT INTO [dbo].[Events]
(
    [EventID],
    [StartDate],
    [EndDate]
)
VALUES 
(1, '20230501 09:00', '20230501 16:30')
,(2, '20230501 08:30', '20230501 14:30')
,(3, '20230501 10:00', '20230501 17:30')
,(4, '20230501 11:00', '20230502 01:00')
,(5, '20230501 12:00', '20230503 02:30');

GO

--Extend as required, or better still, use a numbers table WITH [cteNumbers] ([Number]) AS ( SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 10 UNION SELECT 11 UNION SELECT 12 UNION SELECT 13 UNION SELECT 14 UNION SELECT 15 UNION SELECT 16 UNION SELECT 17 UNION SELECT 18 UNION SELECT 19 UNION SELECT 20 UNION SELECT 21 UNION SELECT 22 UNION SELECT 23 UNION SELECT 24 )

SELECT H.[EventID], H.[StartDate], h.[EndDate], COUNT(*) AS [HoursDuringWindow] FROM ( SELECT [X].[EventID], [X].[StartDate], [X].[EndDate], DATEADD(HOUR, datediff(hour, 0, DATEADD(hh, x.[RowNo]-1, X.[StartDate])), 0) AS RoundedStart, DATEADD(HOUR, 1, DATEADD(HOUR, datediff(hour, 0, DATEADD(hh, x.[RowNo]-1, X.[StartDate])), 0)) AS RoundedEnd FROM ( -- Create a table our all our time periods SELECT [E].[EventID], E.[StartDate], E.[EndDate], ROW_NUMBER() OVER (PARTITION BY [E].[EventID] ORDER BY [E].[StartDate] ) AS [RowNo] FROM [dbo].[Events] AS [E] INNER JOIN [cteNumbers] AS [N] ON [N].[Number] <= DATEDIFF(hh, [E].[StartDate], [E].[EndDate])

                ) AS [X]
    ) AS H

WHERE CAST(H.RoundedStart AS time) >= '09:00' AND CAST(H.RoundedStart AS time) <= '17:00' AND CAST(H.RoundedEnd AS time) <= '17:00' GROUP BY H.[EventID], H.[StartDate], h.[EndDate];

FrugalShaun
  • 441
  • 3
  • 11
0

I think this would be my approach, maybe take out the hardcoded times and put them as variables at the top so the code is more maintainable

SELECT 
x.time_start,
x.time_end,
GREATEST(
    CONVERT(time, '09:00'), 
    CONVERT(time, x.time_start)
    ) as eff_start,  
LEAST   ( 
    CONVERT(time, '17:00'), 
    CONVERT(time, x.time_end)
    ) as eff_end,
DATEDIFF(   MINUTE,
    GREATEST(
        CONVERT(time, '09:00'), 
        CONVERT(time, x.time_start)
        ),  
    LEAST   ( 
        CONVERT(time, '17:00'), 
        CONVERT(time, x.time_end)
        )
    ),
IIF(
    DATEDIFF(   MINUTE,
            GREATEST(
                CONVERT(time, '09:00'), 
                CONVERT(time, x.time_start)
                ),  
            LEAST   ( 
                CONVERT(time, '17:00'), 
                CONVERT(time, x.time_end)
                )
        )
         >= 360
         , 'enough hours'
         , 'not enough hours'
 )
FROM
(
select 
    CONVERT(DATETIME, time_start) as time_start, 
    CONVERT(DATETIME, time_end) as time_end 
FROM 
    (
    VALUES 
    ('20250408 07:34','20250408 08:01'), 
    ('20250408 08:10','20250408 15:15'), 
    ('20250408 09:12','20250408 12:12' ), 
    ('20250408 09:22','20250408 15:45' ), 
    ('20250408 12:20','20250408 19:20' ), 
    ('20250408 17:34','20250408 17:44' ) 
    ) as  Mytimes (time_start, time_end) 
) as x
Stephen Morris - Mo64
  • 4,656
  • 1
  • 10
  • 18