6

I have two datetime columns in SQL Server 2005 that I need to query without the time portion of the datetime. Currently, my query looks something resembling this (just an example):

WITH Dates AS ( 
        SELECT [Date] = @StartDate
        UNION ALL SELECT [Date] = DATEADD(DAY, 1, [Date])
        FROM Dates WHERE [Date] < @EndDate
) 
SELECT DISTINCT ID
FROM table t
CROSS APPLY DATES d
WHERE d.[Date] BETWEEN CONVERT(DATETIME, CONVERT(VARCHAR, t.StartDate, 103)) AND CONVERT(DATETIME, CONVERT(VARCHAR, t.EndDate, 103))

Now this results in a full Clustered Index Scan (surprise, surprise). I am trying to think of ways of making this faster (the actual query takes 3 mins :O). I have thought of doing the following but haven't tried any yet as I ran out of time earlier:

  1. Use a computed column with just the date part
  2. Index said computed column (Not sure if this is even possible?)
  3. Use an Indexed view (Again, not sure if this is possible, will work?)

Easiest way would be to update the column and remove all time information, but I cannot do this :(

Any ideas?

Update

Thanks for all of the answers so far. I think the point of the question was missed because I was slightly unclear on what I wanted. My bad. I was only aiming to optimise the date conversion part of the query as the amount of data i'm dealing with is tiny in reality ( < 500,000 after the cross apply with a year long date range). Sorry for the confusion on this.

For those optimising the rest of the query for me, I can see what people are saying by using < > but consider the following:

  1. The parameters that are passed in is a date range (eg. 1st to the end of this month)
  2. The start date in the table can appear before or during the parameters date range (eg. only the end date is in the date range)
  3. The end date in the table can appear during or after the parameters date range (eg. only the start date is in the date range)
  4. Lastly, the start and end date in the table is in the parameters date range.

Personally, given the above, I could never get a < > solution to work. The only way I could get it to work properly and not miss anything is by using a CTE and saying where d.[Date] BETWEEN t.StartDate AND t.EndDate.

I hope this clears things up. Thanks again.

Stuart Blackler
  • 4,540
  • 7
  • 30
  • 43

5 Answers5

5

You can do steps 1 and 2: but use the DATEADD/DATEDIFF technique as per this: How to the get current date without the time part

You will most likely be unable to index the computed column because it won't be deterministic with the varchar method

gbn
  • 70,237
  • 8
  • 167
  • 244
4

You should be able to use computed columns which you can then index. These should be deterministic as you are going off of a static value. It may depend on how you get the value into the computed column as to if it'll work.

mrdenny
  • 27,106
  • 2
  • 44
  • 81
3

I may be wrong but it looks like this has this been overcomplicated/misunderstood or just plain muddled by the CTE in your original question. From the comments you've added to various answers it appears that:

  • You have a table which includes two DateTime typed columns, StartDate and EndDate. These include time values i.e. the time portion isn't fixed to a known value such as '00:00:00'.
  • You want to find the count of records in your table which have StartDate and EndDate values that are within the range defined by the parameters @StartDate and @EndDate

If I've missed the point, you can hopefully at least use the following script for creating some test data :)

IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[MyTable]') AND type in (N'U'))
DROP TABLE [dbo].[MyTable]
GO

CREATE TABLE dbo.MyTable
(
    [id] INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
    , [StartDate] DATETIME
    , [EndDate] DATETIME
)
GO

CREATE INDEX IX_MyTable_StartDate_EndDate ON dbo.MyTable ([StartDate] ASC, [EndDate] ASC)
GO

INSERT dbo.MyTable 
(
    [StartDate]
    , [EndDate]
)
VALUES
(
    DATEADD(MI, (ABS(CHECKSUM(NEWID())) % 1339), DATEADD(DAY, -(ABS(CHECKSUM(NEWID())) % 500), GETDATE()))
    , DATEADD(MI, (ABS(CHECKSUM(NEWID())) % 1339), DATEADD(DAY, (ABS(CHECKSUM(NEWID())) % 500), GETDATE()))
)
GO 10000

Isn't this therefore a simple case of:

DECLARE 
    @StartDate CHAR(8), @EndDate CHAR(8) -- Date only values passed to procedure
    , @StartDateTime DATETIME, @EndDateTime DATETIME -- Internal DateTime values

SELECT 
    @StartDate = '20110101'
    , @EndDate = '20110831'

SELECT
    @StartDateTime = CAST(@StartDate AS DATETIME) -- '2011-01-01 00:00:00'
    , @EndDateTime = DATEADD(DAY, 1, CAST(@EndDate AS DATETIME)) -- '2011-09-01 00:00:00'

SELECT
    COUNT([id])
FROM
    dbo.MyTable
WHERE
    StartDate >= @StartDateTime
AND
    EndDate < @EndDateTime 

EDIT: I missed an obvious optimisation in the above query

SELECT
    COUNT([id])
FROM
    dbo.MyTable
WHERE
    StartDate BETWEEN @StartDateTime AND @EndDatetime
AND
    EndDate < @EndDateTime
Mark Storey-Smith
  • 31,860
  • 9
  • 90
  • 125
2

As I mentioned, I would ultimately try to get rid of the cross apply.

From what I can gather, you're going to end up with rows from the the "table" table that have a begin date and an end date that are before/after (respectively) some row in the Dates table. Because of your Cross Apply, you will be multiplying the "table" results by the number of corresponding rows in the "Dates" table. Then, because of the Distinct, you will be merging that multiplication down to a single row. (That's where I see the inefficiency.)

Why not just do this:

DECLARE @MinDate AS DATETIME
DECLARE @MaxDate AS DATETIME

SELECT 
    @MinDate = MIN(d.StartDate), 
    @MaxDate = MAX(d.EndDate)
FROM Dates d

SELECT DISTINCT ID
FROM table t
WHERE 
    DATEADD(day, DATEDIFF(day, 0, t.StartDate), 0) < @MaxDate OR
    DATEADD(day, DATEDIFF(day, 0, t.EndDate), 0) > @MinDate 

Caveat 1:

This will not work if there are dates in the "table" table that are between the minimum and maximum dates but don't surround a date in the Dates table. So, whether this works is based completely on how your CTE is built. (Since I don't have the full source, I have to presume it's built off the "table" table.

Example:

If "Table" has a row with a StartDate/EndDate of August 18/19 (respectively), but the CTE is built so that there is neither August 18th nor August 19th in the Dates resultset, then this won't work.

Caveat 2:

I'm not positive that those > and < comparisons are correct. I need sample data to validate that.

Richard
  • 1
  • 8
  • 42
  • 62
-1

I feel your pain. I wrote the following for that kind of task, I use it regularly

DECLARE @StartDate DATETIME;
DECLARE @EndDate DATETIME; -- these should be typed, no hour

SET @StartDate = '2011-01-01';
SET @EndDate = '2011-01-31';

WITH TALLY AS -- GENERATE AN ON THE FLY TALLY TABLE WITH REQUIRED AMOUNT OF ROWS
(
    SELECT TOP (DATEDIFF(DD,@StartDate,@EndDate)+1) ROW_NUMBER() OVER (ORDER BY (SELECT 1))-1 N FROM sys.objects A
),   DATES AS -- GENERATE ALL D, D+1 POSSIBILITIES, YOU CAN ADJUST TO ADD 23:59:59.999 INSTEAD OF A WHOLE DAY
(
    SELECT DATEADD(DD,N,@StartDate) StartDate,DATEADD(DD,N+1,@EndDate) EndDate FROM TALLY
)
SELECT DISTINCT ID
FROM table t
INNER JOIN DATES d ON t.YourDateWithHours BETWEEN d.StartDate AND d.EndDate -- THIS IS SARGABLE AND DETERMINISTIC, INDEXES WILL BE USED