3

I have a table with the following schema:

AccountId: int
StartDateTime: datetime2(0)
EndDateTime: datetime2(0)

I want to be able to query this for rows with a certain AccountId, and whose date range intersects with a specified date range. E.g:

SELECT * 
FROM ... AS d
WHERE d.AccountId = @AccountId
    AND d.StartDateTime <= @EndDateTime
    AND d.EndDateTime >= @StartDateTime

This will be a very large table and I'll want to be able to query it as efficiently as possible. I've specified a single composite index of AccountId, StartDateTime, and EndDateTime.

Will this index, and the above query work effectively together? If not, what would be a better alternative? Thanks.

Barguast
  • 341
  • 3
  • 6

2 Answers2

2

If the intervals within a particular account do not overlap with each other, you can replace your query with a TOP (1) query, to find the last start date before your desired end date. Do the same but opposite to get your earliest end date after your desired start date. These will form a small range you can use to check.

If they do overlap within a particular account, then you have one of the classically hard problems, and I'd recommend you read the material that Dejan Sarka has written at http://solidqblogs.azurewebsites.net/en/businessanalytics/interval-queries-in-sql-server-part-1/ (and its follow ups).

The problem is that an indexes on either startdate or enddate are less than ideal, because either one alone doesn't give a clue about whether the interval reaches your desired range. Another strategy is useful instead, such as indexing values within the range (lots of storage, and you have to consider the granularity), or come up with a creative solution such as interval trees or spatial analytics.

Rob Farley
  • 16,324
  • 2
  • 39
  • 61
0

Generally, as previous comments indicate this index looks to be the correct one.

Few notes -

  • You may want to consider using a clustered index, assuming those are the typical queries you'll be running on the table
  • If the planned results are very large (say, 1m rows for an
    account/date-range) consider applying some additional filtering (for example, top x,paging etc) as even if the query/index are optimal -
    returning huge chunk of data takes its own time & resources.
0xAD
  • 21
  • 3