When using a subquery to find the total count of all prior records with a matching field, the performance is terrible on a table with as little as 50k records. Without the subquery, the query executes in a few milliseconds. With the subquery, the execution time is upwards of a minute.
For this query, the result must:
- Include only those records within a given date range.
- Include a count of all prior records, not including the current record, regardless of date range.
Basic Table Schema
Activity
======================
Id int Identifier
Address varchar(25)
ActionDate datetime2
Process varchar(50)
-- 7 other columns
Example Data
Id Address ActionDate (Time part excluded for simplicity)
===========================
99 000 2017-05-30
98 111 2017-05-30
97 000 2017-05-29
96 000 2017-05-28
95 111 2017-05-19
94 222 2017-05-30
Expected Results
For the date range of 2017-05-29 to 2017-05-30
Id Address ActionDate PriorCount
=========================================
99 000 2017-05-30 2 (3 total, 2 prior to ActionDate)
98 111 2017-05-30 1 (2 total, 1 prior to ActionDate)
94 222 2017-05-30 0 (1 total, 0 prior to ActionDate)
97 000 2017-05-29 1 (3 total, 1 prior to ActionDate)
Records 96 and 95 are excluded from the result, but are included in the PriorCount subquery
Current Query
select
*.a
, ( select count(*)
from Activity
where
Activity.Address = a.Address
and Activity.ActionDate < a.ActionDate
) as PriorCount
from Activity a
where a.ActionDate between '2017-05-29' and '2017-05-30'
order by a.ActionDate desc
Current Index
CREATE NONCLUSTERED INDEX [IDX_my_nme] ON [dbo].[Activity]
(
[ActionDate] ASC
)
INCLUDE ([Address]) WITH (
PAD_INDEX = OFF,
STATISTICS_NORECOMPUTE = OFF,
SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF,
ONLINE = OFF,
ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON
)
Question
- What strategies could be used to improve the performance of this query?
Edit 1
In reply to the question of what I can modify on the DB: I can modify the indexes, just not the table structure.
Edit 2
I have now added a basic index on the Address column, but that didn't seem to improve much. I am currently finding much better performance with creating a temp table and inserting the values without the PriorCount and then updating each row with their specific counts.
Edit 3
The Index Spool Joe Obbish (accepted answer) found was the issue. Once I added in a new nonclustered index [xyz] on [Activity] (Address) include (ActionDate), the query times went down from upwards of a minute to less than a second without using a temp table (see edit 2).






