Why do I get an index spool on a heap in scalar subquery?

Question

I was reading this And I did the following:

go
create table fiirst (
col1 int,
col2 int
);
create table seecond(
col1 int,
col2 int
);
with
n1(c) as (select 0 union all select 0 ),
n2(c) as ( select 0 from n1 as t1 cross join n1 as t2),
n3(c) as ( select 0 from n2 as t1 cross join n2 as t2),
n4(c) as (select 0 from n3 as t1 cross join n3 as t2),
ids(id) as (select ROW_NUMBER() over (order by (select null)) from n4)
insert into fiirst(col1,col2)
select id,id 
from ids;
with
n1(c) as (select 0 union all select 0 ),
n2(c) as ( select 0 from n1 as t1 cross join n1 as t2),
n3(c) as ( select 0 from n2 as t1 cross join n2 as t2),
n4(c) as (select 0 from n3 as t1 cross join n3 as t2),
ids(id) as (select ROW_NUMBER() over (order by (select null)) from n4)
insert into seecond(col1,col2)
select id,id
from ids;
----Craig Freedman's query
select *
from fiirst
where fiirst.col1 > (
    select min(seecond.col1)
    from seecond
    where seecond.col2 < fiirst.col2
);

And I got an index spool, even though the table is a heap. The question is, how did this happen? Why do I get an index spool on a heap? In the example mentioned in the above link, there was no rows, so no spools, but here I see them?

score 3 · Accepted Answer · answered Apr 13 '25 at 14:08

An eager index spool is SQL Server creating an index during query execution. It's often a sign that the table on the inner side of a nested loops join is missing a helpful index, though sometimes it's an optimizer miscue and can be remedied with a FORCESEEK HINT.

I think your confusion is around the table being a heap - having no indexes - and either not seeing a table spool, or thinking that the type of spool indicates the format of the data source for the spool.

Spools do not define the format of the data source for the spool, but rather the format of the data when it reaches its destination in the spool.

Similarly, data from a clustered or non-clustered index may end up in a table spool. A great undoing!

If you look at the tool tip for the eager index spool, you'll see the basic definition of the index that would replace the need for a spool in the plan, though it's not terribly difficult to figure out just looking at the query itself.

CREATE INDEX 
    iindeexspooool
ON dbo.seecond
    (col2) /*Seek predicate*/
INCLUDE
    (col1); /*Output list*/

In your case, the optimizer has no choice but to use a Nested Loops join because your inner correlation has no equality predicate:

select *
from fiirst
where fiirst.col1 > (
    select min(seecond.col1)
    from seecond
    where seecond.col2 < fiirst.col2 /*Right about here*/
);

In order to use a Hash or Merge join, at least one equality predicate is required.

Why do I get an index spool on a heap in scalar subquery?

1 Answers1