14

Among some SQL Server developers, it's a widely held belief that NOT IN is terribly slow, and queries should be rewritten so that they return the same result but do not use the "evil" keywords. (example).

Is there any truth to that?

Is there, for example, some known bug in SQL Server (which version?) that causes queries using NOT IN to have a worse execution plan than an equivalent query that uses

  • a LEFT JOIN combined with a NULL check or
  • (SELECT COUNT(*) ...) = 0 in the WHERE clause ?
Heinzi
  • 3,210
  • 2
  • 32
  • 43

1 Answers1

14

I don't think it has anything to do with being terribly slow; it has to do with being potentially inaccurate. For example, given the following data - orders that could be placed either by an individual customer, or a B2B partner:

DECLARE @Customers TABLE(CustomerID INT);

INSERT @Customers VALUES(1),(2);

DECLARE @Orders TABLE(OrderID INT, CustomerID INT, CompanyID INT);

INSERT @Orders VALUES(10,1,NULL),(11,NULL,5);

Let's say I want to find all of the customers who have never placed an order. Given the data, there's only one: customer #2. Here are three ways I might go about writing a query to find that information (there are others):

SELECT [NOT IN] = CustomerID FROM @Customers 
  WHERE CustomerID NOT IN (SELECT CustomerID FROM @Orders);

SELECT [NOT EXISTS] = CustomerID FROM @Customers AS c 
  WHERE NOT EXISTS (SELECT 1 FROM @Orders AS o
  WHERE o.CustomerID = c.CustomerID);

SELECT [EXCEPT] = CustomerID FROM @Customers
EXCEPT SELECT CustomerID FROM @Orders;

Results:

NOT IN
------
                 -- <-- no results. Is that what you expected?

NOT EXISTS
----------
2

EXCEPT
------
2

Now, there are some performance issues as well, and I talk about them in this blog post. Depending on the data and indexes, NOT EXISTS will usually outperform NOT IN, and I don't know if it could ever perform worse. You should also note that EXCEPT can introduce a distinct sort operation, so you may end up with different data (again, depending on the source). And that the popular LEFT OUTER JOIN ... WHERE right.column IS NULL pattern is always the worst performer.

Martin Smith has a lot of good supporting information in his answer on SO, too.

Aaron Bertrand
  • 181,950
  • 28
  • 405
  • 624