0

I've recently been working with decently sized tables (between 100s of million of rows to 10s of billions), and have ran into a few cases where I was forced into a corner to use query hints, which I used to think were bad practice but since learned they have their place (mostly the ForceSeek and ForceScan hints).

I've finally ran into a query where no matter what I do, I can't seem to get it to use the better index naturally (which is the primary key clustered index).

The query is pretty simple:

SELECT A.Field1, B.Field2
FROM TableA AS A WITH (FORCESEEK)
INNER JOIN TableB AS B WITH (FORCESEEK)
    ON A.PrimaryKeyClusteredIndexedField = B.PrimaryKeyClusteredIndexedField
WHERE B.NonClusteredIndexedField = SomeBooleanValue -- SomeBooleanValue = 1 (True)

I run this query across many databases (about 500 - this is just how our schema is designed) and in the relatively smaller databases it does default to using the Primary Key Clustered Index in the query plan. In a couple one-off much larger databases it instead tries to do an Index Seek on the Nonclustered Index field "B.NonClusteredIndexedField" which also then requires a Key Lookup.

Not sure why it would pick the Nonclustered Index over the Primary Key Clustered Index in this case, but the Key Lookup then severely impacts the performance of the query. Using an Index Hint on the TableB so that the code becomes INNER JOIN TableB AS B WITH (FORCESEEK, INDEX(IX_PrimaryKeyClusteredIndex)) fixes this problem but feels dirty to me. Is it ok to use in one-off cases like this, especially when it's an Index Hint to the Primary Key Clustered Index which is unlikely to change?

J.D.
  • 40,776
  • 12
  • 62
  • 141

4 Answers4

3

It's OK to do this only when a better solution doesn't exist.

As David Browne - Microsoft commented:

If you were writing in another language you wouldn't think twice about telling the computer not just what to do by how to do it. That's all an optimizer hint is.

Paul White
  • 94,921
  • 30
  • 437
  • 687
Tony Hinkle
  • 8,062
  • 1
  • 24
  • 46
2

On that particular query you could force order or explicitly use loop join. But that case FORCESEEK hint ON TableA has no sense:

SELECT A.Field1, B.Field2
FROM TableA AS A
INNER LOOP JOIN TableB AS B WITH (FORCESEEK)
    ON A.PrimaryKeyClusteredIndexedField = B.PrimaryKeyClusteredIndexedField
WHERE B.NonClusteredIndexedField = SomeBooleanValue

SELECT A.Field1, B.Field2
FROM TableA AS A
INNER JOIN TableB AS B WITH (FORCESEEK)
    ON A.PrimaryKeyClusteredIndexedField = B.PrimaryKeyClusteredIndexedField
WHERE B.NonClusteredIndexedField = SomeBooleanValue
OPTION(FORCE ORDER)

To use a clustered index in the query hint I would suggest use its number instead of its name:

SELECT A.Field1, B.Field2
FROM TableA AS A (FORCESEEK)
INNER LOOP JOIN TableB AS B WITH (FORCESEEK, INDEX(1))
    ON A.PrimaryKeyClusteredIndexedField = B.PrimaryKeyClusteredIndexedField
WHERE B.NonClusteredIndexedField = SomeBooleanValue

I would normally agree that the index name is better if we are talking about any index except the clustered one. I have seen such different names for clustered indexes. It's not rare for a name not to be at all self-documenting. Using index 1 you're at least sure of using the clustered index.

Paul White
  • 94,921
  • 30
  • 437
  • 687
Denis Rubashkin
  • 2,196
  • 8
  • 14
1

This is interesting, I have recently asked a question on dba.stackexchange where suitable indexes were not being used. One of the answers uses a FORCESEEK hint to achieve a more efficient plan which uses an index seek that the optimiser did not manage on its own.

Whilst this is a code change, it still (arguably) retains the readability of the original query but gives the performance of using the index seeks, without having to completely refactor the query.

It seems like for whatever reason, the optimiser doesn't want to use the most efficient plan. In this case, I would be happy to promote to production.

SE1986
  • 2,142
  • 4
  • 30
  • 61
0

For long time I did believe that using Index hints were bad, but I found out how Query Optimizations work and its absolutely correct to use index hints.

Here are few points where Optimizer skips indexes, however it looks like these are hard coded rules which have some exceptions.

  1. OR, whenever a query has OR, optimizer skips indexes with filters, specifically COLUMN NOT NULL. Now Developer might have stored information correctly that NULL will not be part of the result, but optimizer doesn't know this.

  2. OR between columns of different joined tables. Again, if you look at the details of #1, you will get the idea that why optimizer skips index. I used to believe that optimizer will choose index if we perform INNER JOIN, but no, I have seen optimizer skip indexes even with INNER JOIN.

  3. RANGE, BETWEEN, these are two things for which optimizer will not chose an index, will go with full scan with lowest number of rows. Lets assume a simple example, I had a table with millions of rows, I was looking for only 100 rows starting from Key1 to Key2. However, the difference between Key1 and Key2 is only 100 rows, using query like Key between @Key1 and @Key2 should use index, but again Optimizer has no idea, that result is smaller or not. Even after adding TOP 100 optimizer would still not use the index correctly. Forcing SEEK on the Key made it return results faster.

In a simple way, Optimizer never understands any logic, it only takes decision based on statistics, only developer will understand the logic and can use hints correctly. Now we shouldn't blame optimizer, because understanding logic is more complicated than choosing index based on statistics.

Optimizer has higher priority to return correct data, it is developer's responsibility to write correct logic.

So it is absolutely correct to use hints in production.

NA.
  • 111
  • 1
  • 6