2

I have this select query:

SELECT ITEM.ID ITEM_ID,
       ITEM.CODE ITEM_CODE,
       ITEM.DESCRIPTION ITEM_NAME,
       ITEM.CODE_CATEGORY_ID CATEGORY_ID,
       ITEM_CAT.CODE CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM   B1.SP_ITEM_MF ITEM
       INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT
            ON  ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID
            AND ITEM_CAT.RECORD_TYPE = 'CD'
            AND ITEM_CAT.ACTIVE = 'Y'
       INNER JOIN [A1].SP_REQ_CATEGORY_MAPPING MAP
            ON  MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
       INNER JOIN B1.SP_ADMIN_STRUCTURE REQ_CAT
            ON  REQ_CAT.ID = MAP.REQ_CATEGORY_ID
            AND REQ_CAT.RECORD_TYPE = 'RQ'
            AND REQ_CAT.ACTIVE = 'Y'
            AND REQ_CAT.CODE IN ('CRW', 'SCHF')
ORDER BY
       ITEM.CODE
OFFSET 20 ROWS
     FETCH NEXT 20 ROWS ONLY;

Which is returning 20 records and taking more than 5 minutes to complete execution.

When I remove the FETCH NEXT 20 ROWS ONLY it returns 2000 records and complete execution in just 0 seconds.

Also, if I rewrite the query to insert the result set into a temp table (added below) and then use OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY on that result set that also complete execution in 0 seconds.

CREATE TABLE #TempPagination (
    RowNum INT,
    ITEM_ID BIGINT,
    ITEM_CODE VARCHAR(64),
    ITEM_NAME VARCHAR(256),
    CATEGORY_ID INT,
    CATEGORY_CODE VARCHAR(8),
    CATEGORY_NAME VARCHAR(64)
);

INSERT INTO #TempPagination ( ITEM_ID, ITEM_CODE, ITEM_NAME, CATEGORY_ID, CATEGORY_CODE, CATEGORY_NAME) SELECT

ITEM.ID ITEM_ID,ITEM.CODE  ITEM_CODE ,ITEM.DESCRIPTION ITEM_NAME,ITEM.CODE_CATEGORY_ID CATEGORY_ID,ITEM_CAT.CODE CATEGORY_CODE,ITEM_CAT.NAME CATEGORY_NAME

FROM B1.SP_ITEM_MF ITEM INNER JOIN B1.SP_ADMIN_STRUCTURE ITEM_CAT ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID AND ITEM_CAT.RECORD_TYPE ='CD' and ITEM_CAT.ACTIVE='Y'

INNER JOIN A1.SP_REQ_CATEGORY_MAPPING MAP ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID

INNER JOIN B1.SP_ADMIN_STRUCTURE  REQ_CAT ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID and REQ_CAT.RECORD_TYPE ='RQ' and REQ_CAT.ACTIVE='Y' AND REQ_CAT.CODE IN ('CRW','SCHF');

SELECT ITEM_ID, ITEM_CODE, ITEM_NAME, CATEGORY_ID, CATEGORY_CODE, CATEGORY_NAME FROM #TempPagination order by ITEM_CODE OFFSET 20 ROWS FETCH NEXT 20 ROWS ONLY;

DROP TABLE #TempPagination;

Can someone explain why this happening? I have indexes on the columns and the version I am using is 2019.

  1. Why does the query take more time without the temp table, and not taking more time when inserted into temp table first?

  2. Why is FETCH NEXT 20 ROWS ONLY causing so much performance degradation?


Here are the plans for both queries, I changed the 20 rows to 1 rows

This is a big database, and there are too many indexes for each table (at least 20 for each) to list, that might be added based on different queries. Even if you can't pinpoint the issue, a general understanding of what might be causing the performance issues will be helpful.

17,412,610 & 52,697,525 this is the physical and logical read count on the first query. The rewritten query has 7 & 10,438 respectively.

Paul White
  • 94,921
  • 30
  • 437
  • 687
Subin Benny
  • 77
  • 1
  • 9

2 Answers2

3

Can someone explain why this happening? I have indexes on the columns and the version I am using is 2019.

Why does the query take more time without the temp table, and not taking more time when inserted into temp table first?

The difference in performance is not related to the presence of a temp table or an insert operation, it is due to SQLserver using different query plans for the select with or without the FETCH NEXT 20 ROWS ONLY (which is LIMIT 20 in proper SQL language).

Without FETCH NEXT 20 ROWS ONLY, SQLserver estimates it will have to output 740903 rows and creates a plan which starts by filtering ITEM_CAT for the records that satisfy the filter conditions and then proceeds to seek the ITEMs that belongs to such categories. It estimates that a total of 9900 items will satisfy the conditions, but they are actually 2553. Then it proceeds to match the MAP and REQ_CAT tables to those items and outputs those (either to screen or into a temporay table). It estimates that the total number of records, due to the joins, will be 740903, but it finds that they are still just 2553.

If you insert those 2553 records in a temporary table, taking the next 20 after the first 20 is very fast, even if it has to sort them by ITEM.CODE.

Why is FETCH NEXT 20 ROWS ONLY causing so much performance degradation?

Because now SQLServer knows that it will have to output just 40 records (first 20 to discard and next 20 to output) and estimates that it will be faster to just scan the items in ITEM.CODE order and get the first 40 that satisfy the conditions. It expects to find them after checking about 62 rows, but unfortunatley, is has to check 2945530 ITEMs, looking up the categories for each one, before finding 40 that satisfy the filtering conditions.

So, the problem is that SQLserver is wrongly estimating the selectiveness of the filtering conditions. This could be due to incorrect statistics on the tables and indexes, but could also be due to correlations in your data that cannot be gathered with statistics (for example, all the items in the required categories have high codes and come last in code order).

Andrea B.
  • 1,731
  • 8
  • 13
-3

First of all, do not use the join predicate to filter. Filtering is the RESTRICTION operation in the relational algebrae and is not a JOIN...

Second use the keyword AS to separatee objetx names and aliasses, this make the code much more easy to read and avoid some trouble in execution...

Third avoid using SQL keword for SQL identifier like columsn name (DESCRIPTION, NAME...)

Fourth indent your code to be more readable...

Your query corectly indented :

SELECT ITEM.ID AS ITEM_ID,
       ITEM.CODE AS ITEM_CODE , 
       ITEM.DESCRIPTION AS ITEM_NAME,
       ITEM.CODE_CATEGORY_ID AS CATEGORY_ID,
       ITEM_CAT.CODE AS CATEGORY_CODE,
       ITEM_CAT.NAME CATEGORY_NAME
FROM   B1.SP_ITEM_MF AS ITEM
       INNER JOIN B1.SP_ADMIN_STRUCTURE AS ITEM_CAT 
          ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID 
       AND ITEM_CAT.RECORD_TYPE ='CD' and ITEM_CAT.ACTIVE='Y'
       INNER JOIN [A1].SP_REQ_CATEGORY_MAPPING MAP 
          ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
       INNER JOIN B1.SP_ADMIN_STRUCTURE  REQ_CAT 
          ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID 
       and REQ_CAT.RECORD_TYPE ='RQ' and REQ_CAT.ACTIVE='Y' AND REQ_CAT.CODE IN ('CRW','SCHF')
ORDER  BY ITEM.CODE 
       OFFSET 20 ROWS 
       FETCH NEXT 20 ROWS ONLY;

Your query properly writted :

SELECT ITEM.ID AS ITEM_ID,
       ITEM.CODE AS ITEM_CODE , 
       ITEM."DESCRIPTION" AS ITEM_NAME,
       ITEM.CODE_CATEGORY_ID AS CATEGORY_ID,
       ITEM_CAT.CODE AS CATEGORY_CODE,
       ITEM_CAT."NAME" AS CATEGORY_NAME
FROM   B1.SP_ITEM_MF AS ITEM
       INNER JOIN B1.SP_ADMIN_STRUCTURE AS ITEM_CAT 
          ON ITEM.CODE_CATEGORY_ID = ITEM_CAT.ID 
       INNER JOIN A1.SP_REQ_CATEGORY_MAPPING AS MAP 
          ON MAP.ITEM_CATEGORY_ID = ITEM.CODE_CATEGORY_ID
       INNER JOIN B1.SP_ADMIN_STRUCTURE AS REQ_CAT 
          ON REQ_CAT.ID = MAP.REQ_CATEGORY_ID 
WHERE  ITEM_CAT.RECORD_TYPE ='CD' and 
       ITEM_CAT.ACTIVE='Y' and 
       REQ_CAT.RECORD_TYPE ='RQ' and 
       REQ_CAT.ACTIVE='Y' AND 
       REQ_CAT.CODE IN ('CRW','SCHF')
ORDER  BY ITEM.CODE 
       OFFSET 20 ROWS 
       FETCH NEXT 20 ROWS ONLY;

Now I should think that the best indexes to have performances for your query will be :

CREATE INDEX X001 ON B1.SP_ITEM_MF (CODE_CATEGORY_ID, CODE) INCLUDE (ITEM_ID, "DESCRIPTION")
CREATE INDEX X002 ON B1.SP_ADMIN_STRUCTURE (RECORD_TYPE, ACTIVE, ID) INCLUDE (CODE, "NAME")
CREATE INDEX X003 ON A1.SP_REQ_CATEGORY_MAPPING (ITEM_CATEGORY_ID, REQ_CATEGORY_ID)
CREATE INDEX X004 ON B1.SP_ADMIN_STRUCTURE (RECORD_TYPE, CODE, ACTIVE, ID)

So try it !

SQLpro
  • 550
  • 2
  • 8