3

I have a table with roughly 11 million rows, defined as:

CREATE TABLE [sko].[stage_närvaro]( 
    [datum_fakta] [datetime] NULL, 
    [person_id] nvarchar NULL, 
    [läsår_fakta] nvarchar NULL, 
    [termin_fakta] nvarchar NULL, 
    [period_fakta] nvarchar NULL, 
    [vecka_fakta] nvarchar NULL, 
    [veckodag_fakta] nvarchar NULL, 
    [ämne_id] nvarchar NULL, 
    [ämne] nvarchar NULL, 
    [frånvaro_min] [float] NULL,
    [närvaro_min] [float] NULL, 
    [frånvaroorsak_id] nvarchar NULL,
    [frånvaroorsak] nvarchar NULL, 
    [beskrivning] nvarchar NULL, 
    [personal_id] nvarchar NULL, 
    [försystem] nvarchar NULL 
)

With the following non-clustered index:

CREATE NONCLUSTERED INDEX [stage_skola_närvaro_ix1] ON [sko].[stage_närvaro] 
(
    [person_id] ASC,
    [termin_fakta] ASC,
    [läsår_fakta] ASC
)

When I run the following delete query, it takes atleast 2+ hours to complete.

DELETE  sko.stage_närvaro
FROM    sko.stage_närvaro e
WHERE   försystem = 'Extens'
AND EXISTS (
    SELECT  *
    FROM    ext.v_imp_närvaro v
    WHERE   e.person_id = v.person_id
    AND     e.termin_fakta = v.termin_fakta
    AND     e.läsår_fakta = v.läsår_fakta
)

Is my delete-query using my index? Would it help to disable the index before deleting, and enable it afterwards?

Edit: The view ext.v_imp_närvaro has the same amount of rows as the table sko.stage_närvaro.

Edit2: I suspected that it was an I/O issue, so I ran the following query as suggested by DaniSQL here:

SELECT TOP 10
        wait_type ,
        max_wait_time_ms wait_time_ms ,
        signal_wait_time_ms ,
        wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms ,
        100.0 * wait_time_ms / SUM(wait_time_ms) OVER ( ) AS percent_total_waits ,
        100.0 * signal_wait_time_ms / SUM(signal_wait_time_ms) OVER ( ) AS percent_total_signal_waits ,
        100.0 * ( wait_time_ms - signal_wait_time_ms )
        / SUM(wait_time_ms) OVER ( ) AS percent_total_resource_waits
FROM    sys.dm_os_wait_stats
WHERE   wait_time_ms > 0 -- remove zero wait_time
        AND wait_type NOT IN -- filter out additional irrelevant waits
( 'SLEEP_TASK', 'BROKER_TASK_STOP', 'BROKER_TO_FLUSH', 'SQLTRACE_BUFFER_FLUSH',
  'CLR_AUTO_EVENT', 'CLR_MANUAL_EVENT', 'LAZYWRITER_SLEEP', 'SLEEP_SYSTEMTASK',
  'SLEEP_BPOOL_FLUSH', 'BROKER_EVENTHANDLER', 'XE_DISPATCHER_WAIT',
  'FT_IFTSHC_MUTEX', 'CHECKPOINT_QUEUE', 'FT_IFTS_SCHEDULER_IDLE_WAIT',
  'BROKER_TRANSMITTER', 'FT_IFTSHC_MUTEX', 'KSOURCE_WAKEUP',
  'LAZYWRITER_SLEEP', 'LOGMGR_QUEUE', 'ONDEMAND_TASK_QUEUE',
  'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BAD_PAGE_PROCESS',
  'DBMIRROR_EVENTS_QUEUE', 'BROKER_RECEIVE_WAITFOR',
  'PREEMPTIVE_OS_GETPROCADDRESS', 'PREEMPTIVE_OS_AUTHENTICATIONOPS', 'WAITFOR',
  'DISPATCHER_QUEUE_SEMAPHORE', 'XE_DISPATCHER_JOIN', 'RESOURCE_QUEUE' )
ORDER BY wait_time_ms DESC

With the following result. I'm not sure how to interpret them though.

result

Marcus
  • 135
  • 6

3 Answers3

5

I don't know fully details of your environment, i.e. if the tables in question are mostly used for writing or reading.

How often you do this delete?

what is the primary key and clustered index of [sko].[stage_närvaro]?

If I wanted to optimise this delete there are a few things I would consider:

1) an index on the underlying tables of the view ext.v_imp_närvaro with the columns used in the select (person_id, termin_fakta,[läsår_fakta]) you want an index seek there most likely (no need to include any columns because you are just going there for the EXISTS)

2) I have been using a lot filtered indexes and I would consider the following:

CREATE NONCLUSTERED INDEX IDXF_STAGE_NARVARO_FORSYSTEMS_EXTENS
ON [sko].[stage_närvaro] ([försystem])
INCLUDE ([person_id],[termin_fakta],[läsår_fakta])
WHERE [försystem] = 'Extens'

3) I can see some LCK_M_S going on there, not sure if related to this query in particular but nevertheless I try to always use ROWCOUNT and do big deletes and updates in batches, something like the following:

    USE DATABASENAME
    GO

    DECLARE @RC INT
    SELECT @RC = 0
    SET ROWCOUNT  5000

    WHILE (1 = 1)
      BEGIN
        BEGIN TRANSACTION

                        DELETE  sko.stage_närvaro
                        FROM    sko.stage_närvaro e
                        WHERE   försystem = 'Extens'
                        AND EXISTS (
                            SELECT  *
                            FROM    ext.v_imp_närvaro v
                            WHERE   e.person_id = v.person_id
                            AND     e.termin_fakta = v.termin_fakta
                            AND     e.läsår_fakta = v.läsår_fakta
                        )

                     SELECT @RC = @@ROWCOUNT

                     print CAST ( DB_NAME()  AS VARCHAR(500) ) + 
' -- ' + CAST ( @RC  AS VARCHAR(10) ) + ' -->  ' + 
CAST( GETDATE() AS VARCHAR(25))

                     WAITFOR DELAY '00:00:01';

        IF @RC = 0
          BEGIN
            COMMIT TRANSACTION

            BREAK
          END



        COMMIT TRANSACTION
      END

    SET ROWCOUNT  0




 --==============================================================
 --SET ROWCOUNT 10000 -- define maximum updated rows at once

-- DO THE UPDATE

-- don't forget about bellow 
-- after everything is updated
--SET ROWCOUNT 0

-- Setting ROWCOUNT to 0 turn off limits - don't forget about it.
--===============================================================

This may not be a comprehensive solution, because there are bits missing on the question too, however, it will surely give you some ideas as possible ways to improve big delete operations.

Marcello Miorelli
  • 17,274
  • 53
  • 180
  • 320
2

It seems that you created view ext.v_imp_närvaro on the base table sko.stage_närvaro, if it is right then it will be good choice to create a temp table with qualifier rows of where clause and then execute delete statement on table and use temp table for check existent of rows..

SELECT person_id,termin_fakta,läsår_fakta INTO #Temp
FROM  [ext].[v_imp_närvaro]

CREATE NONCLUSTERED INDEX ON #Temp (person_id,termin_fakta,läsår_fakta)

DELETE  sko.stage_närvaro
FROM    sko.stage_närvaro e
WHERE   försystem = 'Extens'
AND EXISTS (
    SELECT  *
    FROM    #Temp v
    WHERE   e.person_id = v.person_id
    AND     e.termin_fakta = v.termin_fakta
    AND     e.läsår_fakta = v.läsår_fakta
)
Marcus
  • 135
  • 6
Tim
  • 36
  • 1
0

Delete does not always use the same indexes as a select but optimize the select will typically help

try this

select count(*) 
FROM   sko.stage_närvaro e
JOIN   ext.v_imp_närvaro v
  ON   e.person_id    = v.person_id
 AND   e.termin_fakta = v.termin_fakta
 AND   e.läsår_fakta  = v.läsår_fakta
 AND   e.försystem = 'Extens'

an index on försystem should help

This index on the view should help
person_id, termin_fakta, läsår_fakta
Do you need to use the view?
Try going straight to the tables.
You may be doing stuff in the view that you don't really need for the delete.

Optimize the select then try it on the delete

But the index on [sko].[stage_närvaro] may help the select but hurt the delete. Index adds overhead to the delete.

paparazzo
  • 5,048
  • 1
  • 19
  • 32