2

I load data into a data warehouse.

Presently I have a script that I use to drop foreign key constraints and indexes prior to loading the data, for convenience and speed. There is a big window in which I can do the load so I don't need to worry about users accessing the data during the load, but I don't want to impact unrelated data in other tables in the database.

I have done some research here and elsewhere to come up with this script but I wonder if there are some things that I might be overlooking which could either make performance sub-optimal or if I might be missing something important (I don't know ... calculated columns or something?) or maybe I'm doing things in the wrong order etc.

Any advice appreciated to make this robust and performant.

Disable Constraints and Indexes

Edit: I removed the WHILE loop which commenters helped me realise was redundant.

Declare @schema varchar(128) = 'dbo';
Declare @sql nvarchar(max) = N'';

-- 1. Indices
-- Select a list of indexes in the schema and generate statements to disable them.
Select @sql = @sql + 'ALTER INDEX ' + QuoteName(idx.name) + ' ON ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' DISABLE;' + CHAR(13)
  From sys.indexes As idx
  Join sys.objects As obj On idx.object_id = obj.object_id
  Where ((obj.type = 'U' And idx.type in (2,6)) -- Non-clustered index/columnstore on a table
    Or obj.type = 'V') -- All indexes on indexed views
    And obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
  Order By obj.name, idx.name;

Execute sp_executesql @sql;


-- 2. Foreign-key constraints
-- Build a list of foreign keys constraints in the schema and generate statements to disable the constraint checking.
Select @sql = @sql + 'ALTER TABLE ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' NOCHECK CONSTRAINT ' + QuoteName(fk.name) + ';' + CHAR(13)
  From sys.foreign_keys As fk
  Join sys.objects As obj On fk.parent_object_id = obj.object_id  
  Where obj.schema_id = (Select schema_id From sys.schemas Where name = @schema);

Execute sp_executesql @sql;

Enable constraints, rebuild indexes and update statistics

Declare @schema nvarchar(128) = 'dbo';
Declare @sql nvarchar(max) = N'';

-- 1. Indices
-- Build a list of tables in the schema and generate statements to enable the indices on them.
Select @sql = @sql + 'ALTER INDEX ' + QuoteName(idx.name) + ' ON ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' REBUILD' + iif(idx.type = 6, ' WITH (MAXDOP = 1);', ' WITH (FILLFACTOR = 100);') + CHAR(13)
  From sys.indexes idx
  Join sys.objects obj ON obj.object_id = idx.object_id
  Where ((obj.type = 'U' And idx.type in (2,6)) -- Non-clustered index on a table
    Or obj.type = 'V') -- All indexes on indexed views
    And obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
    And idx.is_disabled = 1 -- Don't rebuild indexes that are already online
    And idx.is_hypothetical = 0 -- Don't rebuild hypothetical indexes!
  Order By iif(idx.type = 6, 1, 2), obj.name, idx.name;

Execute sp_executesql @sql;


-- 2. Foreign-key constraints
-- Build a list of foreign keys constraints in the schema and generate statements to enable them with checking.
Select @sql = @sql + 'ALTER TABLE ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' WITH CHECK CHECK CONSTRAINT ' + QuoteName(fk.name) + ';' + CHAR(13)
  From sys.foreign_keys fk
  Join sys.objects obj ON obj.object_id = fk.parent_object_id
  Where obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
  Order By obj.name, fk.name;

Execute sp_executesql @sql;


-- 3. Statistics
-- Build a list of tables in the schema and generate statements to update the statistics on them.
Select @sql = @sql + 'UPDATE STATISTICS ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' WITH COLUMNS;' + CHAR(13)
  From sys.objects obj
  Where obj.type = 'U' -- User defined
    AND obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
  Order By obj.name;

Execute sp_executesql @sql;
mendosi
  • 2,097
  • 12
  • 22

4 Answers4

5

For a few things that you may have not considered:

  1. If an index starts as disabled you'll end up rebuilding it after your script finishes.
  2. If an index starts with page compression you'll rebuild it without compression.

For performance:

  1. Since no other process is accessing the table you could rebuild with the ONLINE = OFF option for possibly a small performance boost due to less locking.
  2. Depending on the size of tempdb and your indexes you could use the SORT_IN_TEMPDB = ON option for a performance boost.
  3. You're updating statistics on all tables, even if that table hasn't been modified or had very few rows changed. You could try to be more restrictive in your updates and only look at tables that were modified since the last stats updated or were modified at all.
  4. It looks like you're running all of these queries in a single session. Have you considered splitting up the work among multiple sessions?
Joe Obbish
  • 32,976
  • 4
  • 74
  • 153
4

Disabling Foreign keys before loading the data is a bit worrying in your case because when you are enabling them, your scrip does not have any error handing .. TRY .. CATCH blocks.

After the data is loaded, how are you checking referential integrity of your database ? Simply enabling foreign keys does not guarantee referential integrity of your database. You should atleast run DBCC CHECKCONSTRAINTS and make sure it runs clean. Refer : Foreign Keys and their States

Another point that I see is double work - What is the point in updating stats after rebuilding the indexes ?

Remember that rebuilding index will update stats on the columns associated with that index.

There is a big window in which I can do the load so I don't need to worry about users accessing the data during the load, but I don't want to impact unrelated data in other tables in the database.

I highly suggest you to read various techniques mentioned in The Data Loading Performance Guide and here in my answer.

Kin Shah
  • 62,545
  • 6
  • 124
  • 245
2

Watch out for hypothetical indexes. If they get left over by a tuning tool because it crashed and did not remove them, then your script will make them real and now you will have 30 or 40 indexes on a single table and performance on that table will be full of blocking and deadlocks.

SELECT dbid = DB_ID(),
       objectid = object_id,
       indid = index_id
FROM sys.indexes
WHERE is_hypothetical = 1 ;
Duane Lawrence
  • 563
  • 2
  • 11
1

Here is the final script that I have, after integrating the feedback from various respondents (just wanted to collect it all together for future readers):

Disable Constraints and Indexes

Declare @schema varchar(128) = 'dbo';
Declare @sql nvarchar(max) = '';

-- 1. Indices
-- Select a list of indexes in the schema and generate statements to disable them.
Select @sql = @sql + 'ALTER INDEX ' + QuoteName(idx.name) + ' ON ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' DISABLE;' + CHAR(13)
  From sys.indexes As idx
  Join sys.objects As obj On idx.object_id = obj.object_id
  Where ((obj.type = 'U' And idx.type in (2,6)) -- Non-clustered index/columnstore on a table
    Or obj.type = 'V') -- All indexes on indexed views
    And obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
  Order By obj.name, idx.name;

Execute sp_executesql @sql;


-- 2. Foreign-key constraints
-- Build a list of foreign keys constraints in the schema and generate statements to disable the constraint checking.
Select @sql = @sql + 'ALTER TABLE ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' NOCHECK CONSTRAINT ' + QuoteName(fk.name) + ';' + CHAR(13)
  From sys.foreign_keys As fk
  Join sys.objects As obj On fk.parent_object_id = obj.object_id  
  Where obj.schema_id = (Select schema_id From sys.schemas Where name = @schema);

Execute sp_executesql @sql;

** Rebuild Indexes, Enable Constraints and Update Statistics**

Declare @schema varchar(128) = 'dbo';
Declare @sql nvarchar(max) = '';

-- 1. Indices
-- Build a list of tables in the schema and generate statements to enable the indices on them.
Select @sql = @sql + 'ALTER INDEX ' + QuoteName(idx.name) + ' ON ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + 
    ' REBUILD WITH (' + iif(idx.type = 6, 
                            'MAXDOP = 1', 
                            'FILLFACTOR = 100, DATA_COMPRESSION = ' + IsNull(p.data_compression_desc Collate Latin1_General_CI_AS, 'None') + ', ONLINE = OFF, SORT_IN_TEMPDB = ON') 
                    + ');' + CHAR(13)
  From sys.indexes idx
  Join sys.objects obj ON obj.object_id = idx.object_id
  Left Join sys.partitions As p On idx.object_id = p.object_id And idx.index_id = p.index_id
  Where ((obj.type = 'U' And idx.type in (2,6)) -- Non-clustered index on a table
    Or obj.type = 'V') -- All indexes on indexed views
    And obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
    And idx.is_disabled = 1 -- Don't rebuild indexes that are already online
    And idx.is_hypothetical = 0 -- Don't rebuild hypothetical indexes!
  Order By iif(idx.type = 6, 1, 2), obj.name, idx.name;

Execute sp_executesql @sql;


-- 2. Foreign-key constraints
-- Build a list of foreign keys constraints in the schema and generate statements to enable them with checking.
Select @sql = @sql + 'ALTER TABLE ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' WITH CHECK CHECK CONSTRAINT ' + QuoteName(fk.name) + ';' + CHAR(13)
  From sys.foreign_keys fk
  Join sys.objects obj ON obj.object_id = fk.parent_object_id
  Where obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
  Order By obj.name, fk.name;

--DBCC CheckConstraints('fForeignKey_Table'); -- Does a soft check of the constraints and returns any values which violate them.
Execute sp_executesql @sql;


-- 3. Statistics
-- Build a list of tables in the schema and generate statements to update the statistics on them.
Select @sql = @sql + 'UPDATE STATISTICS ' + QuoteName(@schema) + '.' + QuoteName(obj.name) + ' WITH COLUMNS;' + CHAR(13)
  From sys.objects obj
  Where obj.type = 'U' -- User defined
    AND obj.schema_id = (Select schema_id From sys.schemas Where name = @schema)
  Order By obj.name;

Execute sp_executesql @sql;
mendosi
  • 2,097
  • 12
  • 22