
We have an unusual database structure (I leave the reasoning for another question), that in its current form requires a regular truncation and recreation of a lot of rows (about 1 million).

We're working to refactor it. However, as a short-term measure, are there any query hints that could help in this situation?

Is there a way to make reads against the table being truncated/repopulated ignore that fact, i.e. simply read the data as it was before the truncation began?

We're trying to avoid locks (we think).

I realise this isn't a long-term solution, but we're looking into possible stop-gaps.

Alex KeySmith

3 Answers


Any solution proposed, or any solution you could envision for the matter, would suffer from the same issue: a coalescing point around the schema modification lock.

Whether you do a TRUNCATE, an ALTER TABLE ... SWITCH, an sp_rename, or an ALTER SCHEMA ... TRANSFER, it doesn't matter: they're all the same solution in disguise. All of them have a point in time when the table being truncated/switched/renamed/transferred must be locked with Sch-M. In theory this is an 'instantaneous, metadata-only operation', but in practice the Sch-M lock cannot be granted until all other existing locks on the table are released, which means any query already running against the table has to finish first. Fine, the operation will wait until the existing queries 'drain'.

The real issue is that no other query can get in until the pending Sch-M is granted and released; this is a well-known mitigation for lock starvation. So all of a sudden all your new queries will freeze until the very last old query finishes, then the truncate/switch/transfer occurs, and the new queries finally resume. This is not noticeable if every query lasts one second, but if the queries are reports or analysis that last minutes on average, the effect is very visible, especially if you have a long tail.
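If you want to watch this convoy form while it happens, a diagnostic query along these lines will show the pending Sch-M request and the sessions queued behind it (a sketch; `dbo.whatever` is a stand-in for your table name):

```sql
-- Sketch: list lock requests against the target table. A request_status of
-- WAIT on a Sch-M request, with other WAITing requests behind it, is the
-- starvation-mitigation queue described above. Table name is an example.
SELECT  l.request_session_id,
        l.request_mode,        -- e.g. Sch-M, Sch-S, S, IS
        l.request_status,      -- GRANT / WAIT / CONVERT
        r.wait_type,
        r.wait_time            -- ms the request has been waiting
FROM    sys.dm_tran_locks AS l
LEFT JOIN sys.dm_exec_requests AS r
        ON r.session_id = l.request_session_id
WHERE   l.resource_type = 'OBJECT'
  AND   l.resource_associated_entity_id = OBJECT_ID(N'dbo.whatever')
ORDER BY l.request_status DESC, r.wait_time DESC;
```

Run it from a separate session while the truncate/switch is blocked and you should see one Sch-M row in WAIT status ahead of the frozen readers.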

The good news is that you can stop worrying about how to do what you're trying to do: it is impossible. The bad news is that you have to fix the application now.

Remus Rusanu

Going off of Martin's comment, you could have a second copy of the table that acts as a shadow. First create two schemas to facilitate round robin:

CREATE SCHEMA fake   AUTHORIZATION dbo;
CREATE SCHEMA shadow AUTHORIZATION dbo;

Now in the shadow schema create an identical table that you are currently truncating and re-populating:

CREATE TABLE shadow.whatever(cols);
-- add PK, indexes etc.

Then change your process so that, while users are querying the primary, it:

  • empties the copy
  • populates the empty copy
  • swaps the primary and copy

Sample code:

TRUNCATE TABLE shadow.whatever;

INSERT shadow.whatever([cols]) 
  SELECT [cols] FROM [source];

BEGIN TRANSACTION;

  ALTER SCHEMA fake TRANSFER     dbo.whatever;
  ALTER SCHEMA dbo  TRANSFER  shadow.whatever;

COMMIT TRANSACTION;

ALTER SCHEMA shadow TRANSFER fake.whatever;

This essentially swaps out the table under users' noses, but it is instantaneous: like ALTER TABLE ... SWITCH, it is just a metadata change, so the transaction will wait its turn and then complete in a matter of microseconds, causing no visible blocking whatsoever (as long as none of your users are running transactions that would block schema changes to whatever).

We've done similar things at my previous job, and I go into a lot of detail in this blog post:

http://www.sqlperformance.com/2012/08/t-sql-queries/t-sql-tuesday-schema-switch-a-roo


With Remus' observation in mind, and assuming that your reporting queries do take a long time and that you can enforce data access through stored procedures, you could augment this a little by keeping two copies of the data and alternating the current "active" table on a schedule that is a bit longer than the longest query typically takes. By the time you populate the backup copy, all of the queries against the current copy should be done. You mark in a table when the backup copy is complete, and the stored procedure checks that table to determine which copy to use.

This reduces (but does not completely eliminate) the possibility of contention around the Sch-M lock, at the cost of storing multiple copies of the data and a slightly more complex process. You would think stats and plans would go to hell, but this is already the case in your current approach and in the simpler approach I suggested above.
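A sketch of that last idea (all object names here are hypothetical, not from the question): a one-row control table records which copy is active, the load job flips it after repopulating the idle copy, and the stored procedure reads it to decide which table to query:

```sql
-- Hypothetical names throughout; a sketch of the "which copy is active" idea.
CREATE TABLE dbo.ActiveCopy (CopyNumber tinyint NOT NULL);  -- holds 1 or 2
INSERT dbo.ActiveCopy (CopyNumber) VALUES (1);
GO

CREATE PROCEDURE dbo.GetWhateverData
AS
BEGIN
    SET NOCOUNT ON;
    -- Route the read to whichever copy is currently marked active.
    IF (SELECT CopyNumber FROM dbo.ActiveCopy) = 1
        SELECT * FROM dbo.whatever1;
    ELSE
        SELECT * FROM dbo.whatever2;
END;
GO

-- The load job repopulates the *idle* copy, then flips the flag:
-- UPDATE dbo.ActiveCopy SET CopyNumber = 3 - CopyNumber;
```

Because the flip is a single-row UPDATE rather than a schema change, readers already running against the old copy just finish undisturbed.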

Aaron Bertrand

Are you using TRUNCATE or DELETE? If you're using DELETE, you could use the WITH (ROWLOCK) hint. We had to do that on an audit table of ours: we keep 3 months of data in there, but it's a 24/7 database, so without the ROWLOCK hint we had too much blocking.
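If you do go the DELETE route, combining the hint with small batches keeps blocking and log growth in check. A sketch (the table, column, and batch size are examples, and note ROWLOCK discourages rather than strictly prevents lock escalation):

```sql
-- Sketch: delete old rows in small batches with ROWLOCK, so each batch
-- holds row locks briefly instead of escalating to a long table lock.
-- Names and the 5000-row batch size are illustrative.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    DELETE TOP (5000) FROM dbo.AuditTable WITH (ROWLOCK)
    WHERE LoggedAt < DATEADD(MONTH, -3, GETDATE());

    SET @rows = @@ROWCOUNT;   -- loop until nothing left to delete
END;
```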

Thanks,

Simon