5

This classic concurrency safety article is clearly designed for only upserting one row at a time. In my situation, I have a table-valued input and I want to upsert each row in a concurrency safe way. I know that this isn't always possible, but I want to get as close as possible. MERGE seems like a natural solution to it, but I distrust it and truly am in a situation where it is bug prone. The remaining two approaches in Michael J. Swart's article are:

  1. Inside a Transaction With Lock Hints (Update More Common)
CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
AS 
SET XACT_ABORT ON;
BEGIN TRAN

UPDATE TOP (1) dbo.AccountDetails WITH (UPDLOCK, SERIALIZABLE) SET Etc = @Etc WHERE Email = @Email;

IF (@@ROWCOUNT = 0) BEGIN
INSERT dbo.AccountDetails ( Email, Etc ) VALUES ( @Email, @Etc ); END COMMIT

  1. Inside a Transaction With Lock Hints (Insert More Common)
CREATE PROCEDURE s_AccountDetails_Upsert ( @Email nvarchar(4000), @Etc nvarchar(max) )
AS 
SET XACT_ABORT ON;
BEGIN TRAN

INSERT dbo.AccountDetails ( Email, Etc ) SELECT @Email, @Etc WHERE NOT EXISTS ( SELECT * FROM dbo.AccountDetails WITH (UPDLOCK, SERIALIZABLE) WHERE Email = @Email )

IF (@@ROWCOUNT = 0) BEGIN
UPDATE TOP (1) dbo.AccountDetails SET Etc = @Etc WHERE Email = @Email; END COMMIT

I could adapt either of these to use table variables (e.g. I suspect that IF (@@ROWCOUNT = 0) needs totally removing), but does the usage of a table-valued input make it obvious that we should prefer either the first or second solution? If not, then on what basis should the decision be made?

J. Mini
  • 1,161
  • 8
  • 32

2 Answers2

6

Do the update first, and then do an insert with a where not exists clause. You can’t test @@rowcount because some rows may have been updated.

If you do the insert first you’ll then update the rows you’ve just inserted too.

Rob Farley
  • 16,324
  • 2
  • 39
  • 61
2

Problems with MERGE usually manifest with combinations of features (especially relatively new ones) that result in highly complex data changing plans. This is a shame because MERGE is undoubtedly a convenient way to express things like an 'upsert'.

Workaround

You could consider the following:

  1. Create a view over the target table.
  2. Create INSTEAD OF trigger(s) for view INSERT and UPDATE actions.
  3. Write your 'upsert' as a MERGE against the view.

Advantages:

  1. The MERGE sees a simple target, so the data changing plan is not complex.
  2. SQL Server populates separate internal worktables for inserted and deleted*.
  3. The trigger code processes simple inserts and updates separately.
  4. You no longer need an explicit transaction around separate insert and update statements.

Disadvantages:

  1. The triggers add some overhead.
  2. You have to remember to target 'upserts' to the view.
  3. The OUTPUT clause won't work for inserted references.

This workaround allows you to write a convenient MERGE, but have the resulting actions performed as separate simple inserts and updates.

The usual precautions for concurrency on the MERGE are still needed.

I have used a view assuming you prefer not to have INSTEAD OF triggers on the base table.

* INSTEAD OF triggers do not use row versioning. See my article, Interesting things about INSTEAD OF triggers.

Demo

There are improvements to be made (like possibly combining the triggers). The following code prioritises clarity:

db<>fiddle

Table and sample rows

CREATE TABLE dbo.AccountDetails
(
    Email nvarchar(400) NOT NULL 
        CONSTRAINT PK_AccountDetails 
        PRIMARY KEY CLUSTERED,
    Created datetime NOT NULL 
        DEFAULT GETUTCDATE(),
    Etc nvarchar(max) NULL
);

INSERT dbo.AccountDetails (Email, Etc) VALUES (N'2792@somewhere.com', N'Original 2792'), (N'3129@somewhere.com', N'Original 3129'), (N'4726@somewhere.com', N'Original 4726'), (N'5766@somewhere.com', N'Original 5766');

View

CREATE VIEW 
    dbo.AccountDetails_Upsert
WITH SCHEMABINDING
AS
SELECT 
    Email, 
    Etc 
FROM dbo.AccountDetails;

INSTEAD OF INSERT view trigger

CREATE TRIGGER 
    AccountDetails_Upsert_InsteadOfInsert
ON dbo.AccountDetails_Upsert
INSTEAD OF INSERT 
AS
IF  ROWCOUNT_BIG() = 0 RETURN;

SET NOCOUNT, XACT_ABORT ON; SET ROWCOUNT 0;

IF NOT EXISTS (SELECT * FROM Inserted) RETURN;

-- Simple insert INSERT dbo.AccountDetails (Email, Etc) SELECT Email, Etc FROM Inserted;

INSTEAD OF UPDATE view trigger

CREATE TRIGGER 
    AccountDetails_Upsert_InsteadOfUpdate
ON dbo.AccountDetails_Upsert
INSTEAD OF UPDATE 
AS
IF  ROWCOUNT_BIG() = 0 RETURN;

SET NOCOUNT, XACT_ABORT ON; SET ROWCOUNT 0;

IF NOT EXISTS (SELECT * FROM Inserted) RETURN;

-- Simple update UPDATE AD SET Etc = I.Etc FROM Inserted AS I JOIN dbo.AccountDetails AS AD ON AD.Email = I.Email;

Table-valued MERGE

DECLARE @Changes AS table
(
    Email nvarchar(400) NOT NULL PRIMARY KEY,
    Etc nvarchar(max) NULL
);

INSERT @Changes (Email, Etc) VALUES -- Update some rows (N'2792@somewhere.com', N'Updated 2792'), (N'3129@somewhere.com', N'Updated 3129'), (N'4726@somewhere.com', N'Updated 4726'), (N'5766@somewhere.com', N'Updated 5766'), -- Insert some rows (N'a@example.com', N'Inserted a'), (N'b@example.com', N'Inserted b');

MERGE dbo.AccountDetails_Upsert WITH ( UPDLOCK, SERIALIZABLE, FORCESEEK ) AS V USING @Changes AS C ON C.Email = V.Email WHEN MATCHED THEN UPDATE SET Etc = C.Etc WHEN NOT MATCHED BY TARGET THEN INSERT (Email, Etc) VALUES (C.Email, C.Etc);

Results

Email Created Etc
2792@somewhere.com 2025-04-13 09:48:39.687 Updated 2792
3129@somewhere.com 2025-04-13 09:48:39.687 Updated 3129
4726@somewhere.com 2025-04-13 09:48:39.687 Updated 4726
5766@somewhere.com 2025-04-13 09:48:39.687 Updated 5766
a@example.com 2025-04-13 09:48:39.960 Inserted a
b@example.com 2025-04-13 09:48:39.960 Inserted b

MERGE plan

Merge plan

Tidy up

DROP VIEW IF EXISTS 
    dbo.AccountDetails_Upsert;

DROP TABLE IF EXISTS dbo.AccountDetails;

Paul White
  • 94,921
  • 30
  • 437
  • 687