I've been reading a series of posts by Paul White about SQL Server Isolation Levels and came across a phrase:
To emphasise the point, pseudo-constraints written in T-SQL have to perform correctly no matter what concurrent modifications might be occurring. An application developer might protect a sensitive operation like that with a lock statement. The closest thing T-SQL programmers have to that facility for at-risk stored procedure and trigger code is the comparatively rarely-used
sp_getapplocksystem stored procedure. That is not to say it is the only, or even preferred option, just that it exists and can be the right choice in some circumstances.
I am using sp_getapplock and this made me wonder if I'm using it correctly, or there is a better way to get the desired effect.
I have a C++ application that processes so called "Building servers" in a loop 24/7. There is a table with the list of these Building Servers (about 200 rows). New rows can be added at any time, but it doesn't happen often. Rows are never deleted, but they can be marked as inactive. Processing a server may take from few seconds to dozens of minutes, each server is different, some are "small", some are "large". Once a server is processed, application has to wait at least 20 minutes before processing it again (the servers should not be polled too often). Application starts 10 threads that perform the processing in parallel, but I must guarantee that no two threads attempt to process the same server at the same time. Two different servers can and should be processed simultaneously, but each server can be processed not more often than once in 20 minutes.
Here is the definition of a table:
CREATE TABLE [dbo].[PortalBuildingServers](
[InternalIP] [varchar](64) NOT NULL,
[LastCheckStarted] [datetime] NOT NULL,
[LastCheckCompleted] [datetime] NOT NULL,
[IsActiveAndNotDisabled] [bit] NOT NULL,
[MaxBSMonitoringEventLogItemID] [bigint] NOT NULL,
CONSTRAINT [PK_PortalBuildingServers] PRIMARY KEY CLUSTERED
(
[InternalIP] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
CREATE NONCLUSTERED INDEX [IX_LastCheckCompleted] ON [dbo].[PortalBuildingServers]
(
[LastCheckCompleted] ASC
)
INCLUDE
(
[LastCheckStarted],
[IsActiveAndNotDisabled],
[MaxBSMonitoringEventLogItemID]
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
The main loop of a worker thread in an application looks like this:
for(;;)
{
// Choose building server for checking
std::vector<SBuildingServer> vecBS = GetNextBSToCheck();
if (vecBS.size() == 1)
{
// do the check and don't go to sleep afterwards
SBuildingServer & bs = vecBS[0];
DoCheck(bs);
SetCheckComplete(bs);
}
else
{
// Sleep for a while
...
}
}
Two functions here GetNextBSToCheck and SetCheckComplete are calling corresponding stored procedures.
GetNextBSToCheck returns 0 or 1 row with details of the server that should be processed next. It is a server that hasn't been processed for a longest time. If this "oldest" server has been processed less than 20 minutes ago, no rows are returned and thread will wait for a minute.
SetCheckComplete sets the time when processing was completed, thus making it possible to choose this server for processing again after 20 minutes.
Finally, the code of stored procedures:
GetNextToCheck:
CREATE PROCEDURE [dbo].[GetNextToCheck]
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRANSACTION;
BEGIN TRY
DECLARE @VarInternalIP varchar(64) = NULL;
DECLARE @VarMaxBSMonitoringEventLogItemID bigint = NULL;
DECLARE @VarLockResult int;
EXEC @VarLockResult = sp_getapplock
@Resource = 'PortalBSChecking_app_lock',
@LockMode = 'Exclusive',
@LockOwner = 'Transaction',
@LockTimeout = 60000,
@DbPrincipal = 'public';
IF @VarLockResult >= 0
BEGIN
-- Acquired the lock
-- Find BS that wasn't checked for the longest period
SELECT TOP 1
@VarInternalIP = InternalIP
,@VarMaxBSMonitoringEventLogItemID = MaxBSMonitoringEventLogItemID
FROM
dbo.PortalBuildingServers
WHERE
LastCheckStarted <= LastCheckCompleted
-- this BS is not being checked right now
AND LastCheckCompleted < DATEADD(minute, -20, GETDATE())
-- last check was done more than 20 minutes ago
AND IsActiveAndNotDisabled = 1
ORDER BY LastCheckCompleted
;
-- Start checking the found BS
UPDATE dbo.PortalBuildingServers
SET LastCheckStarted = GETDATE()
WHERE InternalIP = @VarInternalIP;
-- There is no need to explicitly verify if we found anything.
-- If @VarInternalIP is null, no rows will be updated
END;
-- Return found BS,
-- or no rows if nothing was found, or failed to acquire the lock
SELECT
@VarInternalIP AS InternalIP
,@VarMaxBSMonitoringEventLogItemID AS MaxBSMonitoringEventLogItemID
WHERE
@VarInternalIP IS NOT NULL
AND @VarMaxBSMonitoringEventLogItemID IS NOT NULL
;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
END CATCH;
END
SetCheckComplete:
CREATE PROCEDURE [dbo].[SetCheckComplete]
@ParamInternalIP varchar(64)
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRANSACTION;
BEGIN TRY
DECLARE @VarLockResult int;
EXEC @VarLockResult = sp_getapplock
@Resource = 'PortalBSChecking_app_lock',
@LockMode = 'Exclusive',
@LockOwner = 'Transaction',
@LockTimeout = 60000,
@DbPrincipal = 'public';
IF @VarLockResult >= 0
BEGIN
-- Acquired the lock
-- Completed checking the given BS
UPDATE dbo.PortalBuildingServers
SET LastCheckCompleted = GETDATE()
WHERE InternalIP = @ParamInternalIP;
END;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
END CATCH;
END
As you can see, I use sp_getapplock to guarantee that only one instance of both of these stored procedures are running at any given time. I think I need to use sp_getapplock in both procedures, because the query that chooses the "oldest" server uses the LastCheckCompleted time, which is updated by SetCheckComplete.
I think that this code does guarantee that no two threads attempt to process the same server at the same time, but I'd be grateful if you could point to any issues with this code and the overall approach. So, the first question: Is this approach correct?
Also, I'd like to know if the same effect could be achieved without using sp_getapplock. The second question: Is there a better way?