Can I rely on reading SQL Server Identity values in order?

Question

TL;DR: The question below boils down to: When inserting a row, is there a window of opportunity between the generation of a new Identity value and the locking of the corresponding row key in the clustered index, where an external observer could see a newer Identity value inserted by a concurrent transaction? (In SQL Server.)

Detailed version

I have a SQL Server table with an Identity column called CheckpointSequence, which is the key of the table's clustered index (which also has a number of additional nonclustered indexes). Rows are inserted into the table by several concurrent processes and threads (at isolation level READ COMMITTED, and without IDENTITY_INSERT). At the same time, there are processes periodically reading rows from the clustered index, ordered by that CheckpointSequence column (also at isolation level READ COMMITTED, with the READ COMMITTED SNAPSHOT option being turned off).

I currently rely on the fact that the reading processes can never "skip" a checkpoint. My question is: Can I rely on this property? And if not, what could I do to make it true?

Example: When rows with identity values 1, 2, 3, 4, and 5 are inserted, the reader must not see the row with value 5 prior to seeing the one with value 4. Tests show that the query, which contains an ORDER BY CheckpointSequence clause (and a WHERE CheckpointSequence > -1 clause), reliably blocks whenever row 4 is to be read, but not yet committed, even if row 5 has already been committed.

I believe that at least in theory, there may be a race condition here that might cause this assumption to break. Unfortunately, documentation on Identity doesn't say a lot about how Identity works in the context of multiple concurrent transactions, it only says "Each new value is generated based on the current seed & increment." and "Each new value for a particular transaction is different from other concurrent transactions on the table." (MSDN)

My reasoning is, it must work somehow like this:

A transaction is started (either explicitly or implicitly).
An identity value (X) is generated.
The corresponding row lock is taken on the clustered index based on the identity value (unless lock escalation kicks in, in which case the whole table is locked).
The row is inserted.
The transaction is committed (possibly quite a lot of time later), so the lock is removed again.

I think that between step 2 and 3, there is a very tiny window where

a concurrent session could generate the next identity value (X+1) and execute all the remaining steps,
thus allowing a reader coming exactly at that point of time to read the value X+1, missing the value of X.

Of course, the probability of this seems extremely low; but still - it could happen. Or could it?

(If you're interested in the context: This is the implementation of NEventStore's SQL Persistence Engine. NEventStore implements an append-only event store where every event gets a new, ascending checkpoint sequence number. Clients read events from the event store ordered by checkpoint in order to perform computations of all sorts. Once an event with checkpoint X has been processed, clients only consider "newer" events, i.e., events with checkpoint X+1 and above. Therefore, it is vital that events can never be skipped, as they'd never be considered again. I'm currently trying to determine if the Identity-based checkpoint implementation meets this requirement. These are the exact SQL statements used: Schema, Writer's query, Reader's Query.)

If I'm right and the situation described above could arise, I can see only two options of dealing with them, both of which are unsatisfactory:

When seeing a checkpoint sequence value X+1 before having seen X, dismiss X+1 and try again later. However, because Identity can of course produce gaps (e.g., when the transaction is rolled back), X might never come.
So, same approach, but accept the gap after n milliseconds. However, what value of n should I assume?

Any better ideas?

score 33 · Accepted Answer · answered Apr 13 '16 at 01:37

When inserting a row, is there a window of opportunity between the generation of a new Identity value and the locking of the corresponding row key in the clustered index, where an external observer could see a newer Identity value inserted by a concurrent transaction?

Yes.

The allocation of identity values is independent of the containing user transaction. This is one reason that identity values are consumed even if the transaction is rolled back. The increment operation itself is protected by a latch to prevent corruption, but that is the extent of the protections.

In the specific circumstances of your implementation, the identity allocation (a call to CMEDSeqGen::GenerateNewValue) is made before the user transaction for the insert is even made active (and so before any locks are taken).

By running two inserts concurrently with a debugger attached to allow me to freeze one thread just after the identity value is incremented and allocated, I was able to reproduce a scenario where:

Session 1 acquires an identity value (3)
Session 2 acquires an identity value (4)
Session 2 performs its insert and commits (so row 4 is fully visible)
Session 1 performs its insert and commits (row 3)

After step 3, a query using row_number under locking read committed returned the following:

In your implementation, this would result in Checkpoint ID 3 being skipped incorrectly.

The window of misopportunity is relatively small, but it exists. To give a more realistic scenario than having a debugger attached: An executing query thread can yield the scheduler after step 1 above. This allows a second thread to allocate an identity value, insert and commit, before the original thread resumes to perform its insert.

For clarity, there are no locks or other synchronization objects protecting the identity value after it is allocated and before it is used. For example, after step 1 above, a concurrent transaction can see the new identity value using T-SQL functions like IDENT_CURRENT before the row exists in the table (even uncommitted).

Fundamentally, there are no more guarantees around identity values than documented:

Each new value is generated based on the current seed & increment.
Each new value for a particular transaction is different from other concurrent transactions on the table.

That really is it.

If strict transactional FIFO processing is required, you likely have no choice but to serialize manually. If the application has less oneous requirements, you have more options. The question isn't 100% clear in that regard. Nevertheless, you may find some useful information in Remus Rusanu's article Using Tables as Queues.

score 8 · Answer 2 · answered Apr 13 '16 at 09:46

As Paul White answered absolutely correct there is a possibility for temporarily "skipped" identity rows. Here is just a small piece of code to reproduce this case for your own.

Create a database and a testtable:

create database IdentityTest
go
use IdentityTest
go
create table dbo.IdentityTest (ID int identity, c1 char(10))
create clustered index CI_dbo_IdentityTest_ID on dbo.IdentityTest(ID)

Perform concurrent inserts and selects on this table in a C# console program:

using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading;

namespace IdentityTest
{
    class Program
    {
        static void Main(string[] args)
        {
            var insertThreads = new List<Thread>();
            var selectThreads = new List<Thread>();

            //start threads for infinite inserts
            for (var i = 0; i < 100; i++)
            {
                insertThreads.Add(new Thread(InfiniteInsert));
                insertThreads[i].Start();
            }

            //start threads for infinite selects
            for (var i = 0; i < 10; i++)
            {
                selectThreads.Add(new Thread(InfiniteSelectAndCheck));
                selectThreads[i].Start();
            }
        }

        private static void InfiniteSelectAndCheck()
        {
            //infinite loop
            while (true)
            {
                //read top 2 IDs
                var cmd = new SqlCommand("select top(2) ID from dbo.IdentityTest order by ID desc")
                {
                    Connection = new SqlConnection("Server=localhost;Database=IdentityTest;Integrated Security=SSPI;Application Name=IdentityTest")
                };

                try
                {
                    cmd.Connection.Open();
                    var dr = cmd.ExecuteReader();

                    //read first row
                    dr.Read();
                    var row1 = int.Parse(dr["ID"].ToString());

                    //read second row
                    dr.Read();
                    var row2 = int.Parse(dr["ID"].ToString());

                    //write line if row1 and row are not consecutive
                    if (row1 - 1 != row2)
                    {
                        Console.WriteLine("row1=" + row1 + ", row2=" + row2);
                    }
                }
                finally
                {
                    cmd.Connection.Close();
                }
            }
        }

        private static void InfiniteInsert()
        {
            //infinite loop
            while (true)
            {
                var cmd = new SqlCommand("insert into dbo.IdentityTest (c1) values('a')")
                {
                    Connection = new SqlConnection("Server=localhost;Database=IdentityTest;Integrated Security=SSPI;Application Name=IdentityTest")
                };

                try
                {
                    cmd.Connection.Open();
                    cmd.ExecuteNonQuery();
                }
                finally
                {
                    cmd.Connection.Close();
                }
            }
        }
    }
}

This console prints a line for every case when one of the reading threads "misses" an entry.

score 5 · Answer 3 · edited Apr 13 '16 at 16:47

It is best to not expect the identities to be consecutive because there are many scenarios that can leave gaps. It is better to consider the identity like an abstract number and to not attach any business meaning to it.

Basically, gaps can happen if you roll back INSERT operations (or explicitly delete rows), and duplicates can occur if you set the table property IDENTITY_INSERT to ON.

Gaps can occur when:

Records are deleted.
An error has occurred when attempting to insert a new record (rolled back)
An update/insert with explicit value (identity_insert option).
Incremental value is more than 1.
A transaction rolls back.

The identity property on a column has never guaranteed:

• Uniqueness

• Consecutive values within a transaction. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.

• Consecutive values after server restart.

• Reuse of values.

If you cannot use identity values because of this, create a separate table holding a current value and manage access to the table and number assignment with your application. This does have the potential of impacting performance.

https://msdn.microsoft.com/en-us/library/ms186775(v=sql.105).aspx
https://msdn.microsoft.com/en-us/library/ms186775(v=sql.110).aspx

score 1 · Answer 4 · answered Mar 29 '16 at 06:06

I suspect that it occasionally can lead to trouble, troubles that become worse when the server is under heavy load. Consider two transaction:

T1: insert into T ... -- say 5 get inserted
T2: insert into T ... -- say 6 get inserted
T2: commit
Reader sees 6 but not 5
T1: commit

In the above scenario your LAST_READ_ID will be 6, so 5 will never be read.

score 0 · Answer 5 · answered Mar 29 '16 at 12:02

Running this script:

BEGIN TRAN;
INSERT INTO dbo.Example DEFAULT VALUES;
COMMIT;

Below are the locks I see acquired and released as captured by an Extended Event session:

name            timestamp                   associated_object_id    mode    object_id   resource_type   session_id  resource_description
lock_acquired   2016-03-29 06:37:28.9968693 1585440722              IX      1585440722  OBJECT          51          
lock_acquired   2016-03-29 06:37:28.9969268 7205759890195415040     IX      0           PAGE            51          1:1235
lock_acquired   2016-03-29 06:37:28.9969306 7205759890195415040     RI_NL   0           KEY             51          (ffffffffffff)
lock_acquired   2016-03-29 06:37:28.9969330 7205759890195415040     X       0           KEY             51          (29cf3326f583)
lock_released   2016-03-29 06:37:28.9969579 7205759890195415040     X       0           KEY             51          (29cf3326f583)
lock_released   2016-03-29 06:37:28.9969598 7205759890195415040     IX      0           PAGE            51          1:1235
lock_released   2016-03-29 06:37:28.9969607 1585440722              IX      1585440722  OBJECT          51

Note the RI_N KEY lock acquired immediately before the X key lock for the the new row being created. This short-lived range lock will prevent a concurrent insert from acquiring another RI_N KEY lock since RI_N locks are incompatible. The window you mentioned between steps 2 and 3 is not a concern because the range lock is acquired before the row lock on the newly generated key.

As long as your SELECT...ORDER BY begins the scan before the desired newly-inserted rows, I would expect the behavior you desire in the default READ COMMITTED isolation level as long as the database READ_COMMITTED_SNAPSHOT option is turned off.

John K. N. · Answer 6 · 2016-04-04T06:28:48.913

From my understanding of SQL Server the default behaviour is for the second query to not display any results until the first query has been committed. If the first query does a ROLLBACK instead of a COMMIT, then you will have a missing ID in your column.

Basic Configuration

Database Table

I created a database table with the following structure:

CREATE TABLE identity_rc_test (
    ID4VALUE INT IDENTITY (1,1), 
    TEXTVALUE NVARCHAR(20),
    CONSTRAINT PK_ID4_VALUE_CLUSTERED 
        PRIMARY KEY CLUSTERED (ID4VALUE, TEXTVALUE)
)

Database Isolation Level

I checked the isolation level of my database with the following statement:

SELECT snapshot_isolation_state, 
       snapshot_isolation_state_desc, 
       is_read_committed_snapshot_on
FROM sys.databases WHERE NAME = 'mydatabase'

Which returned the following result for my database:

snapshot_isolation_state    snapshot_isolation_state_desc   is_read_committed_snapshot_on
0                           OFF                             0

(This is the default setting for a database in SQL Server 2012)

Test Scripts

The following scripts were executed using the standard SQL Server SSMS client settings and the standard SQL Server settings.

Client connections settings

The client has been set to use the Transaction Isolation Level READ COMMITTED as per the Query Options in SSMS.

Query 1

The following query was executed in a Query window with the SPID 57

SELECT * FROM dbo.identity_rc_test
BEGIN TRANSACTION [FIRST_QUERY]
INSERT INTO dbo.identity_rc_test (TEXTVALUE) VALUES ('Nine')
/* Commit is commented out to prevent the INSERT from being commited
--COMMIT TRANSACTION [FIRST_QUERY]
--ROLLBACK TRANSACTION [FIRST_QUERY]
*/

Query 2

The following query was executed in a Query window with the SPID 58

BEGIN TRANSACTION [SECOND_QUERY]
INSERT INTO dbo.identity_rc_test (TEXTVALUE) VALUES ('Ten')
COMMIT TRANSACTION [SECOND_QUERY]
SELECT * FROM dbo.identity_rc_test

The query doesn't complete and is waiting for the eXclusive lock to be release on a PAGE.

Script to determine locking

This script displays the locking occurring on the database objects for the two transactions:

SELECT request_session_id, resource_type,
       resource_description, 
       resource_associated_entity_id,
       request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id IN (57, 58)

And here are the results:

58  DATABASE                    0                   S   GRANT
57  DATABASE                    0                   S   GRANT
58  PAGE            1:79        72057594040549300   IS  GRANT
57  PAGE            1:79        72057594040549300   IX  GRANT
57  KEY         (a0aba7857f1b)  72057594040549300   X   GRANT
58  KEY         (a0aba7857f1b)  72057594040549300   S   WAIT
58  OBJECT                      245575913           IS  GRANT
57  OBJECT                      245575913           IX  GRANT

The results show that query window one (SPID 57) has a Shared lock (S) on the DATABASE an Intended eXlusive (IX) lock on the OBJECT, an Intended eXlusive (IX) lock on the PAGE it wants to insert to and an eXclusive lock (X) on the KEY it is has inserted, but not yet committed.

Because of the uncommitted data, the second query (SPID 58) has a Shared lock (S) on the DATABASE level, an Intended Shared (IS) lock on the OBJECT, an Intended Shared (IS) lock on the page a Shared (S) lock on the KEY with a request status WAIT.

Summary

The query in the first query window executes without committing. Because the second query can only READ COMMITTED data it waits either until the timeout occurs or until the transaction has been committed in the first query.

This is from my understanding the default behaviour of Microsoft SQL Server.

You should observe that the ID is indeed in sequence for subsequent reads by SELECT statements if the first statement COMMITs.

If the first statement does a ROLLBACK then you will find a missing ID in the sequence, but still with the ID in ascending order (provided you created the INDEX with the default or ASC option on the ID column).

Update:

(Bluntly) Yes, you can rely on the identity column functioning correctly, until you encounter an issue. There is only one HOTFIX regarding SQL Server 2000 and the identity column on Microsoft's website.

If you couldn't rely on the identity column updating correctly, I think there would be more hotfixes or patches on Microsoft's website.

If you have a Microsoft Support Contract you could always open up an Advisory Case and ask for additional information.