Understanding why ghost version records in SQL server 2019 exists across application server restarts?

Question

Please bear with me as I'm not a DBA nor am I a SQL server expert.

We added an index to a table and noticed after about a day or so, the storage for the index grew to around 200 MB.
The reason for this, we surmised, is due to the high number of ghost records.
However when we do looked into it, we got the following results for the tables in the db:

TableName	record_count	ghost_record_count	version_ghost_record_count
Table1	19,936,533,500	5000	473,444,900
Table2	902,990,900	100	9,143,000
Table3	5,502.563,178	800	120,117,300
Table4	1,548,468,000	500	29,863,400

Seeing that the version ghost records is so big, we figured restarting the application (not the database) server might clear the records. This did not seem to be the case however. To be honest I am stumped as to what could be the reason for the persistence of the version ghost records. I'd appreciate any insight into how to troubleshoot/manage this. Or even some additional knowledge or past experience about this topic since I've found very little on the interwebs. Thank you.

Paul White · Accepted Answer · 2025-03-17T19:49:55.507

Version ghost records in particular are 'soft deleted' rows kept around because some currently executing statement or a long-running open transaction might need that version of the row to produce the correct results.

If you are lucky, this means that someone has left a transaction open for a long time that is (a) using a row versioning transaction isolation level like Read Committed Snapshot Isolation (RCSI) or Snapshot Isolation (SI); and (b) that transaction has accessed user data in the database.

A quick way to see if there are any open transactions is to run DBCC OPENTRAN;. Resolving that transaction will allow the ghost cleanup process to start to catch up.

SQL Server 2022 has an improvement for ghost cleanup so RCSI (statement-level snapshot) transactions can be cleaned up when the statement completes (not waiting for the transaction to end). Since you're not running SQL Server 2022, you're stuck with the old behaviour.

If you are unlucky, you have a Database Availability Group (AG) and there is either some problem with a replica, or there is a long running transaction on one of the replicas.

Queries on replicas always run under Snapshot Isolation, regardless of the isolation level requested. A long-running transaction on a replica might need old row versions to deliver correct results. Since the primary is an exact copy of the replicas, it cannot clean up old ghost versions until every replica has indicated no current transaction could possibly need that data.

You can see if a replica is holding up version ghost cleanup by querying the low_water_mark_for_ghosts column of sys.dm_hadr_database_replica_states. If that column isn't changing over the course of 30s or so, you almost certainly have a long-running replica transaction or some other problem with the replica(s) that is preventing ghost cleanup from making progress.

In the happy scenario, you will find a long-running transaction on a replica and resolve it as for the non-AG case. Otherwise, you have some sort of AG problem that will need expert attention.

There are other scenarios. You might simply have a very large number of databases on the one instance, or a tremendous rate of ghost production due to heavy delete activity (or updates performed by the engine as a delete-insert pair for various reasons, including replication and index key updates).

In other words, the rate of ghost production exceeds the capabilities of the cleanup process. There is not enough information in the question to say whether this applies or not. If other tables and databases on the same instance have deletes but no ghost problem, capacity probably isn't the issue.

score 7 · Answer 2 · answered Mar 17 '25 at 09:56

I think you need to start by reading up on ghost records & why they exist in the database, this may give you some insights into why with your application the numbers are so high, in the long run it seems likely to me you will need to make some architectural changes.

https://learn.microsoft.com/en-us/sql/relational-databases/ghost-record-cleanup-process-guide?view=sql-server-ver15

That said as ghost records are database constructs I'm not sure why you would think restarting the application server would make a difference, as you have found out it doesn't.

There is however a system stored procedure which can accomplish this task. You should read the documentation carefully as it is far from cost free.

https://learn.microsoft.com/en-us/sql/relational-databases/system-stored-procedures/sp-clean-db-file-free-space-transact-sql?view=sql-server-ver15

Understanding why ghost version records in SQL server 2019 exists across application server restarts?

2 Answers2