I've been doing a lot of reading on Accelerated Database Recovery (ADR). However, I cannot find any metrics, or even anecdotal evidence, to indicate how much improvement in recovery time one could expect with ADR.

For example, we recently had to recover a database. Phases 1 and 2 completed in approximately 4 hours. Phase 3 completed in approximately 14 hours.

Recovery was necessary as the 150GB log file ran out of disk space due to a massive transaction.

This article states that ADR provides instantaneous transaction rollback. Does this mean that phase 3 would have been instant rather than taking 14 hours? Would phases 1 and 2 also be significantly faster?

Any guidance people could offer would be greatly appreciated.

Paul White

1 Answer

It is very much faster, to the point of appearing ✨ magical ✨.

The process and reasons are explained pretty well in the ADR documentation.

In your scenario, the log file wouldn't have run out of space in the first place (assuming you're not using transactional replication, snapshot replication, or change data capture) due to aggressive log truncation, which can clear the log despite any open and long-running transaction.
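
If you want to see what is holding up log truncation on a given database, something like the following should do. It is only a sketch: it assumes SQL Server 2019 or later (the `is_accelerated_database_recovery_on` column does not exist before that) and that you run it in the context of the database you care about.

```sql
-- What is currently preventing log reuse, and is ADR enabled?
SELECT
    d.name,
    d.recovery_model_desc,
    d.log_reuse_wait_desc,                  -- e.g. ACTIVE_TRANSACTION, REPLICATION
    d.is_accelerated_database_recovery_on
FROM sys.databases AS d
WHERE d.database_id = DB_ID();

-- How full is the transaction log of the current database?
SELECT
    total_log_size_in_bytes / 1048576.0 AS total_log_mb,
    used_log_space_in_bytes / 1048576.0 AS used_log_mb,
    used_log_space_in_percent
FROM sys.dm_db_log_space_usage;
```

Without ADR, a massive open transaction typically shows up as `ACTIVE_TRANSACTION` in `log_reuse_wait_desc` while the used log space keeps climbing.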

In any event, recovery is still designed to complete in constant time, for the following reasons:

The persistent version store (PVS) is in the user database instead of tempdb. As the name suggests, this version store survives an instance restart so the stored data can be used for transaction rollback and general recovery.

A tiny secondary log (sLog) holds information about locks and currently non-versioned data like changes to system tables and internal metadata caching events.

Phases 1 and 2 (analysis and redo) complete very quickly (normally measured in seconds) because they only have to scan and redo the log from the last checkpoint (or oldest dirty page LSN), instead of from the start of the oldest active transaction. This makes a huge difference because the amount of log to redo is always small.

Phase 3 (undo) is practically instant because rolled-back data is served from the PVS (logical revert) instead of being undone from the log. A background task cleans up the PVS as reverted data is moved back from the PVS to the main database (this has a small effect on overall performance until reversion is complete).

Test it

You should set up a test environment, start a long-running transaction, generate a significant amount of log, and see for yourself how much difference ADR makes.

You could either allow the test instance to run out of log naturally, or simply hard restart the environment whenever you feel like it (simulating a power outage or other catastrophic failure).
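
A minimal sketch of such a test follows. The database name `AdrTest`, the table `dbo.BigTable`, and the row count are all made up for illustration, so adjust them to generate as much log as you need. It assumes SQL Server 2019 or later, a disposable instance, and a client such as SSMS or sqlcmd that understands the `GO` batch separator.

```sql
-- Hypothetical throwaway database; never run this against anything you care about.
CREATE DATABASE AdrTest;
GO
ALTER DATABASE AdrTest SET ACCELERATED_DATABASE_RECOVERY = ON;
GO
USE AdrTest;
GO
CREATE TABLE dbo.BigTable
(
    id      bigint IDENTITY(1, 1) PRIMARY KEY,
    payload char(1000) NOT NULL   -- wide fixed-length column to generate plenty of log
);
GO
BEGIN TRANSACTION;

-- Roughly 1 GB of inserted data; raise the TOP value for a bigger test.
INSERT dbo.BigTable (payload)
SELECT TOP (1000000) 'x'
FROM sys.all_columns AS c1
CROSS JOIN sys.all_columns AS c2;

-- Option A: time the rollback.
DECLARE @start datetime2 = SYSDATETIME();
ROLLBACK TRANSACTION;
SELECT DATEDIFF(MILLISECOND, @start, SYSDATETIME()) AS rollback_ms;

-- Option B: remove the three statements above, leave the transaction open,
-- hard restart the instance, and time how long recovery of AdrTest takes.
```

For a baseline, drop the database and repeat the whole thing with `ACCELERATED_DATABASE_RECOVERY = OFF`, then compare the timings.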


ADR isn't actually magical though, so there are trade-offs. But if you ever need to recover the database quickly after a failure or roll back a long-running transaction instantly, you likely won't care about those. See the linked documentation for details.

Make sure the user database has sufficient space for the PVS, which can become quite large.
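
One way to keep an eye on PVS size is the `sys.dm_tran_persistent_version_store_stats` DMV. This is a sketch rather than a full monitoring solution, and it assumes you run it in the context of the ADR-enabled database:

```sql
-- Current size of the persistent version store for this database
SELECT
    DB_NAME(pvs.database_id)                   AS database_name,
    pvs.persistent_version_store_size_kb / 1024.0 AS pvs_size_mb,
    pvs.current_aborted_transaction_count      -- aborted transactions not yet cleaned up
FROM sys.dm_tran_persistent_version_store_stats AS pvs
WHERE pvs.database_id = DB_ID();
```

Running this periodically during the test above should give you a feel for how much space the PVS needs for your workload.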

Paul White