2

While setting up a new replica in an Availability Group on our reporting cluster a few of the databases were not seeded correctly. I tried manually seeding them using a recent backup and applying the latest log file backups, which worked for most of the databases but a few of them are stuck in Initializing / In Recovery on startup in SSMS. Even restarting the SQL Server service didn't help. Here are the steps I have tried and the resulting errors:

--Get a list of database not Online and not Restoring
select name,state,state_desc from sys.databases where state not in (0,1)

name                        state   state_desc
StackExchange.Aviation.Meta 2       RECOVERING

drop database [StackExchange.Aviation.Meta]
Msg 3702, Level 16, State 4, Line 21
Cannot drop database "StackExchange.Aviation.Meta" because it is currently in use.

RESTORE LOG [StackExchange.Aviation.Meta] FROM DISK = '\\Backups\SQL\_Trans\StackExchange.Aviation.Meta_LOG_20170901_043251.trn' WITH NORECOVERY;
Msg 1222, Level 16, State 24, Line 11
Lock request time out period exceeded.
Msg 3013, Level 16, State 1, Line 11
RESTORE LOG is terminating abnormally.

RESTORE DATABASE [StackExchange.Aviation.Meta] FROM DISK = '\\Backups\SQL\StackExchange.Aviation.Meta_FULL_COPY_ONLY_20170901_040130.bak' WITH NORECOVERY;
Msg 1222, Level 16, State 24, Line 12
Lock request time out period exceeded.
Msg 3013, Level 16, State 1, Line 12
RESTORE DATABASE is terminating abnormally.

RESTORE DATABASE [StackExchange.Aviation.Meta] with recovery
Msg 1222, Level 16, State 24, Line 11
Lock request time out period exceeded.
Msg 3013, Level 16, State 1, Line 11
RESTORE DATABASE is terminating abnormally.

ALTER DATABASE [StackExchange.Aviation.Meta] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
Msg 5061, Level 16, State 1, Line 5
ALTER DATABASE failed because a lock could not be placed on database 'StackExchange.Aviation.Meta'. Try again later.
Msg 5069, Level 16, State 1, Line 5
ALTER DATABASE statement failed.

Even stopping and restarting the sql service did not help, as it comes back in the same stuck state. What else can I do to re-seed this database?

Greg Bray
  • 1,053
  • 3
  • 13
  • 20

4 Answers4

2

Well, when I saw the error messages I figured the database was not added to the Availability group, but after checking the Availability Databases in SSMS on the new replica it was there with a pause symbol next to it. This means it was added but the data movement has stopped. I tried resuming data movement, but it was too far behind. So instead I had to remove it from the availability group and manually apply log files so it could start replicating again.

-- Remove database from Availability Group:    
Alter Database [StackExchange.Aviation.Meta] SET HADR OFF;

-- Apply t-logs to catch up. This can be done manually in SSMS or via:
RESTORE LOG [StackExchange.Aviation.Meta] FROM DISK = '\\Backups\SQL\_Trans\StackExchange.Aviation.Meta_LOG_20170901_043251.trn' WITH NORECOVERY;

-- Re-join database to availability group
ALTER DATABASE [StackExchange.Aviation.Meta] SET HADR AVAILABILITY GROUP = [SENetwork_RAG];
ALTER DATABASE [StackExchange.Aviation.Meta] SET HADR RESUME;

I should have noticed it was added, but the AG has over 300 DBs and was still in the process of seeding so it was missed among the noise. The "Lock request time out" error message also made me think that something had gone wrong in the seeding, but it was just a poorly worded message indicating the database was already part of the Availability Group.

Greg Bray
  • 1,053
  • 3
  • 13
  • 20
1

I had same "problem" on Windows server 2019, SQL server enterprice 2019.

These are the steps I did in order to add large database in AG:

  • Made full backup of DB and restore it on secondary server;
  • Tomorow I took DIFF backup and restore it with no recovery over existing DB on secondary;
  • Stop log backup on primary server and restored all logs over existing secondary DB.
  • Over add database to Avaibility group gui, starting add database to Avability Group with JOIN ONLY option;
  • DB added to ag !

After I take a look on AG dashboard, all looks fine, but secondary database was in Initializing / In Recovery status.

I did nothing but waiting to see what will happen after some time. All went good and secondary database was synchronized well after 10 minutes!

0

Obviously I've missed the boat here, but in case anyone else runs into this issue, I think the database is still recovery.
I had one move to "Initializing" status, with nothing being written to the SQL error log after that point (last entry was like "Recovery of database is 36% complete..."). I suspect the database was still restoring, but can't confirm since I switched the synchronous replica to async mode and restarted the host at the point, and after a 30 min recovery, the database in the AG started synchronizing.

rpc
  • 1
0

ISSUE On the DR server a database was showing "initializing\in recovery" for an extended period of time. When looking at the blocking, you could see this process SPID "hadr_recovery_wait_for_undo" You can't stop this SPID, and so you can't add the database to the AG, resume data movement, drop or restore the database as it is in use. Rebooting the secondary replica had no effect.

RESOLUTION Disable log backups on primary. Perform a full\log backup on primary.

Reboot the secondary and quickly drop the database when the servers comes back online. Restore the full\log backup on secondary. Add database to HADR on secondary.

Enable log backups on primary.