
I have just built a SQL Server 2022 Always On availability group with three nodes at HQ and one node at DR.

During our testing phase I noticed that every time I rebooted the primary without first failing over or moving the cluster resources off it, I got errors 1069 and 1205 in the cluster. Cluster validation completed without errors, but with some warnings because it is a multi-subnet cluster, and there was nothing in the cluster logs or the SQL Server error logs.

Before I contact Microsoft support I wanted to check if the errors were intended behavior of the cluster. The availability group always automatically failed over every time I rebooted the primary to simulate a failure.

Error 1069:

Cluster resource 'ABC_LIVEDB' of type 'SQL Server Availability Group' in clustered role 'ABC_LIVEDB' failed.

Error 1205:

The Cluster service failed to bring clustered role 'ABC_LIVEDB' completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered role.

I greatly appreciate the guidance.

Sean Gallardy
SQL_NoExpert

1 Answer


"Before I contact Microsoft support I wanted to check if the errors were intended behavior of the cluster."

This depends on your definition of intended behavior. Is it a behavior that can occur? Yes. Is it expected? That depends on the situation, though typically it would not be. The node should drain on a graceful shutdown. Assuming there is a synchronous-commit copy of the database on another instance, both instances are set for automatic failover, and the cluster continues to have quorum, the role should transition properly.
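To make those three conditions concrete, here is a rough sketch in Python that models them as a simple check. It is only an illustration of the reasoning, not how the cluster actually makes its decision. The node names are made up, the `Replica` class and both functions are hypothetical, and the mode strings are assumed to mirror the `availability_mode_desc` and `failover_mode_desc` values you would see in `sys.availability_replicas`. Quorum is passed in as a plain flag rather than computed.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str               # hypothetical node name, not from the question
    availability_mode: str  # "SYNCHRONOUS_COMMIT" or "ASYNCHRONOUS_COMMIT"
    failover_mode: str      # "AUTOMATIC" or "MANUAL"


def is_auto_failover_partner(replica):
    """True if the replica is synchronous-commit and set for automatic failover."""
    return (
        replica.availability_mode == "SYNCHRONOUS_COMMIT"
        and replica.failover_mode == "AUTOMATIC"
    )


def automatic_failover_targets(primary, secondaries, cluster_has_quorum):
    """Return the names of secondaries the role could move to automatically.

    Simplified model of the three conditions: a sync-commit copy elsewhere,
    both sides set for automatic failover, and the cluster still has quorum.
    """
    if not cluster_has_quorum or not is_auto_failover_partner(primary):
        return []
    return [s.name for s in secondaries if is_auto_failover_partner(s)]


# Assumed layout: three nodes at HQ, one at DR (settings are invented).
primary = Replica("HQ-NODE1", "SYNCHRONOUS_COMMIT", "AUTOMATIC")
secondaries = [
    Replica("HQ-NODE2", "SYNCHRONOUS_COMMIT", "AUTOMATIC"),
    Replica("HQ-NODE3", "SYNCHRONOUS_COMMIT", "MANUAL"),
    Replica("DR-NODE1", "ASYNCHRONOUS_COMMIT", "MANUAL"),
]

print(automatic_failover_targets(primary, secondaries, cluster_has_quorum=True))
```

With these assumed settings only `HQ-NODE2` qualifies. If the list comes back empty for your real configuration, a failed role on reboot of the primary would be less surprising.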

To figure out whether it should be expected, you'll need to investigate the SQL Server error log and the Windows Server Failover Cluster log. Depending on your environment and situation, the data there should give you an idea of whether what happened was expected.

Sean Gallardy