13

I'd just like to know if there is any way to query when the AG group failed over.

eg - this is the Primary Replica now, but I'm pretty sure it was the secondary yesterday? How can i find when the failover took place

is there something specific in the Logs I should be looking for, or is there a tsql script to use?

Paul White
  • 94,921
  • 30
  • 437
  • 687
Trubs
  • 774
  • 1
  • 6
  • 14

7 Answers7

14

Following lists failover time and direction for each database for all failover events on the server on which this is run, using T-SQL as requested.

-- Script to determine failover times in Availability Group 

;WITH cte_HADR AS (SELECT object_name, CONVERT(XML, event_data) AS data
FROM sys.fn_xe_file_target_read_file('AlwaysOn*.xel', null, null, null)
WHERE object_name = 'error_reported'
)

SELECT data.value('(/event/@timestamp)[1]','datetime') AS [timestamp],
       data.value('(/event/data[@name=''error_number''])[1]','int') AS [error_number],
       data.value('(/event/data[@name=''message''])[1]','varchar(max)') AS [message]
FROM cte_HADR
WHERE data.value('(/event/data[@name=''error_number''])[1]','int') = 1480
youcantryreachingme
  • 1,655
  • 3
  • 21
  • 36
DonB
  • 141
  • 1
  • 2
13

You can use following PowerShell script to get last failover time

Get-winEvent -ComputerName ListnerName -filterHashTable @{logname ='Microsoft-Windows-FailoverClustering/Operational'; id=1641}| ft -AutoSize -Wrap 

1641 is the event ID within the Windows event logging system that indicates that a clustered role has moved from one node to another. The id=1641 part causes the Get-WinEvent command to filter for only those events.

Hannah Vernon
  • 70,928
  • 22
  • 177
  • 323
mashrur
  • 146
  • 1
  • 2
7

Although its difficult to find out when the last fail-over occurred, you can get a notification (alert) when the fail-over does occur... (you can then search your email?)

You can create an alert to let you (or your team) know when a fail-over occurred using Error Number 1480

1]

Note: This will send an email for every database in the AG (on each server), which in my case was 12 databases on 2 servers. On the options tab you can set Delay between responses: to something like 1 minute, so that only one email a minute is sent.

Trubs
  • 774
  • 1
  • 6
  • 14
4

You can set up a monitoring script (or a 3rd party SQL monitoring solution), to query the various columns of the sys.dm_hadr_availability_replica_states DMV on a regular schedule (e.g. every 30 seconds) and put a timestamp on it of when you've collected the values.

The particular columns of interest related to failover events are:

  • last_connect_error_number
  • last_connect_error_description
  • last_connect_error_timestamp

Additional columns that are worth tracking:

  • role
  • operational_state
  • connected_state

More detailed description for the above columns:

sys.dm_hadr_availability_replica_states (Transact-SQL)

An overview of all DMVs for AlwaysOn Availability Groups with links to more details is available on MSDN:

Monitor Availability Groups (Transact-SQL)

Paul White
  • 94,921
  • 30
  • 437
  • 687
3

If the failover is successful, then windows event logs would have an entry

The state of the local availability replica in availability group '' has changed from 'RESOLVING_NORMAL' to 'PRIMARY_PENDING'

The state of the local availability replica in availability group '' has changed from 'PRIMARY_PENDING' to 'PRIMARY_NORMAL'

This MS KB article has details about the error message and you can use PowerShell script to send you email based on specific events.

Apart from the windows event log, you can look at Cluster Diagnostic Extended Event Log with a format that resembles - ServerName_InstanceName_SQLDIAG_*.xel

Kin Shah
  • 62,545
  • 6
  • 124
  • 245
-1

I added a select (top(1), which is required to make the script in the other answer work:

declare @PrimaryServer varchar(50),
        @SecondaryServer varchar(50)

set @SecondaryServer = (select top(1) ar.replica_server_name from master.sys.availability_replicas ar join master.sys.dm_hadr_availability_replica_states ars on ar.replica_id = ars.replica_id where role_desc = 'SECONDARY')

set @PrimaryServer = (select top(1) ar.replica_server_name from master.sys.availability_replicas ar join master.sys.dm_hadr_availability_replica_states ars on ar.replica_id = ars.replica_id where role_desc = 'PRIMARY') select @@SERVERNAME as Servername, @PrimaryServer as PrimaryServer, @SecondaryServer as SecondayServer

--If this is the secondary server, do whatever If @@SERVERNAME = @SecondaryServer begin select @@SERVERNAME as Servername, @SecondaryServer as SecondayServer end Else begin select @@SERVERNAME as Servername, @PrimaryServer as PrimaryServer end

Paul White
  • 94,921
  • 30
  • 437
  • 687
-3

This the code I use to determine which server is the Primary node and which one is the secondary node. I use this in every job on my server.

NOTE: this is for a 2-node AOAG. Adjust it as needed to handle multiple-server AOAGs.

declare @PrimaryServer varchar(50),
        @SecondaryServer varchar(50)

set @SecondaryServer = (select ar.replica_server_name
                from master.sys.availability_replicas ar
                join master.sys.dm_hadr_availability_replica_states ars
                on ar.replica_id = ars.replica_id
                where role_desc = 'SECONDARY')

set @PrimaryServer = (select ar.replica_server_name
                from master.sys.availability_replicas ar
                join master.sys.dm_hadr_availability_replica_states ars
                on ar.replica_id = ars.replica_id
                where role_desc = 'PRIMARY')

--If this is the secondary server, do whatever
If @@SERVERNAME = @SecondaryServer 
    begin
        --Put your code here for Secondary Server actions
    end
Else
    begin
        --Put code here for Primary Server actions
    end

--End If