5

I've configured a log shipping from a server, to the same server, only with a different instance.


Primary server is configured this way:

LSBACKUP_MyDatabase - every 25min

Secondary server:

LSCOPY_MyDatabase - every 1 minute

LSRESTORE_MyDatabase - every 10 minutes


Whats happening is, the primary backup job is running fine ( there is a lot more history, im just showing the last 2 ).

enter image description here

in the folder, I can see the TRN files.

enter image description here

In the secondary instance, LSCOPY and LSRESTORE is ok too. It's copying files, but the problem is here. The restore job is reporting this "message" ( the job runs succesfully, so I don't think it's a error):

Message

2015-12-29 09:10:02.41 Skipped log backup file. Secondary DB: 'MyDatabase', File: '\ServerIP\instancia g\BACKUP\Log_Sp_Secundario\MyDatabase_20151229104500.trn'

2015-12-29 09:10:02.41 Could not find a log backup file that could be applied to secondary database 'MyDatabase'.

2015-12-29 09:10:02.42 The restore operation was successful. Secondary Database: 'MyDatabase', Number of log backup files restored: 0

2015-12-29 09:10:02.42 Deleting old log backup files. Primary Database: 'MyDatabase'

2015-12-29 09:10:02.42 The restore operation was successful. Secondary ID: '5a0a361c-039c-40a3-9c39-af5e338c7f72'

And then, when I click to see the history of the LSALERT_ Job, its reporting errors with this message:

enter image description here

Message Executed as user: CMDO\gdladmin. The log shipping secondary database VMWGDLPRD04\GDLIC2014.GDL_IC has restore threshold of 45 minutes and is out of sync. No restore was performed for 8323 minutes. Restored latency is 0 minutes. Check agent log and logshipping monitor information. [SQLSTATE 42000] (Error 14421). The step failed.

According to one of the Microsoft's support pages, there is this query to show if there are gaps between logs. there are none:

SELECT 
    s.database_name,s.backup_finish_date,y.physical_device_name
FROM 
    msdb..backupset AS s INNER JOIN
    msdb..backupfile AS f ON f.backup_set_id = s.backup_set_id INNER JOIN
    msdb..backupmediaset AS m ON s.media_set_id = m.media_set_id INNER JOIN
    msdb..backupmediafamily AS y ON m.media_set_id = y.media_set_id
WHERE 
    (s.database_name = 'MyDatabase')
ORDER BY 
    s.backup_finish_date DESC;

I've been searching all over the internet, but I could only find those dba blogs with lack of information, and some posts saying that the primary database was dropped ( obviously was not ).

The primary instance is 2012. The secondary is 2014. the secondary database is in recovering mode.

To fix this, should I recreate all log shipping?

Paul White
  • 94,921
  • 30
  • 437
  • 687
Racer SQL
  • 7,546
  • 16
  • 77
  • 140

4 Answers4

5

Well it didn't fixed. I think I will recreate the log shipping again

Before , you give a try with that, why don't you go for looking the most recent differential backup and restore it on the secondary.

We also had situations as mentioned above and found that:

This happened due to NW glitch due to which the folder shared(on primary as common backup location with secondary) was no more shared ( due to some issues on cluster resource) and as a result few log backups never went/copied to secondary and since there was a gap, even though restore job completed but LS kept saying out of sync.

Well, in our case we went ahead and restored the latest full back up to bring the missing LSN chains in sync and later the restore job picked the next log backup file and LS was back in sync.

KASQLDBA
  • 7,203
  • 6
  • 30
  • 53
1

I take the LSAlert job with a grain of salt. Troubleshooting that issue made me pull my hair out! In the end, the log backup, copy, and restore jobs were functioning as expected.

Here are a few things to consider when you're getting that LSAlert error messages on your monitor server:

  • The date or time on the monitor server may be different from the date and time on the primary server. Also, it is possible that the system date or time was modified on the monitor or the primary server.
  • When the monitor server is offline and got back online, the fields in the log_shipping_primaries table are not updated with the current values before the alert message job runs.
  • The log shipping Copy job that is run on the primary server might not connect to the monitor server msdb database to update the fields in the log_shipping_primaries table. This may be the result of an authentication problem between the monitor server and the primary server.
  • The log shipping Restore job that is running on the secondary server cannot connect to the monitor server msdb database to update the log_shipping_secondaries table with the correct value. This may be the result of an authentication problem between the secondary server and the monitor server.
  • You may have set an small or incorrect value for the Backup Alert threshold. Ideally, you must set this value to such a value based on your SLA thresholds and frequency of the backup job.
  • Backup job on the primary server may be failing, in which case, we need to check the history of backup job and for any error messages in Primary server SQL errorlog.

There are couple of ways that you can verify that restores are happening as intended. The following query will help find gaps in the log restore process:

 SELECT 
    s.database_name,s.backup_finish_date,y.physical_device_name
FROM 
    msdb..backupset AS s INNER JOIN
    msdb..backupfile AS f ON f.backup_set_id = s.backup_set_id INNER JOIN
    msdb..backupmediaset AS m ON s.media_set_id = m.media_set_id INNER JOIN
    msdb..backupmediafamily AS y ON m.media_set_id = y.media_set_id
WHERE 
    (s.database_name = ‘databaseNamePrimaryServer’)
ORDER BY 
    s.backup_finish_date DESC;

Full disclosure, I'm not getting anything out of this "plug" and I don't work for Red Gate, it is a free real time monitoring tool that'll monitor log shipping.

Red Gate Log Shipping Monitor

Microsoft TechNet Article

AKDiscer
  • 1,395
  • 6
  • 28
  • 40
0

Try checking out the following tables - see if you have any records that shouldn't be there:

log_shipping_primary_databases log_shipping_secondary log_shipping_monitor_primary log_shipping_monitor_secondary

If you do, delete those and the alert will clear.

Some Guy
  • 41
  • 2
0

Thanks for posting some hints for troubleshooting. I also faced same issue and while troubleshooting didn't get good article and relevant help.

This post had similar issue and almost everything matched but still not able to solve. I resolved it by looking at the below table values and then updating it to fix this problem.

Primary Server

log_shipping_primary_databases log_shipping_monitor_primary

Secondary Server

log_shipping_secondary log_shipping_monitor_secondary

While looking at the log_shipping_monitor_secondary table i observed that the values in column "last_restored_file" and "last_restored_latency" was null due to which jobs were succeeding but SQL was clueless to pick the accurate file for restore thus i updated the "last_restored_file" to the last restored log file path from secondary server, which started working for me.

In some cases we may also need to update the "last_restored_latency" (which was my case) to fix this issue. While updating this field put the exact number of minutes since the last backup completed.

Hope this helps :)

Best Regards, Bipin Singh, SQL Server DBA