1

Our company is considering a new backup product that is an all in one sort of thing (file system, AD, SQL, etc). The product relies on VSS snapshots to create the full backups for the SQL databases. This causes an I/O freeze on the databases that usually just lasts a second or two, but can take up to 60 seconds (per Microsoft, I believe).

My question is how bad are I/O freezes for databases? Also, is there any way to determine how long the I/O freeze will be? I would think it would depend on a number of factors including: the size of the database, activity on it at the time, CPU usage, disk I/O, etc.

I have only been a DBA for less than a year and have not seen this myself, but my coworkers informed me that we have customers who do bulk data loads and in the past even a 1 second I/O freeze on the database will cause their jobs to fail.

We have voiced our concerns to management, but it is likely they will purchase the product.

Thanks for your help!

Erik Reasonable Rates Darling
  • 45,549
  • 14
  • 145
  • 532
Joey Jones
  • 13
  • 1
  • 3

1 Answers1

3

Yeah, this can be pretty bad! I've seen it go on for more than a minute.

There's no way to forecast it without testing it on your server, then checking the error log. You'll see messages for I/O being frozen, and I/O being thawed, along with timestamps.

NUTS

There are some showstoppers you might wanna consider, though.

Microsoft recommends not doing more than 35 databases per snapshot.

We recommend that you create a snapshot backup of fewer than 35 databases at the same time.

And cloud vendors like AWS who use snapshots set a hard limit of 30 databases per instance.

You can create up to 30 databases on each of your DB instances running Microsoft SQL Server. The Microsoft system databases, such as master and model, don't count toward this limit.

Aside from data being frozen for a long time, you'll want to make sure that your snaps are transactionally consistent (dirty snaps can cause corruption), and that they play nice with your native backups.

For instance, most people will take daily snaps, but still take native log backups to meet RPO. You want to make sure that after you mount a snapshot, you can still restore log backups to get you to a point in time. Otherwise, you're losing all the data in between snapshots.

Of course, if you're in Simple recovery model, or your RPO is pretty wide open, it matters a bit less.

Erik Reasonable Rates Darling
  • 45,549
  • 14
  • 145
  • 532