0

I'm a junior DBA for a medium-sized company that helps merchants exchange money during sales. Our services operate 24/7, 365 days a year, so performance and uptime is absolutely critical. We have a log table in one our databases that hasn't been cleaned out since 2012. This log table provides customers with a quick lookup of previous sales without having to query multiple tables in our database. Per our policy, we are only required to provide 13 months of data, so we need to delete anything older than 13 months from the log.

Data is constantly being written to the log table, and any lock on the table will cause blocking throughout the database, causing an outage for the company. Our strategy is to swap out the log table with an empty copy. We then perform maintenance on the old copy, deleting all the old records and leaving behind just records in the last year. So far this has been a success, as the clean copy allows SQL transactions to flow while the table with all the records is offline.

My boss and I had a discussion about how to proceed with cleaning out the old log table. The log table alone is 1.2 terabytes of data, and 649,966,008 rows. My opinion is that we should just delete everything all at once, as the old log table is offline and won't cause any blocking. At the very least, we should delete blocks of one million rows, as many times as possible in a single workday. It gets the job done quickly with minimal impact.

My boss, the senior DBA, disagrees and is way more conservative in his approach. He thinks that deleting that much data, even from a table that's offline, has a large amount of risk. He says an operation that large can consume boatloads of memory which will push critical processes out of RAM, leading to slower performance in the database. He also says that deleting that much data has to be replicated across nodes in the availability group, causing further performance impact. He wants to delete at MOST two million rows a day, which will take over a year to

I think he's being overly conservative. I understand there's a place for caution as a DBA for a company that operates in a critical industry, but this seems like a burden. I don't think there is much risk in deleting everything all at once. There's no work that SQL server needs to do in joining multiple tables, so the memory usage should be small.

Who's more correct?

0 Answers0