1

We are a $3B company with a team of 6 infrastructure experts. I am a DBA, not part of the infra team.

Our setup is all VMware ESX 5.1, with EMC SAN for storage and ExaGrid for backups. Prod and non-prod servers are hosted in separate DCs in 2 different cities. The prod backup share is replicated to the non-prod share, with a lag of usually 4-8 hours. My non-prod database restores now take about 5 hours, which costs us late-night work and downtime. If I copy the backups to local drives first, restores complete in 1-2 hours. I requested an additional 500 GB local drive on each of the 4 non-prod servers and the infrastructure team rejected it, saying it costs about $5k for 2 TB. Fair enough.

In this case, I don't need any resiliency, fault tolerance, fault detection, mirroring, replication, backup, recoverability; none of that. Data is not important, all I need is reasonable speed for a few hours twice a week. The goal is to restore databases in 1 - 2 hours. I looked at RAM and CPU usage, and they are not the bottlenecks.

My question is: is there a way we can use these SSDs as cheap additional storage, as an alternative to the expensive SAN?

If yes, apart from the cost of the drives, what other costs are involved?

Is there any other way to bring the cost down to, say, under $2k or even $1k?

Don
  • 21

4 Answers

2

Yes, there is. You have run into the typical enterprise idiocy of pushing everything to the SAN, something that will come back and kill you performance-wise in the mid term. There is a reason, for example, that MS SQL Server has allowed local SSD for tempdb since 2012: speed vs. cost. Heck, there are many cases where even production data can happily live on local disks without SAN resilience because you have application-level replication in place (for example, SQL Server Always On Availability Groups).

Basically: your infra team tries to solve everything by standardizing on one technology that does everything, and expects you to pay. This is a perversion of their job, which should be to standardize on valid approaches for everything. And yes, having local temporary space is quite critical, especially for databases. And no, it does not need resilience.

Your particular SSD will work, but will likely burn out quite fast. Still, the concept is valid. I would likely get a couple of Samsung 843T ;)

TomTom
  • 52,109
1

If all you need is a quick restore/rollback, you need local storage on the hosts, not an additional LUN on the SAN. Typically this is referred to as DAS (direct attached storage), and it can come in the form of an externally attached storage box filled with drives, or an internal disk or ten.

The cheapest solution is an external USB drive, which can allow for a ~500 GB restore in ~5 hours in good conditions, with USB speed at ~25 MB/s being the bottleneck.
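The arithmetic behind that estimate can be sketched roughly as below. This is only a back-of-the-envelope calculation, reading the ~25 figure as megabytes per second (an assumption; it is the reading under which ~500 GB takes about 5 hours). The function name `transfer_hours` and the 200 MB/s local-disk figure are hypothetical, chosen for illustration, and decimal units (1 GB = 1000 MB) are assumed.

```python
def transfer_hours(size_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to move size_gb gigabytes at a sustained throughput in MB/s."""
    size_mb = size_gb * 1000  # decimal units assumed: 1 GB = 1000 MB
    seconds = size_mb / throughput_mb_s
    return seconds / 3600


if __name__ == "__main__":
    # 25 MB/s is the USB figure from the answer (read as megabytes/second);
    # 200 MB/s is a hypothetical local-disk rate, not a measured value.
    for label, mb_s in [("USB at ~25 MB/s", 25), ("hypothetical local disk at 200 MB/s", 200)]:
        print(f"{label}: {transfer_hours(500, mb_s):.1f} h for 500 GB")
```

Sustained throughput is the only input that matters here, so measuring the real copy rate of a candidate drive is more useful than relying on its rated speed.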

An internal SSD or even 15k SAS (potentially a RAID array, for more IOPS) will be much faster to restore of course. For external access you'll need a SAS HBA, and a DAS appliance.

Keep in mind that these do not cancel the requirement for a proper backup/restore/DR scheme. The cost of these solutions can vary greatly, maybe even to the point where EMC LUNs come cheaper.

dyasny
  • 19,257
1

If your shop is anything like mine, here's what you do:

  • Define your needs. Don't include suggestions of how they might be met.
  • When they return with a cost, if it's more than you're willing to be charged back for what you need to do, then agitate with management.
Basil
  • 8,931
0

For a quick solution, I would recommend you just ask for DAS (direct attached storage). Often the performance issue arises because the SAN is attached via a 1 Gb LAN, or the disks are too slow for too many DB applications. DAS will solve this issue, because you are the only one on this storage and you do not need any of this: fault tolerance, fault detection, mirroring, replication, backup, recoverability.
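To see why a shared gigabit link can be the limit, here is a minimal sketch that compares the theoretical ceiling of a link with the throughput the restore goal would need. It reads the LAN speed mentioned above as 1 Gbit/s and borrows 500 GB from the question's requested drive size as a stand-in for the backup size; both are assumptions, as are the function names. Protocol overhead and contention from other hosts are ignored, so real numbers would be lower.

```python
def link_mb_per_s(gbit_per_s: float) -> float:
    """Theoretical ceiling of a link in MB/s, ignoring protocol overhead."""
    return gbit_per_s * 1000 / 8  # 8 bits per byte, decimal units


def required_mb_per_s(size_gb: float, hours: float) -> float:
    """Sustained MB/s needed to move size_gb gigabytes within the given hours."""
    return size_gb * 1000 / (hours * 3600)


if __name__ == "__main__":
    ceiling = link_mb_per_s(1.0)  # assumed 1 Gbit/s link
    for goal_hours in (1, 2):
        need = required_mb_per_s(500, goal_hours)  # assumed 500 GB backup
        verdict = "fits" if need <= ceiling else "does not fit"
        print(f"{goal_hours} h goal needs {need:.0f} MB/s, link ceiling {ceiling:.0f} MB/s: {verdict}")
```

Under these assumptions the link tops out at 125 MB/s, while 500 GB in one hour needs about 139 MB/s, so the faster end of the 1-2 hour goal is out of reach over a single gigabit path even before other tenants take their share.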

heinz
  • 1