I am seeking suggestions and comments on how to recover from a 100% full vSAN, short of the obvious reset-to-factory option. I have an 8-node ESXi cluster that runs entirely on vSAN-backed storage. Due to circumstances with a vendor that I would prefer not to go into, the total disk capacity was undersized for the storage requirements. As a result, the vSAN hit the 100% utilization wall hard and handled it about as well as an egg hitting a tile floor. Because the hosts themselves also boot from and live on the vSAN, they locked up when this happened, and several of them crashed, dramatically cutting the available capacity on an already full vSAN.

I have been able to regain access to some of the hosts, but the vSAN is thrashing the disks in a vain attempt to rebuild the array, so it is dreadfully slow to respond. vCenter is unavailable, so I can only manage individual hosts using SSH and the vSphere thick client. This removes most of my control over the vSAN object, so my options for recovery are severely limited.

A few points:

  • I am well aware that filling any SAN technology to 100% capacity is a recipe for disaster, so let's skip those obvious and unhelpful observations.
  • I understand and accept that data loss is pretty much inevitable here, but I would like to save as much as I can while deleting whatever is necessary to return the cluster to a functional state.
  • The manufacturer has already advised that the cluster has to be reset to factory, but I've seen many cases where the community can provide better answers.
  • As the cluster is non-functional, I am willing to take risks and try radical ideas that would normally be out of the question.
Mario Lenz
