I stumbled on dRAID recently while reading the OpenZFS on Linux docs, and I came away from the chart there with a conclusion that I want to double-check.
This is a brand-new system that I'm building from scratch, and I'm trying to work out how to use my 5x12TB HDDs. My priorities:
- Data Integrity
- Storage capacity
- I/O Performance
- Uptime (home lab use)
I was happy with the capacity and write performance of RAIDz1, but with drives this size I can't risk a second drive failure while a failed one is rebuilding (my impression was that resilvering a 12TB drive could take a day or so, and the linked chart seems to agree). So I felt forced to choose RAIDz2, which cuts into my overall capacity, but I'm fine with that. What worries me is the write performance hit (I might just have to run tests... for another post).
Anyway, the chart in the dRAID docs linked above suggests that even for 5 drives, parity 1 (equivalent to RAIDz1) plus a distributed spare shortens rebuild time to about 4 hours and greatly reduces the load on the single new drive. A faster rebuild and less write load on one drive would seem to greatly diminish the risk of a second drive of this size failing while the first is rebuilt, since the burden is shared.
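For reference, here is a rough back-of-the-envelope sketch of the capacity side of this trade-off for 5x12TB. It assumes the commonly quoted approximations: RAIDz usable space is (disks - parity) x disk size, and dRAID usable space is (children - spares) x disk size x data / (data + parity). The function names are made up for illustration, the `draid1:3d:5c:1s` layout is just one possible single-parity-plus-spare configuration, and the numbers ignore padding, metadata, and TB vs TiB differences.

```python
# Rough capacity estimates for the layouts discussed above.
# Assumption: simple approximations only; real pools lose extra space
# to padding, metadata, and reserved slop.

DISKS = 5
DISK_TB = 12


def raidz_usable_tb(disks, parity, disk_tb):
    """Approximate RAIDz capacity: every stripe gives up `parity` disks' worth."""
    return (disks - parity) * disk_tb


def draid_usable_tb(children, parity, data, spares, disk_tb):
    """Approximate dRAID capacity.

    Distributed spares are carved out of all children first, then each
    redundancy group of (data + parity) keeps data / (data + parity).
    """
    if data + parity + spares > children:
        raise ValueError("data + parity + spares must not exceed children")
    return (children - spares) * disk_tb * data / (data + parity)


layouts = {
    "raidz1": raidz_usable_tb(DISKS, 1, DISK_TB),
    "raidz2": raidz_usable_tb(DISKS, 2, DISK_TB),
    # Hypothetical single-parity dRAID with one distributed spare.
    "draid1:3d:5c:1s": draid_usable_tb(DISKS, 1, 3, 1, DISK_TB),
}

for name, tb in layouts.items():
    print(f"{name:>16}: ~{tb:.0f} TB")
```

Under these approximations the distributed spare is not free: on 5 disks it costs roughly one drive's worth of space, so a single-parity dRAID with one spare lands at about the same raw figure as RAIDz2 rather than RAIDz1.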
Question 1: I know that dRAID is for pools with many more disks, but other than complexity of setup (which I'll figure out), I only see upsides to choosing the dRAID route over RAIDz2.
Would you agree? Am I missing something about dRAID?
Question 2: More specifically, is the risk of a second drive failure reduced enough with the dRAID strategy that I can get back the capacity and performance of having only 1 parity? Can I have my cake and eat it too?
Update April 2025: I finally dove in and did my own testing for this setup. The repository with tooling and test results can be found here. The tl;dr is that even though I decided to go with double parity anyway for dRAID (draid2:3d:5c:0s), dRAID still showed nothing but reasons to use it instead of RAIDz2.
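For anyone unfamiliar with the notation, a spec like draid2:3d:5c:0s reads as parity 2, 3 data disks per redundancy group, 5 children (total drives), and 0 distributed spares. Below is a minimal sketch that decodes such a string. The `parse_draid` helper is hypothetical (it is not part of any ZFS tooling), it assumes the fields appear in the documented parity/data/children/spares order, and it only fills in the parity default of 1, leaving other omitted fields as `None`.

```python
import re

# Assumption: fields appear in the documented order
# draid[parity][:<data>d][:<children>c][:<spares>s].
SPEC_RE = re.compile(
    r"draid(?P<parity>[1-3])?"
    r"(?::(?P<data>\d+)d)?"
    r"(?::(?P<children>\d+)c)?"
    r"(?::(?P<spares>\d+)s)?"
)


def parse_draid(spec):
    """Decode a dRAID spec string into a dict of ints (hypothetical helper)."""
    match = SPEC_RE.fullmatch(spec)
    if match is None:
        raise ValueError(f"not a recognised dRAID spec: {spec!r}")
    fields = {
        key: int(value) if value is not None else None
        for key, value in match.groupdict().items()
    }
    if fields["parity"] is None:
        fields["parity"] = 1  # plain "draid" means single parity
    return fields


cfg = parse_draid("draid2:3d:5c:0s")
group_width = cfg["data"] + cfg["parity"]
print(cfg)
print(f"redundancy group width: {group_width} of {cfg['children']} children")
```

With 3 data plus 2 parity, each redundancy group spans all 5 drives, and with 0 spares there is no distributed spare capacity set aside in this layout.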
