
I am stuck in a corner. I have a storage project that is limited to 24 spindles and requires heavy random writes (the corresponding read side is purely sequential). It needs every bit of space on my drives, ~13TB total in an n-1 RAID-5, and it has to go fast: over 2GB/s sort of fast.

The obvious answer is to use a stripe/concat (RAID-0/1), or better yet a RAID-10 in place of the RAID-5, but that is disallowed for reasons beyond my control. So I am here asking for help in getting a suboptimal configuration to be as good as it can be.

The array is built on direct-attached SAS-2 10K rpm drives, backed by an Areca 18xx series controller with 4GB of cache. It uses 64k array stripes and a 4K stripe-aligned XFS file system with 24 allocation groups (to avoid some of the penalty for being RAID-5).
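For reference, the stripe geometry described above can be worked out roughly as follows. This is only a sketch: it assumes "64k array stripes" means a 64 KiB per-disk stripe unit and that one disk's worth of each stripe holds parity. The function and constant names are illustrative and do not come from the controller or XFS tooling.

```python
# Assumption: 64 KiB is the per-disk stripe unit (chunk), not the whole stripe.
STRIPE_UNIT_KIB = 64

def full_stripe_kib(spindles, stripe_unit_kib=STRIPE_UNIT_KIB):
    """Data carried by one full RAID-5 stripe: every disk but one holds data."""
    data_disks = spindles - 1
    return data_disks * stripe_unit_kib

# The three spindle counts tested in the question.
for n in (6, 12, 24):
    print(f"{n} spindles: full stripe = {full_stripe_kib(n)} KiB of data")
```

Under these assumptions the full stripe grows with the spindle count, so a random write of a given size covers a smaller fraction of each stripe on the wider array.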

The heart of my question is this: in the same setup with 6 spindles/AGs I see near disk-limited write performance, ~100MB/s per spindle; at 12 spindles that drops to ~80MB/s, and at 24 to ~60MB/s.
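To put those per-spindle figures next to the 2GB/s goal, here is a quick back-of-the-envelope sketch. It assumes the quoted rates apply to every spindle in the array and treats 2GB/s as 2000 MB/s; both are simplifications for illustration.

```python
# Per-spindle write rates quoted above (MB/s), keyed by spindle count.
OBSERVED_MB_S = {6: 100, 12: 80, 24: 60}
TARGET_MB_S = 2000  # assumption: "over 2GB/s" taken as 2000 MB/s

def aggregate_mb_s(spindles, per_spindle_mb_s):
    """Total array throughput if every spindle sustains the quoted rate."""
    return spindles * per_spindle_mb_s

for spindles, rate in OBSERVED_MB_S.items():
    total = aggregate_mb_s(spindles, rate)
    print(f"{spindles:2d} spindles x {rate} MB/s = {total} MB/s "
          f"({total / TARGET_MB_S:.0%} of target)")
```

The aggregate still rises with spindle count, but each added spindle contributes less than the one before it.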

I would expect that with distributed parity and matched AGs, performance should scale with the number of spindles, or be worse at small spindle counts, but this array is doing the opposite.

What am I missing?

Should RAID-5 performance scale with the number of spindles?

Many thanks for your answers and any ideas, input, or guidance.

--Bill

Edit: The other relevant thread I was able to find, "Improving RAID performance", discusses some of the same issues in its answers, though it still leaves me without an answer on the performance scaling.

Bill N.

3 Answers


ANY RAID 5 is sub-optimal, and a 24-disk R5 array is just beyond stupid. I don't mean to be rude, but most hardware array controllers won't let you create a 24-disk R5 array; think of how much data you may be losing without even knowing it. Also, if you're doing any amount of any type of writing, RAID 5 or 6 isn't the way forward; in fact, adding more spindles is likely to just slow things down.

Both from a performance and reliability perspective you NEED to convert this to RAID 10 as soon as possible. It's really the only way forward; anything else is just polishing a turd.

Chopper3

What are the different widely used RAID levels and when should I consider them?

RAID-5 at that scale could be problematic (rebuild times, increased chance of array failure). Your random write speed will not be reasonable at all (perhaps the bandwidth of a single disk), nor will it scale with spindles.

ewwhite

Your limitation here is your RAID controller. Random writes like a large cache, and parity-based RAIDs like a fast parity calculator. The more disks you add to a parity-based RAID array (like RAID 5 or 6) on a locally attached controller, the lower the performance you'll see per spindle. Large direct-attached storage (above 8 drives or so) tends to be RAID 10 to avoid this issue. The only reason to use parity-based RAID (5 or 6) is to benefit from a higher ratio of usable space. The downside is that if you lose a drive, the rebuild takes longer on a large array.
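To make the "parity calculator" part concrete, here is a minimal sketch of the textbook RAID 5 read-modify-write parity update (XOR of old parity, old data, and new data). It is a generic illustration with made-up helper names, not how any particular controller implements it; a real controller with write-back cache may instead gather writes into full stripes.

```python
def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

def full_parity(chunks):
    """Parity computed from scratch over all data chunks in a stripe."""
    parity = bytes(len(chunks[0]))
    for chunk in chunks:
        parity = xor_bytes(parity, chunk)
    return parity

def rmw_parity(old_parity, old_data, new_data):
    """Parity after overwriting one chunk: needs only the old data and old parity."""
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)

# Toy stripe with three data chunks; overwrite the middle one.
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = full_parity(stripe)
new_chunk = b"\xaa\x55"
updated = rmw_parity(parity, stripe[1], new_chunk)
stripe[1] = new_chunk
print(updated == full_parity(stripe))
```

In this scheme a small write costs two reads (old data, old parity) and two writes (new data, new parity), and all of that traffic plus the XOR work passes through the controller.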

No new hardware

If your hardware is fixed and you need to make this work as well as possible, then your best performance would come from RAID 10; however, you'd lose half the available space. Also, while it seems more resilient at first because you can lose up to half the drives without a failure, you can still lose data if you lose the wrong two drives (both members of the same mirror pair).

The other option for locally attached storage is ZFS. I don't claim to know its inner workings, but my understanding is that it will ignore the RAID card completely and work on the underlying disks. It might have a lower penalty on the parity for small writes if you configure it properly.

New hardware

If you have money to fix this problem, you would be well served to invest in a better RAID controller: something with more cache and a faster processor. If you decide to use parity-based RAID, you would be best served by making multiple RAID groups, preferably 3 groups with 8 spindles each.

Even better than this would be getting some external storage: something that has a storage controller which handles all the caching and parity calculation, and simply allows access to the storage via SAS or fibre channel.
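For a rough sense of the space trade-off between the layouts mentioned above (one wide RAID 5, three 8-spindle RAID 5 groups, RAID 10), here is a small sketch. The per-disk size is an assumption derived from the question's "~13TB total in an n-1 RAID-5" on 24 spindles, and the helper name is hypothetical.

```python
SPINDLES = 24
# Assumption: ~13 TB usable across 23 data disks, so roughly 0.565 TB per disk.
DISK_TB = 13 / (SPINDLES - 1)

def usable_disks(layout, spindles, groups=1):
    """Number of disks' worth of usable space for a given layout."""
    if layout == "raid5":
        return spindles - groups   # one disk's worth of parity per group
    if layout == "raid10":
        return spindles // 2       # every disk is mirrored
    raise ValueError(f"unknown layout: {layout}")

layouts = [
    ("1 x 24-disk RAID 5", usable_disks("raid5", SPINDLES)),
    ("3 x 8-disk RAID 5", usable_disks("raid5", SPINDLES, groups=3)),
    ("RAID 10", usable_disks("raid10", SPINDLES)),
]
for name, disks in layouts:
    print(f"{name}: {disks} data disks, about {disks * DISK_TB:.1f} TB usable")
```

Splitting into three groups costs two extra disks of parity compared with the single wide array, while RAID 10 gives up roughly half the raw space.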

Basil