4

I have a 3ware 9650SE RAID controller with a RAID 5 array containing 15 Seagate ST31000340NS disks. After noticing ECC errors in the Port 10 drive I replaced it with a spare and began a RAID rebuild. Part way through the rebuild the Port 5 disk failed completely, which rendered the array inoperable because the new disk in Port 10 was incomplete. The array remained in use during the rebuild until the failure of the Port 5 disk. I hoped to recover the data by putting back the original Port 10 disk, but the RAID controller did not add it back to the array. Instead, it was listed as "available". My question is, how can I force the controller to recognise the original Port 10 disk in its original location? There is no "add disk" option in the 3dm2 interface.

* EXTRA INFO * Thanks for all the comments and suggestions relating to my original posting. I should have mentioned before that the array was mounted read-only during the rebuild. I don't know if that makes any difference to the chances of forcing the controller to accept the original disk back. There isn't a backup by the way. Whatever happens, I have certainly learned my lesson re. RAID5.

peterh
  • 5,017
Dan
  • 41
  • 1
  • 3

5 Answers

4

I believe you are out of luck. This is one of the dangers of RAID5. Since the array was in use, all the other disks are now out of sync with the original port 10 disk.

Update, regarding the read-only mount: whether or not this works really comes down to an implementation detail of the 3ware controller. Even if the file system was mounted read-only, the RAID controller could have updated some metadata on the disks and decided this configuration is not recoverable. That's what I would expect.

kbyrd
  • 3,760
2

Your best option is to rebuild from backup. Since the array was in use, the data on the original Port 10 disk would be out of sync with the rest of the array.

RAID 5 is no longer really recommended as drive sizes grow larger; the odds of an unrecoverable error on the drives are increasing, and such errors typically aren't found until a disk fails in the RAID 5 array (which is when the second disk and its latent bad spot are found).
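To put a rough number on that risk, here is a back-of-the-envelope sketch in Python. It assumes independent bit errors, a full read of all 14 surviving 1 TB disks during the rebuild, and unrecoverable read error (URE) rates of 1 per 10^14 and 1 per 10^15 bits, which are typical datasheet figures and not necessarily the spec of the drives in the question. The function name is made up for illustration.

```python
import math

def p_rebuild_hits_ure(surviving_disks, disk_bytes, ure_per_bit):
    """Chance of at least one unrecoverable read error during a full rebuild.

    Assumes every bit on every surviving disk is read once and that
    bit errors are independent (a simplification).
    """
    bits_read = surviving_disks * disk_bytes * 8
    # 1 - (1 - p)^n, computed in a numerically stable way
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

# 15-disk RAID 5 with one disk being rebuilt: 14 disks must be read in full.
for rate in (1e-14, 1e-15):  # assumed URE rates, not taken from a datasheet
    p = p_rebuild_hits_ure(14, 10**12, rate)
    print(f"URE rate {rate:g} per bit: P(error during rebuild) = {p:.1%}")
```

Under these assumptions the result is roughly 67% at 1 per 10^14 and roughly 11% at 1 per 10^15, so the outcome depends heavily on the drive class.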

1

You may be lucky if the error on the second drive is in a portion of the disk that is unused by the file system. So if you do not have any backups, you could try rebuilding with the "ignore ECC errors on rebuild" flag set. You would then want to run some form of consistency check over your file system, and in the worst case you may have to expect some data corruption. Still, it may be preferable to losing the entire volume.

kreil
  • 11
0

With today's disk sizes, the probability of another drive failure once one drive has already failed is 62% for consumer disks: http://talkback.zdnet.com/5208-12694-0.html?forumID=1&threadID=36299&messageID=1008171

Don't use RAID 5, ever. If you must provide high availability and cheap storage, go for RAID 6 and a hot spare.

tore-
  • 1,396
  • 2
  • 10
  • 18
0

If your array stayed online and received writes after you removed the failed disk on port 10, then that means that disk is inconsistent with the rest of the array, and even if you could force it online any volumes on the array would be corrupt.

Don't ask me how I know this...

Restoring from backups is probably your only feasible option.
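The inconsistency described above can be shown with a toy model of RAID 5 parity. This is only a sketch: it uses one stripe with three data blocks and a single XOR parity block, ignores parity rotation and controller metadata, and all names are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, as RAID 5 parity does."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus their parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([d0, d1, d2])

# Disk 1 is pulled from the array (like the original Port 10 disk).
stale_d1 = d1

# Case 1: no writes while degraded. If disk 2 then fails, the old disk 1
# still matches the parity and disk 2 can be reconstructed.
recovered_clean = xor_blocks([d0, stale_d1, parity])

# Case 2: a write hits the block that lived on disk 1 while degraded.
# The new data exists only implicitly, through the updated parity.
new_d1 = b"bbbb"
parity_after_write = xor_blocks([d0, new_d1, d2])

# Forcing the stale disk back and reconstructing disk 2 now gives garbage.
recovered_dirty = xor_blocks([d0, stale_d1, parity_after_write])

print(recovered_clean == d2)  # reconstruction matches the lost block
print(recovered_dirty == d2)  # reconstruction does not match
```

In this model the forced-online array is only trustworthy for stripes that were never written after the disk was pulled, which is why a read-only mount might help but gives no guarantee about what the controller itself changed.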

hmallett
  • 2,485