
I have a server running Debian Jessie with four drives, sda to sdd, all partitioned identically. The system sits on md RAID 1 arrays spanning all drives. Every drive has GRUB installed, so I can swap the discs with each other; each one is bootable and the system comes up happily. All drives share exactly the same layout:

  sdx1 - Boot Partition, GRUB installed
  sdx2 - Raid 1 /boot
  sdx3 - Raid 1 /
  sdx4 - Raid 10 swap
  sdx5 - non-md btrfs Raid 6 /data
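
For reference, a layout like this can be inspected with standard tools; the device names below are the ones from above, and the output will of course differ per system:

  # md arrays and the state of their member partitions
  cat /proc/mdstat
  # partition layout, filesystems and mountpoints of all four drives
  lsblk -o NAME,SIZE,TYPE,FSTYPE,MOUNTPOINT /dev/sd[a-d]
  # the btrfs raid6 spanning the sdx5 partitions
  btrfs filesystem show /data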

The data partition is btrfs raid6. I'm currently trying to upgrade my capacity by swapping out a drive for a bigger one. Since the setup can tolerate two failures, my first instinct was to just replace one of the drives, boot back up, restore the degraded raid arrays onto the newly installed drive, and after the rebuild everything would be back to normal.
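
Roughly, the procedure I had in mind is a sketch like this (md0/md1/md2 and /dev/sdd are assumptions, substitute the actual array and device names):

  # copy the partition table from a surviving drive to the new one
  sfdisk -d /dev/sda | sfdisk /dev/sdd
  # re-add the new partitions to the md arrays and let them resync
  mdadm /dev/md0 --add /dev/sdd2
  mdadm /dev/md1 --add /dev/sdd3
  mdadm /dev/md2 --add /dev/sdd4
  # rebuild the btrfs raid6 member (devid of the missing disk from 'btrfs filesystem show')
  btrfs replace start <missing-devid> /dev/sdd5 /data
  # make the new drive bootable again
  grub-install /dev/sdd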

BUT the machine (which is sadly headless at the moment) does not boot once I swap in a drive that invalidates the raid array. I can swap the existing discs with each other all day long and it happily boots, but if I remove a disc, or swap in anything that is not part of the raid, it fails to boot.

Am I missing something? How can I tell md that it is okay to boot automatically with missing discs / a degraded array? After all, as far as md is concerned even one of the four discs can support the whole system by itself. The data partition is another beast, since it needs at least two drives, but md should not be concerned with that, as it is pure btrfs raid.

I know that for the current use case I could just remove the drive from the raid, upgrade it and put it back, but in the event of an actual failure I don't have the option of removing the drive first if the system refuses to start up.
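
For completeness, cleanly taking the drive out of the md arrays before pulling it would look roughly like this (again, array and device names are assumptions):

  # mark the partitions as failed and remove them from their arrays
  mdadm /dev/md0 --fail /dev/sdd2 --remove /dev/sdd2
  mdadm /dev/md1 --fail /dev/sdd3 --remove /dev/sdd3
  mdadm /dev/md2 --fail /dev/sdd4 --remove /dev/sdd4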

bardiir

2 Answers


As an update and the answer: in the meantime I figured out that the only thing really missing here was the nofail flag in fstab. The filesystem was degraded, and it would not be mounted in that degraded state without the nofail option being set.
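
In /etc/fstab that amounts to adding nofail to the mount options of the affected filesystem, something along these lines (UUID and mountpoint are placeholders):

  # /etc/fstab - do not abort the boot if this filesystem cannot be mounted
  UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /data  btrfs  defaults,nofail  0  0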

bardiir

As far as I know it is not yet possible to create a raid with mdadm that you can boot from without having separate boot partitions. I assume you set it up in a way similar to what is described here; the guide uses a raid10, but it applies to other raid levels as well:

How to create a bootable redundant Debian system with a 3 or 4 (or more) disk software raid10?

Is it possible that you did not configure the other disks as boot devices in the BIOS? Or that the boot partitions are not exactly the same, i.e. exact copies with the same UUID?
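
Whether the boot partitions really are identical copies is easy to check by comparing their filesystem UUIDs, for example:

  # all four boot partitions should report the same UUID
  blkid /dev/sd[a-d]1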

To enable a specific disk to boot, it needs a boot sector, and the BIOS needs to be configured to boot from it (along with the other boot disks that are part of the raid). Of course, for a boot to complete successfully the disk will also need a boot partition, and since these boot partitions are not part of the raid, each boot disk has its own. If you make sure each boot partition contains exactly the same filesystem (copied over with dd, for example) and each disk has a boot sector created using the images on that boot partition, the system should be able to boot from any of the disks, even if the raid is degraded. A degraded raid should not prevent a successful boot; otherwise a big benefit of having a raid would be moot.

Quoting from the link:

Each disk that is part of the raid should have a bootable partition of about 1 GB that is NOT part of the raid. Create these partitions as normal, they have to be exactly the same size. Mark them as bootable, the mountpoint on one of the disks should be /boot, you can leave the others as unmounted.

Once you have used dd to make exact copies of the boot partition:

Now make sure that your bios is configured to attempt to boot from all 3 disks, order doesn't matter. As long as the bios will try to boot from any disk then in case one of the disks fails the system will automagically boot from the other disk because the UUIDs are exactly the same.
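
As a rough sketch of that copy-and-install step (assuming sda1 holds the master copy of the boot partition; adjust device names, and note that the exact grub-install invocation depends on your setup):

  # clone the boot partition, including its UUID, to the other disks
  dd if=/dev/sda1 of=/dev/sdb1 bs=4M
  dd if=/dev/sda1 of=/dev/sdc1 bs=4M
  # write a boot sector on each disk that uses the copy on that disk
  grub-install /dev/sdb
  grub-install /dev/sdc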

aseq