I recently set up RAID 1 on Linux with mdadm. When I added a new HDD to the RAID 1 array, data started to sync between my drives, which is expected. What I didn't expect was that it synced the entire drive, including unused space. The HDDs are 6 TB with only about 1 TB of data, so this took far longer than anticipated. Why did md have to sync the unused space?
Peter Mortensen
idunnololz
1 Answer
RAID works below the filesystem level: it doesn't know or care which parts of the disk are "used". It just sees a bunch of blocks and, for RAID 1, their mirrored counterparts.
So it has to sync the entire disk to make sure that the two copies match. If it didn't, it wouldn't know which differences are errors and which are just parts that the filesystem hasn't used yet.
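To make the point concrete, here is a toy sketch in Python of what a block-level mirror sees. It is not how md is implemented; the names `BLOCK_SIZE`, `full_resync` and `mirrors_match` are made up for illustration, and the "disks" are just byte arrays. The point is that the copy loop has no notion of used or unused space, so every block gets copied.

```python
BLOCK_SIZE = 4096  # illustrative block size only, not what md uses internally


def full_resync(source: bytearray, target: bytearray) -> int:
    """Copy every block from source to target and return the number of blocks copied.

    There is no filesystem information here, so there is no way to skip
    "unused" blocks: to this layer, every block is just data.
    """
    if len(source) != len(target):
        raise ValueError("mirror members must be the same size")
    copied = 0
    for offset in range(0, len(source), BLOCK_SIZE):
        target[offset:offset + BLOCK_SIZE] = source[offset:offset + BLOCK_SIZE]
        copied += 1
    return copied


def mirrors_match(a: bytearray, b: bytearray) -> bool:
    """Return True if the two mirror members are byte-for-byte identical."""
    return bytes(a) == bytes(b)
```

For example, if `source` is 8 blocks long and only the first block holds file data, `full_resync` still copies all 8 blocks, because stale bytes left in the other 7 blocks of `target` would otherwise show up as mismatches later.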
There is an --assume-clean flag you can pass to mdadm when creating an array to tell it to skip the initial sync, but you should only do that if you are certain that the disks contain nothing but zeros. And I think it only works for RAID1, not for RAID5/6.
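If you want to be certain about the "nothing but zeros" part, one way is to read the device end to end and check. Below is a minimal sketch in Python; `is_all_zeros` is a hypothetical helper, not part of mdadm, and the path you pass is whatever device or image file you mean to check (reading a raw device normally needs root).

```python
def is_all_zeros(path: str, chunk_size: int = 1024 * 1024) -> bool:
    """Return True if every byte readable from `path` is zero.

    Reads in chunks so that a multi-terabyte device does not have to fit
    in memory. An empty file counts as all zeros.
    """
    zero_chunk = bytes(chunk_size)
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return True  # reached the end without seeing a non-zero byte
            # A read may return fewer bytes than requested, so compare
            # against a zero buffer of the same length.
            if chunk != zero_chunk[:len(chunk)]:
                return False
```

Note that this reads the whole disk, so on a 6 TB drive it takes about as long as a sync would. It only pays off if you already know the disks were zeroed and want to confirm it before something like `mdadm --create /dev/md0 --level=1 --raid-devices=2 --assume-clean /dev/sdX /dev/sdY` (device names here are placeholders).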