martes, 10 de septiembre de 2019

mdadm: how to add a disk to do a replacement

When playing with two disks raid 1 devices it is typical that when one drive fails you just remove it and replace it with a new one and that's it.
But... what if the drive that is left finds an error when you are reconstructing the array? Then... you have a problem.
So when replacing a disk that is starting to fail... it would be better to use the two disks instead of just discarding one.
Luckily the Linux software raid system allows you to add a third disk and sync the array to this disk from the other two, this way, you have two disks to feed the new one and if you are lucky enough you get a good copy of each of the sectors of the raid 1 array to finish the job.
The commands to do this would be... first to add the new disk:
mdadm --add /dev/md1 /dev/sdc1 which just adds it as an spare, but then you tell Linux you want it to use the three disks as active disks like this: mdadm --grow /dev/md1 -f -n 3 When this finishes you should have all the three disks on the array being active, or maybe you get failed drives in the way, but hopefully you get the new drive with a full copy of the data, so... all you have to do is get back to the two disks setup, for this if you don't have the failed drive marked as such... you fail it: mdadm --fail /dev/md1 /dev/sda1 and then you remove it from the array like this: mdadm --remove /dev/md1 /dev/sda1 and put the array in the two disks mode like it was before: mdadm --grow /dev/md1 -f -n 2 And that's it :-)
All this commands were tested on a Debian 10 (Buster) setup, hope they help you.
Regards.