Replace HDD for Software RAID in CentOS
When a hardware failure is detected (using SmartMonTools or another method) in one of the HDDs of a RAID1 (mirror) array, the failed HDD must be replaced to keep the data intact. This involves manually removing the failed HDD from the array and then manually adding the new HDD.
Step 1: Assume we have 2 HDDs, /dev/sda and /dev/sdb, with RAID partitions /dev/sda1 and /dev/sdb1 respectively, and that a RAID device /dev/md0 has been created by combining /dev/sda1 with /dev/sdb1.
Assume also that /dev/sdb has been found to have failed, either with SmartMonTools or with the following command:
cat /proc/mdstat
The above command shows [UU] if both drives in a RAID device are functioning properly, while [U_] (or [_U]) means that one of the drives has failed or is missing from the array.
Step 2: Before replacing the HDD, we first need to mark the partitions of the failed drive as failed, in this case /dev/sdb1 on /dev/sdb.
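On a system with several RAID devices, reading the status fields by eye is easy to get wrong. The following is a minimal sketch, not part of the original procedure, that lists every array whose status field contains an underscore. The function name degraded_arrays is hypothetical, and the sketch assumes the usual /proc/mdstat layout where the array name starts a line and the status field appears on a following line.

```shell
# Hypothetical helper: print the name of every md array whose status
# field (for example [U_] or [_U]) contains an underscore.
# $1 is an optional file to read; it defaults to /proc/mdstat.
degraded_arrays() {
    awk '
        # Remember the most recent array name, e.g. "md0".
        /^md[^ ]* :/ { name = $1 }
        # A bracket holding only U and _ characters, with at least one _.
        /\[[U_]*_[U_]*\]/ { print name }
    ' "${1:-/proc/mdstat}"
}

# Usage: run "degraded_arrays" with no arguments to read /proc/mdstat.
```

If the function prints nothing, no array is currently reported as degraded.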
mdadm --manage /dev/md0 --fail /dev/sdb1
Do the same for any other RAID devices that use partitions on the failed HDD.
Once the partitions are marked as failed, they can be safely removed from the RAID device (removal only works on failed or inactive partitions) by typing:
mdadm --manage /dev/md0 --remove /dev/sdb1
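When the failed HDD carries partitions for more than one RAID device, the fail and remove commands have to be repeated for each pair. The sketch below only prints the commands so they can be reviewed before being run. The function name print_removal_cmds and the array:partition argument format are assumptions made for illustration, as is the md1/sdb2 pair in the usage comment.

```shell
# Hypothetical helper: print (do not run) the mdadm commands that mark
# each partition as failed and then remove it from its array.
# Each argument has the form ARRAY:PARTITION, for example md0:sdb1.
print_removal_cmds() {
    for pair in "$@"; do
        array=${pair%%:*}   # text before the colon, e.g. md0
        part=${pair#*:}     # text after the colon, e.g. sdb1
        echo "mdadm --manage /dev/$array --fail /dev/$part"
        echo "mdadm --manage /dev/$array --remove /dev/$part"
    done
}

# Usage (md1:sdb2 is an example pair, not taken from this article):
#   print_removal_cmds md0:sdb1 md1:sdb2
```

Printing first and running afterwards keeps a typo in a device name from failing a healthy partition.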
Step 3: After removing all of the failed RAID partitions, power down the system using:
init 0
Once the system is powered off, replace the failed HDD with a new working one, then boot the system up again.
Step 4: We now need to copy the partition table from the still-functioning HDD (sda) onto the newly installed HDD (sdb) by typing the following command:
sfdisk -d /dev/sda | sfdisk /dev/sdb
After it is done, you can use "fdisk -l" to check that /dev/sda and /dev/sdb have the same partition layout.
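As an alternative to comparing the fdisk output by eye, the two sfdisk dumps can be compared directly once the device names are masked out. This is a rough sketch under the assumption that the dumps differ only in the device name; the function name normalize_dump and the DISK placeholder are hypothetical.

```shell
# Hypothetical helper: read an "sfdisk -d" dump on stdin and replace
# the device name with the placeholder DISK so that dumps from two
# different disks can be compared as plain text.
# $1 is the bare disk name, for example sda.
normalize_dump() {
    sed "s|/dev/$1|DISK|g"
}

# Usage:
#   a=$(sfdisk -d /dev/sda | normalize_dump sda)
#   b=$(sfdisk -d /dev/sdb | normalize_dump sdb)
#   [ "$a" = "$b" ] && echo "partition layouts match"
```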
Step 5: We can now re-add the newly created partition back to the RAID device by typing the command:
mdadm --manage /dev/md0 --add /dev/sdb1
Note: Remember to also add back the respective RAID partitions of other RAID devices if you have any.
Step 6: The array /dev/md0 will now begin synchronizing data onto the new partition. We can use “cat /proc/mdstat” to view the progress.
Step 7: Once the synchronization process is complete, check the HDD health using either “smartctl -H /dev/sdb” (capital -H; lowercase -h only prints the help text) or the array status using the following command:
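While the array is rebuilding, /proc/mdstat includes a progress line with a percentage. The sketch below pulls out just that figure. It assumes the progress line contains text of the form "recovery = 12.6%" or "resync = 12.6%" and that grep supports the -o option (GNU grep does); the function name rebuild_progress is hypothetical.

```shell
# Hypothetical helper: print the rebuild progress reported in an
# mdstat-style file, e.g. "recovery = 12.6%". Prints nothing when no
# rebuild is in progress.
# $1 is an optional file to read; it defaults to /proc/mdstat.
rebuild_progress() {
    grep -Eo '(recovery|resync) *= *[0-9.]+%' "${1:-/proc/mdstat}"
}

# Usage: run "rebuild_progress" with no arguments, or repeat it
# periodically with: watch -n 5 cat /proc/mdstat
```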
cat /proc/mdstat
If the status has changed from [U_] to [UU], the replacement was successful.