
Replace HDD for Software RAID in CentOS


When a hardware failure is detected (with SmartMonTools or another method) in one of the HDDs of a RAID1 (mirror) array, the failed HDD must be replaced to keep the data safe. This involves manually removing the failed HDD from the array, then manually adding the new HDD.

 

Step 1: Assume we have two HDDs, /dev/sda and /dev/sdb, with RAID partitions /dev/sda1 and /dev/sdb1 respectively, and that the RAID device /dev/md0 was created by combining /dev/sda1 with /dev/sdb1.

We also assume that /dev/sdb has failed, as confirmed with SmartMonTools or with the following command:

cat /proc/mdstat

The above command shows [UU] if both drives in a RAID device are functioning properly, while [U_] or [_U] means that one of the drives has failed.
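On a system with several arrays, the status check can be scripted. The following is a minimal sketch, assuming the usual /proc/mdstat layout in which each array starts with a line such as "md0 : active raid1 ..."; the function name degraded_arrays is only illustrative.

```shell
# List md arrays whose status field shows a missing member, e.g. [U_] or [_U].
# Usage: degraded_arrays [mdstat-file]   (defaults to /proc/mdstat)
degraded_arrays() {
    awk '
        # Remember which array the following lines describe.
        /^md[0-9]+ :/ { name = $1 }
        # A status field made only of U and _ with at least one _ means degraded.
        /\[[U_]*_[U_]*\]/ && name != "" { print name; name = "" }
    ' "${1:-/proc/mdstat}"
}

# Example: degraded_arrays
```

If the function prints nothing, no array is reported as degraded.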

 

Step 2: Before replacing the HDD, we first need to mark the RAID partitions on the failed drive as failed, in this case /dev/sdb1 on /dev/sdb.

mdadm --manage /dev/md0 --fail /dev/sdb1

Do the same for any other RAID devices that use partitions on the failed HDD.

Once the partitions are marked as failed, they can be safely removed from the RAID device (removal only works on failed or inactive partitions) by typing:

mdadm --manage /dev/md0 --remove /dev/sdb1
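When the failed HDD carries partitions that belong to more than one RAID device, it helps to list them all before running the commands above. The following is a rough sketch that reads /proc/mdstat; it assumes sdX-style device names, and the function name members_on_disk is hypothetical.

```shell
# Print "array partition" pairs for every RAID member located on the given disk.
# Usage: members_on_disk sdb [mdstat-file]   (defaults to /proc/mdstat)
members_on_disk() {
    awk -v disk="$1" '
        /^md[0-9]+ :/ {
            for (i = 3; i <= NF; i++) {
                part = $i
                sub(/\[.*/, "", part)   # strip the "[1]" index and flags like "(F)"
                if (part ~ ("^" disk "[0-9]+$")) print $1, part
            }
        }
    ' "${2:-/proc/mdstat}"
}

# Example (as root): mark and remove every member found on sdb
#   members_on_disk sdb | while read -r md part; do
#       mdadm --manage "/dev/$md" --fail "/dev/$part"
#       mdadm --manage "/dev/$md" --remove "/dev/$part"
#   done
```

Review the printed list before running the loop, since it acts on every array that has a member on that disk.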

 

Step 3: After removing all of the failed RAID partitions, power down the system using:

init 0

Once the system is powered off, replace the failed HDD with a new working one, then boot the system up again.

 

Step 4: We now need to copy the partition table from the still-functioning HDD (sda) to the newly installed HDD (sdb) by typing the following command:

sfdisk -d /dev/sda | sfdisk /dev/sdb

After it is done, you can use "fdisk -l" to check that /dev/sda and /dev/sdb have the same partition layout.
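The comparison can also be done mechanically on the sfdisk dumps. The following is a minimal sketch, assuming MBR-style dumps in which every partition line contains "start="; the helper name partition_layout is only illustrative.

```shell
# Reduce an "sfdisk -d" dump (read from stdin) to its per-partition fields,
# dropping the device names and spacing so that two disks can be compared.
partition_layout() {
    grep 'start=' | sed -e 's/^[^:]*: *//' -e 's/[[:space:]]//g'
}

# Example (bash, as root):
#   diff <(sfdisk -d /dev/sda | partition_layout) \
#        <(sfdisk -d /dev/sdb | partition_layout) && echo "layouts match"
```

If diff prints nothing and "layouts match" appears, both disks have the same start, size and type for each partition.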

 

Step 5: We can now add the newly created partition to the RAID device by typing the command:

mdadm --manage /dev/md0 --add /dev/sdb1

Note: Remember to also add the corresponding partitions back to any other RAID devices that used the failed HDD.

 

Step 6: The array /dev/md0 will now begin synchronizing onto the new partition. We can use “cat /proc/mdstat” to view the progress.
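Rather than rerunning the command by hand, the progress check can be wrapped in a small loop. The following is a sketch only, assuming that a rebuild in progress shows up in /proc/mdstat as a line containing "recovery =" or "resync"; the function name rebuild_in_progress is hypothetical. As an alternative, "mdadm --wait /dev/md0" blocks until the activity on that array has finished.

```shell
# Succeeds (exit status 0) while any array is still rebuilding or waiting to rebuild.
# Usage: rebuild_in_progress [mdstat-file]   (defaults to /proc/mdstat)
rebuild_in_progress() {
    grep -Eq '(recovery|resync) *=' "${1:-/proc/mdstat}"
}

# Example: print the progress line once a minute until the rebuild is done
#   while rebuild_in_progress; do
#       grep -E 'recovery|resync' /proc/mdstat
#       sleep 60
#   done
```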

 

Step 7: Once the synchronization process is complete, check the health of the new HDD using “smartctl -H /dev/sdb” and the state of the array using the following command:

cat /proc/mdstat

If the status has changed from [U_] to [UU], the replacement was successful.