The following article sums up the steps I used to replace a failed hard drive in a RAID 1 array on a Debian 7 (wheezy) dedicated server for one of our clients. The same steps should apply to most other GNU/Linux systems that use “mdadm” software RAID.

The two RAID 1 arrays are built like this:

/dev/sda1 + /dev/sdb1 => RAID1 => /dev/md0
/dev/sda2 + /dev/sdb2 => RAID1 => /dev/md1

In my case “/dev/sda” has failed, which shows up in “/proc/mdstat” as follows:

cat /proc/mdstat

and I get:

Personalities : [raid1]
md1 : active raid1 sda2[1](F) sdb2[2]
      976106304 blocks super 1.2 [2/1] [_U]

md0 : active raid1 sda1[1](F) sdb1[2]
      522944 blocks super 1.2 [2/1] [_U]
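
On a server with many arrays, spotting the “(F)” flags by eye gets tedious. The following is a minimal sketch that extracts failed members from mdstat-format text. The helper name “failed_members” is hypothetical, and the parsing assumes the line layout shown above.

```shell
#!/bin/sh
# List failed RAID members as "array member" pairs, reading mdstat-format
# text from stdin. failed_members is a hypothetical helper name.
failed_members() {
    awk '
        # Array lines look like: md1 : active raid1 sda2[1](F) sdb2[2]
        $2 == ":" {
            for (i = 3; i <= NF; i++) {
                if ($i ~ /\(F\)$/) {
                    dev = $i
                    sub(/\[.*$/, "", dev)   # strip the "[1](F)" suffix
                    print "/dev/" $1, "/dev/" dev
                }
            }
        }
    '
}

# Usage on a live system:
#   failed_members < /proc/mdstat
```

Each output line names the array first and the failed partition second, which is the order the “mdadm --manage” commands below expect.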

REMOVING THE FAILED DRIVE

First of all, mark “/dev/sda1” as failed using the “mdadm” command:

mdadm --manage /dev/md0 --fail /dev/sda1

Next, remove “/dev/sda1” from “/dev/md0”:

mdadm --manage /dev/md0 --remove /dev/sda1

Check that it has been removed:

cat /proc/mdstat

Mark “/dev/sda2” as failed too:

mdadm --manage /dev/md1 --fail /dev/sda2

and remove it from “/dev/md1”:

mdadm --manage /dev/md1 --remove /dev/sda2

Check that it has been removed using the following command:

cat /proc/mdstat

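Both arrays should now list only their “sdb” partition. For comparison, this is what the output looks like later on, once the new drive has been added and the rebuild is running: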

Personalities : [raid1]
md1 : active raid1 sda2[3] sdb2[2]
      976106304 blocks super 1.2 [2/1] [U_]
      [======>..............] recovery = 33.1% (323201664/976106304) finish=182.0min speed=59784K/sec

md0 : active raid1 sda1[3] sdb1[2]
      522944 blocks super 1.2 [2/2] [UU]
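
The fail-and-remove steps in this section follow the same pattern for every array, so they can be generated instead of typed by hand. This is only a sketch: the helper name “detach_cmds” is hypothetical, and it deliberately prints the commands rather than running them so they can be reviewed first.

```shell
#!/bin/sh
# Print the mdadm commands that fail and then remove one member of one array.
# Nothing is executed here; review the output, then pipe it to "sh" as root.
# detach_cmds is a hypothetical helper name.
detach_cmds() {
    array=$1
    member=$2
    echo "mdadm --manage $array --fail $member"
    echo "mdadm --manage $array --remove $member"
}

# The array/member pairs from this article:
#   detach_cmds /dev/md0 /dev/sda1
#   detach_cmds /dev/md1 /dev/sda2
```

Printing first and executing second is a deliberate choice here, since failing the wrong member of a degraded array would take the array down.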

The next step is to power off the machine and replace the failed “/dev/sda” drive with a new one:

poweroff || shutdown -h now

Note that the replacement drive MUST be at least as large as the failed one, otherwise the rebuild will fail.
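
The size requirement can be checked once the new drive is installed and the machine is back up. The sketch below compares two sizes in bytes. The helper name “big_enough” is hypothetical, and comparing against the surviving “/dev/sdb” assumes both drives of the mirror were originally the same size.

```shell
#!/bin/sh
# Succeeds when the replacement is at least as large as the surviving drive.
# Both arguments are sizes in bytes; big_enough is a hypothetical helper name.
big_enough() {
    new_bytes=$1
    old_bytes=$2
    [ "$new_bytes" -ge "$old_bytes" ]
}

# Usage as root, once the new drive is installed:
#   big_enough "$(blockdev --getsize64 /dev/sda)" \
#              "$(blockdev --getsize64 /dev/sdb)" \
#       && echo "new drive is large enough"
```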


ADDING THE NEW DRIVE

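Once the machine has booted with the new drive attached, double-check that “/dev/sda” really is the new, empty drive, because the next command overwrites its partition table. The “sfdisk” shipped with Debian 7 handles MBR partition tables only, not GPT. Then clone the partition table of “/dev/sdb” to “/dev/sda” using “sfdisk”: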

sfdisk -d /dev/sdb | sfdisk /dev/sda

Verify that the partition tables match on both drives using “fdisk”:

fdisk -l
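
Comparing two “fdisk -l” listings by eye is error-prone. As a rough alternative, the “sfdisk -d” dumps of both drives can be compared once the device names are masked out. The helper name “normalize_dump” is hypothetical, and the sketch assumes the two dumps differ only in the device name.

```shell
#!/bin/sh
# Replace the disk name in an "sfdisk -d" dump (read from stdin) with a fixed
# placeholder, so that dumps of two different disks can be compared.
# normalize_dump is a hypothetical helper name.
normalize_dump() {
    disk=$1
    sed "s|$disk|DISK|g"
}

# Usage as root:
#   sfdisk -d /dev/sda | normalize_dump /dev/sda > /tmp/sda.dump
#   sfdisk -d /dev/sdb | normalize_dump /dev/sdb > /tmp/sdb.dump
#   diff /tmp/sda.dump /tmp/sdb.dump && echo "partition tables match"
```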

Add “/dev/sda1” to “/dev/md0” and “/dev/sda2” to “/dev/md1”:

mdadm --manage /dev/md0 --add /dev/sda1
mdadm --manage /dev/md1 --add /dev/sda2

Running the above commands triggers a RAID synchronization, whose progress can be seen in “/proc/mdstat”:

cat /proc/mdstat

Or, if you’d like to watch it in real time, use the following:

watch -d -n1 cat /proc/mdstat
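
For scripts or monitoring, the progress can also be extracted as a plain number. The following is a minimal sketch that pulls the percentage out of mdstat-format text. The helper name “rebuild_progress” is hypothetical, and the parsing assumes the progress line format shown earlier in this article.

```shell
#!/bin/sh
# Print "array percentage" for each array that is currently rebuilding,
# reading mdstat-format text from stdin.
# rebuild_progress is a hypothetical helper name.
rebuild_progress() {
    awk '
        $2 == ":" { array = $1 }    # remember the most recent array name
        /recovery =|resync =/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) print array, $i
        }
    '
}

# Usage on a live system:
#   rebuild_progress < /proc/mdstat
#
# To simply block until the rebuild has finished, mdadm can wait by itself:
#   mdadm --wait /dev/md0 /dev/md1
```

Arrays that are already in sync print nothing, so empty output means no rebuild is in progress.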

EXTRA STUFF

If the replacement drive was previously used in a different RAID set-up, you may also need to wipe the old md superblocks from its partitions. Do this before adding the partitions to the arrays:

mdadm --zero-superblock /dev/sda1
mdadm --zero-superblock /dev/sda2
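
To avoid wiping blindly, each partition can be examined first. This is only a sketch: the helper name “wipe_if_member” is hypothetical, and it assumes that “mdadm --examine” exits with a non-zero status when it finds no md superblock on the partition.

```shell
#!/bin/sh
# Zero the md superblock of a partition only if one is actually present.
# Assumption: "mdadm --examine" returns non-zero when no superblock exists.
# wipe_if_member is a hypothetical helper name.
wipe_if_member() {
    part=$1
    if mdadm --examine "$part" >/dev/null 2>&1; then
        mdadm --zero-superblock "$part"
    else
        echo "$part: no md superblock found, nothing to wipe"
    fi
}

# Usage as root, before adding the partitions to the arrays:
#   wipe_if_member /dev/sda1
#   wipe_if_member /dev/sda2
```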

If you’re one of our Dedicated Server Hosting customers, we can replace a failed hard drive in the RAID 1 array on your dedicated server for you, free of charge. Just contact us and one of our experts will complete your request immediately.
