
We have an HP ProLiant ML10 Gen9 server running Ubuntu 20.04.4 LTS. We have set up a RAID 1 array across two 2 TB HDDs using the Intel RST RAID configuration (which is a fake/firmware RAID). One of the drives has failed, and my goal is to replace the faulty drive and rebuild the RAID 1 array.

Below is the output of cat /proc/mdstat showing the RAID status:

surya@himalaya:~$ cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md126 : active raid1 sda[1] sdb[0]
      1953511424 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sda[1](S) sdb[0](S)
      6320 blocks super external:imsm

unused devices: <none>

Below is the output of lsblk showing the disk layout:

surya@himalaya:~$ lsblk
NAME                        MAJ:MIN RM  SIZE RO TYPE  MOUNTPOINT
loop0                         7:0    0 61.9M  1 loop  /snap/core20/1361
loop1                         7:1    0 67.9M  1 loop  /snap/lxd/22526
loop2                         7:2    0 55.5M  1 loop  /snap/core18/2284
loop3                         7:3    0 43.6M  1 loop  /snap/snapd/14978
loop4                         7:4    0 55.4M  1 loop  /snap/core18/2128
loop5                         7:5    0 43.6M  1 loop  /snap/snapd/15177
loop6                         7:6    0 67.2M  1 loop  /snap/lxd/21835
loop7                         7:7    0 61.9M  1 loop  /snap/core20/1376
sda                           8:0    0  1.8T  0 disk
└─md126                       9:126  0  1.8T  0 raid1
  ├─md126p1                 259:0    0  1.1G  0 part  /boot/efi
  ├─md126p2                 259:1    0  1.5G  0 part  /boot
  └─md126p3                 259:2    0  1.8T  0 part
    ├─ubuntu--vg-ubuntu--lv 253:0    0  100G  0 lvm   /
    └─ubuntu--vg-lv--0      253:1    0  1.7T  0 lvm   /home
sdb                           8:16   0  1.8T  0 disk
└─md126                       9:126  0  1.8T  0 raid1
  ├─md126p1                 259:0    0  1.1G  0 part  /boot/efi
  ├─md126p2                 259:1    0  1.5G  0 part  /boot
  └─md126p3                 259:2    0  1.8T  0 part
    ├─ubuntu--vg-ubuntu--lv 253:0    0  100G  0 lvm   /
    └─ubuntu--vg-lv--0      253:1    0  1.7T  0 lvm   /home
sr0                          11:0    1 1024M  0 rom

I used the command below to mark the faulty drive sdb (shown above) as failed:

mdadm --manage /dev/md126 --fail /dev/sdb

Then I shut down the system and replaced the hard drive in the same port.

Now when I try to rebuild the array with mdadm --manage /dev/md126 --add /dev/sdb, I get the message below.

root@himalaya:~# mdadm --manage /dev/md126 --add /dev/sdb
mdadm: Cannot add disks to a 'member' array, perform this operation on the parent container

The output of cat /proc/mdstat is now:

root@himalaya:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md126 : active raid1 sda[0]
      1953511424 blocks super external:/md127/0 [2/1] [U_]

md127 : inactive sda[0](S)
      3160 blocks super external:imsm

unused devices: <none>

I also tried to enter the Intel Option ROM utility in the BIOS using Ctrl + I. I set the OROM UI Normal Delay to 4 seconds under the SATA configuration in the BIOS settings, but I still couldn't get to that screen to rebuild the array from the BIOS. It would be a great help if someone could assist me with how to rebuild and restore the RAID 1 array.

1 Answer


So I am answering my own question for the benefit of everyone who has to deal with this type of fake RAID controller.

Here is what I found:

Interestingly, md126 is not the device to operate on here; md127 is the parent IMSM container. So all I did was add the new drive to md127 with:

mdadm --manage /dev/md127 --force --add /dev/sdb

and the RAID started rebuilding itself.
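
For anyone unsure which md device is the parent IMSM container on their own system, standard mdadm and sysfs queries will show the relationship (nothing here is specific to this server):

mdadm --detail /dev/md126                    # the "Container" line points at the parent device
cat /sys/block/md126/md/metadata_version     # shows external:/md127/0 for an IMSM member
mdadm --examine /dev/sda                     # dumps the IMSM metadata on a member disk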

Now the output of cat /proc/mdstat is:

root@himalaya:~# cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md126 : active raid1 sda[1] sdb[0]
      1953511424 blocks super external:/md127/0 [2/2] [UU]

md127 : inactive sdb[1](S) sda[0](S)
      6320 blocks super external:imsm

unused devices: <none>

These changes were reflected on the BIOS screen as well; the Intel RST RAID volume status was Normal.

Below is the list of commands I used to restore this RAID 1 array successfully.

To check the RAID status:

cat /proc/mdstat
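
If you want more detail than /proc/mdstat gives, mdadm itself can report the state of the volume and of the container (plain mdadm queries, not specific to this setup):

mdadm --detail /dev/md126    # state of the RAID 1 volume
mdadm --detail /dev/md127    # state of the IMSM container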

Removing the failed disk: First we mark the disk as failed and then remove it from the array:

mdadm --manage /dev/md126 --fail /dev/sdb
mdadm --manage /dev/md126 --remove /dev/sdb
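
Before powering down, it helps to note the serial number of the disk you are about to pull so you physically remove the right one. A quick check, assuming the failing drive still responds (smartctl comes from the smartmontools package):

ls -l /dev/disk/by-id/ | grep sdb    # the serial number is part of the by-id name
smartctl -i /dev/sdb                 # prints the model and serial number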

Then power down the system and replace the faulty drive with the new one:

shutdown -h now

Adding the new hard drive: first, create the exact same partitioning on the new drive as on /dev/sda:

sfdisk -d /dev/sda | sfdisk /dev/sdb
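
Recent versions of sfdisk handle both MBR and GPT, so the dump-and-restore above should work either way. If the disks use GPT and you prefer sgdisk, a rough equivalent (assuming the gdisk package is installed; the target disk goes in the -R= option and the source disk is the last argument) is:

sgdisk -R=/dev/sdb /dev/sda    # replicate sda's partition table onto sdb
sgdisk -G /dev/sdb             # give the copy new random GUIDs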

To check that both hard drives have the same partitioning:

fdisk -l
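
For a quicker side-by-side view limited to just the two disks:

lsblk -o NAME,SIZE,TYPE /dev/sda /dev/sdb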

Next, we add this drive to the RAID array (use md126 or md127, whichever mdadm reports as the parent container on your system). Below is the command I used:

mdadm --manage /dev/md127 --force --add /dev/sdb

That's it. You can now see that the RAID has started to rebuild.
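
To watch the rebuild until it finishes, the same status commands can be reused (watch simply refreshes the output every two seconds):

watch cat /proc/mdstat       # live view of the resync progress
mdadm --detail /dev/md126    # shows a "Rebuild Status" percentage while syncing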

