[DRBD-user] DRBD Behavior on Device Failure, and Recovery Procedure
Eric Robinson
eric.robinson at psmnv.com
Thu Apr 28 18:25:33 CEST 2022
Our servers have a large number of DRBD resources on a six-drive volume group. When Linstor provisioned the resources, it apparently placed each one on a single physical device. Here's a snippet of the roughly 200 resources on the servers; none of them shows more than one device in the "Devices" column.
[root@ha51b ~]# lvs -o+lv_layout,stripes,devices
LV            VG  Attr          LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert Layout #Str Devices
site002_00000 vg0 -wi-ao----  104.02g                                                   linear    1 /dev/nvme3n1(371535)
site003_00000 vg0 -wi-ao----  <63.02g                                                   linear    1 /dev/nvme0n1(498558)
site017_00000 vg0 -wi-ao---- <149.04g                                                   linear    1 /dev/nvme3n1(724396)
site019_00000 vg0 -wi-ao----  <19.01g                                                   linear    1 /dev/nvme4n1(0)
site021_00000 vg0 -wi-ao----  <23.01g                                                   linear    1 /dev/nvme2n1(698275)
site030_00000 vg0 -wi-ao----   39.01g                                                   linear    1 /dev/nvme2n1(704165)
site034_00000 vg0 -wi-ao----  <23.01g                                                   linear    1 /dev/nvme3n1(713896)
site035_00000 vg0 -wi-ao----   39.01g                                                   linear    1 /dev/nvme0n1(254527)
site036_00000 vg0 -wi-ao----  <88.02g                                                   linear    1 /dev/nvme2n1(714152)
site037_00000 vg0 -wi-ao----  <28.01g                                                   linear    1 /dev/nvme0n1(530822)
site039_00000 vg0 -wi-ao----  <59.02g                                                   linear    1 /dev/nvme1n1(180777)
site041_00000 vg0 -wi-ao----  <21.01g                                                   linear    1 /dev/nvme3n1(181290)
site043_00000 vg0 -wi-ao----   50.01g                                                   linear    1 /dev/nvme3n1(398165)
site045_00000 vg0 -wi-ao----   52.01g                                                   linear    1 /dev/nvme1n1(203567)
site047_00000 vg0 -wi-ao----   54.01g                                                   linear    1 /dev/nvme0n1(264514)
site049_00000 vg0 -wi-ao----  <81.02g                                                   linear    1 /dev/nvme3n1(410968)
site058_00000 vg0 -wi-ao----  <30.01g                                                   linear    1 /dev/nvme0n1(564622)
site062_00000 vg0 -wi-ao----   17.00g                                                   linear    1 /dev/nvme3n1(197679)
site065_00000 vg0 -wi-ao----  <23.01g                                                   linear    1 /dev/nvme1n1(387935)
site068_00000 vg0 -wi-ao----  <32.01g                                                   linear    1 /dev/nvme0n1(616090)
</snip>
With this layout (all LVs are linear), I assume that when a drive fails, only the resources backed by that physical drive would go diskless, and all the other resources would continue operating normally. Is that correct?
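For example, to see which resources live on a given drive (say nvme3n1 is the one that fails), I'd expect something like this to be enough, with nvme3n1 just being a hypothetical example here:

[root@ha51b ~]# lvs -o lv_name,devices vg0 | grep nvme3n1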
In such an event, what would be the recovery procedure? Swap the failed drive, use vgcfgrestore to restore the LVM metadata to the new PV, then do a DRBD resync?
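In other words, something roughly like the following? (Just a sketch; the PV UUID and archive file name are placeholders, and site002 stands in for each affected resource.)

[root@ha51b ~]# pvcreate --uuid "<old-pv-uuid>" --restorefile /etc/lvm/archive/<vg0-archive-file>.vg /dev/nvme3n1
[root@ha51b ~]# vgcfgrestore vg0
[root@ha51b ~]# lvchange -ay vg0/site002_00000
[root@ha51b ~]# drbdadm create-md site002
[root@ha51b ~]# drbdadm adjust site002

My understanding is that vgcfgrestore only brings back the metadata, so the restored LVs would be empty and DRBD would have to full-sync them from the peer.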
-Eric