[DRBD-user] Guide to growing DRBD on raw partitions under clustered LVM

Digimer lists at alteeve.ca
Tue Aug 5 15:13:41 CEST 2014


On 05/08/14 05:11 AM, Lars Ellenberg wrote:
> On Mon, Aug 04, 2014 at 10:06:01PM -0400, Digimer wrote:
>> Hi all,
>>
>>    Your friendly neighbourhood documentor decided to work out how to
>> grow storage in HA clusters using DRBD.
>>
>> https://alteeve.ca/w/Anvil!_Tutorial_2_-_Growing_Storage
>>
>>    It's based around the main Anvil! tutorial, but it should be easy
>> to adapt to pretty much any situation that is using DRBD on raw
>> partitions. In my case, I do this because it gets complicated mixing
>> LVM under DRBD and clustered LVM over DRBD.
>>
>>    If you spot spelling/grammar/logic mistakes, please let me know. :)
>
> Many people find your tutorials.
> Because, well, you know ;-)
>
> So my guess is, people will find this tutorial soon,
> when they search for DRBD and resize,
> right next to the DRBD User's Guide.
>
> We will update that one, if we have not already.
>
> But it won't hurt to add a warning note to your tutorial
> about resizing as well:
> I seriously broke certain scenarios of resize,
> leading to data loss/corruption.
>
> I paste the commit that fixed it below.
> The issue is with *offline* resize with *internal* meta data,
> while intending to keep the data in place.
>
> For this scenario,
> drbdmeta up to and including 8.4.2 is ok afaik,
> drbdmeta in 8.4.3 and 8.4.4 is broken,
> drbdmeta from drbd-utils 8.9.0 and above is ok again.
>
> (No comment on your tutorial itself,
> I did not really read through it yet.)
>
> commit 4413afe8ed9819d45e7575c635e062ce2202cbbe
> Author: Lars Ellenberg <lars.ellenberg at linbit.com>
> Date:   Fri Jun 6 18:11:55 2014 +0200
>
>      drbdmeta: fix data corruption during offline resize
>
>      Regression introduced in 8.4.3, still present in 8.4.4.
>
>      When offline resizing internal meta data, drbdmeta forgot to
>      properly re-initialize the new meta data offsets in time,
>      and would move the old meta data into the existing data area,
>      starting with offset 0 instead, thereby corrupting the first part
>      of the data, up to the size of the old bitmap area.
>      (embedded disk image partition table, file system super block,
>      top level directory, all gone!)
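
For anyone wanting to check a node against the version ranges quoted above, here is a minimal sketch. It only encodes what the quoted message states (8.4.3 and 8.4.4 affected); the function names and the way the version string is obtained are my own assumptions, and a version that is not flagged is merely "not listed as affected", not proven safe.

```python
import re

# Versions named in the quoted message as having the broken offline resize
# (internal meta data). Everything else is simply "not listed as affected".
AFFECTED_DRBDMETA = {(8, 4, 3), (8, 4, 4)}


def parse_version(text):
    """Extract the first x.y.z version number found in a string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no x.y.z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def offline_resize_is_risky(version_text):
    """True if the version is one the quoted message reports as broken."""
    return parse_version(version_text) in AFFECTED_DRBDMETA


if __name__ == "__main__":
    # Hypothetical version strings, for illustration only.
    for candidate in ("8.4.2", "8.4.3", "8.4.4", "8.9.0"):
        print(candidate, offline_resize_is_risky(candidate))
```

The check is deliberately an explicit set rather than a range comparison, so it does not claim anything about versions the message did not mention.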

Hiya,

I've got a pretty prominent warning to a) test first on a test system 
and b) ensure you have good backups. :)

As for the resize risk, I decided to approach this a little differently, and 
I would be very interested in your opinion of it:

In the scenario I used, I added an extra 300 GB to the underlying disk 
(/dev/sda), and I have two partitions backing two DRBD resources (sda5 -> 
drbd0 and sda6 -> drbd1). I wanted to split the new space evenly between 
the two resources, thus growing each DRBD resource by 150 GB.

Originally, I had planned to move sda6 down the disk by 150 GB, but 
parted's 'move' command is deprecated in EL6 and gone in EL7. Even then, 
the move command refused to move sda6 because it didn't have a 
recognized FS on it.

So instead, I decided to migrate services to one node and withdraw the 
other, stopping DRBD entirely on the withdrawn node. There I wiped the 
DRBD meta data (MD), deleted the partitions, recreated them with the new 
geometry and rebooted.

When back up, I zero'ed out the start of the new partitions and then ran 
'create-md' on the new partitions. This would, as I understand it, 
create the new MD at the new end position on each node, correct? I then 
attached -> invalidated (paranoid, I am) -> connected and waited for the 
two resources to do a full resync. Once sync'ed, I migrated services 
over and repeated the process on the second node.
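
To make the order of operations easier to follow, here is a dry-run sketch that only builds and prints the per-node command list; it does not execute anything. The resource names (r0, r1), the amount zeroed at the start of each partition, and the exact invocations are assumptions on my part for illustration. Check them against your own DRBD version and configuration before using any of it.

```python
def per_node_plan(resource, partition, zero_mib=1):
    """Return the ordered steps for one resource on the withdrawn node.

    Dry run only: the strings are meant to be reviewed, not executed blindly.
    `zero_mib` (how much of the partition start to zero) is an assumed value.
    """
    return [
        f"drbdadm down {resource}",
        f"drbdadm wipe-md {resource}",
        f"# delete and recreate {partition} with the new geometry, then reboot",
        f"dd if=/dev/zero of={partition} bs=1M count={zero_mib}",
        f"drbdadm create-md {resource}",
        f"drbdadm attach {resource}",
        f"drbdadm invalidate {resource}",
        f"drbdadm connect {resource}",
    ]


if __name__ == "__main__":
    # Hypothetical resource names mapped to the partitions from the message.
    layout = {"r0": "/dev/sda5", "r1": "/dev/sda6"}
    for resource, partition in layout.items():
        print(f"## {resource} on {partition}")
        for step in per_node_plan(resource, partition):
            print(step)
```

The same plan is then repeated on the second node once the resync has finished and services have been migrated over.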

Once the second node was done, a quick 'pvresize' was enough to 
recognize the new space.

This approach, if I am correct, avoids the bug you mentioned because I 
didn't use DRBD's resize tool. The main downside, as I see it, is the 
time needed for a full resync, and the understanding that there is no 
redundancy until the resyncs complete.
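
Since nothing is redundant until the resync is done, it is worth checking that before touching the second node. Below is a small sketch that parses text in the style of /proc/drbd on 8.4 and reports whether every resource is Connected and UpToDate/UpToDate. The sample text and the function names are made up for illustration, and the status format can differ between DRBD versions.

```python
import re

# Matches status lines such as:
#  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
STATUS_LINE = re.compile(r"^\s*(\d+):\s+cs:(\S+)\s+ro:(\S+)\s+ds:(\S+)")


def parse_proc_drbd(text):
    """Return {minor: {"cs": ..., "ro": ..., "ds": ...}} from /proc/drbd-style text."""
    states = {}
    for line in text.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            minor, cs, ro, ds = match.groups()
            states[int(minor)] = {"cs": cs, "ro": ro, "ds": ds}
    return states


def resync_complete(text):
    """True only if at least one resource is listed and all are fully in sync."""
    states = parse_proc_drbd(text)
    return bool(states) and all(
        state["cs"] == "Connected" and state["ds"] == "UpToDate/UpToDate"
        for state in states.values()
    )


if __name__ == "__main__":
    # Invented sample: minor 0 is still resyncing, minor 1 is finished.
    sample = (
        " 0: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----\n"
        " 1: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----\n"
    )
    print(parse_proc_drbd(sample))
    print(resync_complete(sample))
```

In practice the text would come from reading /proc/drbd on the node; the check returns False when no resources are listed at all, so an unloaded module is not mistaken for a finished resync.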

Comments?

Thanks!

-- 
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without 
access to education?


