[DRBD-user] drbdmanage hangs frequently

dehacked drbit at dehacked.net
Thu Apr 19 15:25:02 CEST 2018


On 04/17/2018 03:43 AM, Roland Kammerer wrote:
> On Mon, Apr 16, 2018 at 12:44:22PM -0400, dehacked wrote:
>> Greetings,
>>
>> I have a small cluster used for OpenStack (Newton on CentOS 7 nodes). I have
>> 2 main storage nodes, 1 openstack controller node and 5 'diskless'
>> hypervisors. It's configured with the hypervisors as satellite nodes and the
>> 3 remaining servers as management nodes with the management volume, though
>> only the 2 storage nodes actually hold the rest of the user data.
>>
>> I'm finding that drbdmanage hangs frequently trying to communicate with the
>> service. Even 'drbdmanage ping' will time out. Examining the service process
>> I see it apparently busy connecting to another host which is itself hung.
>>
>> Any ideas what's wrong or what troubleshooting steps I should be taking here?
> 
> Usually this is a sign that at least one of them is busy and tries to do
> the same thing (e.g., create a resource, delete a resource,...) over and
> over again. Usually that stops after a fail-count is reached. But if it
> even takes longer than the TCP timeout we set, a node might not even be
> able to report back that it failed doing something. And then this loops.
> There have been fixes in that regard and the latest version has a
> configurable TCP timeout.
> 
> Enable debugging, check if you detect such a "busy loop" in the syslogs.

Thanks for the suggestion. This did help track it down.

The issue turned out to be LVM-related: every node was scanning all block
devices for LVM labels and hanging on the DRBD devices, which were still being
created and in some cases were not properly configured. Pruning the LVM
filters fixed it.
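
For readers hitting the same symptom, a filter of roughly this shape in
/etc/lvm/lvm.conf keeps LVM scans away from DRBD devices. This is only a
sketch, not the exact filter used on this cluster: the /dev/sd pattern is an
assumption that the physical volumes live on SCSI/SATA disks, and it must be
adjusted to the real backing devices on each node.

```
# /etc/lvm/lvm.conf (fragment, illustrative only)
devices {
    # Patterns are tried in order and the first match decides.
    # "a" accepts a device, "r" rejects it.
    # Accept only the disks that really carry PVs (assumed here: /dev/sd*),
    # then reject everything else, which includes every /dev/drbdXX.
    global_filter = [ "a|^/dev/sd.*|", "r|.*|" ]
}
```

An allow-list like this is easier to reason about than a rule that only
rejects /dev/drbd*, because a device is accepted if any one of its path names
matches an accept pattern first. It does assume that no volume group is
stacked on top of a DRBD device.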

Could these be bugs worth fixing? A node with no storage still runs vgscan,
and opening /dev/drbdXX while it is diskless and has no connections still
waits for what I think is the auto-promote timeout.

That said, I have only updated drbdmanage so far, so if any of these are
already fixed, please disregard.

> 
>> Thanks
>>
>> drbdmanage version 0.99.14
>> kernel driver version 9.0.9
>> drbd-utils version 9.1.1
>> all built from source tarballs
> 
> Every single one of them is outdated. At least try the latest drbdmanage.
> 
> Regards, rck