[DRBD-user] Testing local-io-error handler -- blkid hangs and ties up drbd device

Chris Dickson chrisd1100 at gmail.com
Thu Apr 12 21:51:14 CEST 2012


I can confirm that the issue is present in neither drbd 8.3.13rc1 nor 8.4.1
stable. It must have been introduced by code added between the 8.4.1
release and the current master.

Chris

On Thu, Apr 12, 2012 at 11:18 AM, Chris Dickson <chrisd1100 at gmail.com> wrote:

> A little more info:
>
> If I set the node with the good disk to primary, then write 100MB to
> the drbd volume, the drbd node with the bad disk calls my handler
> successfully, detaches and does not hang. It seems to hang only when I
> promote the node with the bad disk to Primary.
>
>
> On Thu, Apr 12, 2012 at 9:40 AM, Chris Dickson <chrisd1100 at gmail.com> wrote:
>
>> Thanks Lars, dmesg indeed reported the exit code of 0:
>>
>> [  332.733554] block drbd575: role( Secondary -> Primary )
>> [  332.772827] block drbd575: disk( UpToDate -> Failed )
>> [  332.772840] block drbd575: Local IO failed in __req_mod. Detaching...
>> [  332.772925] block drbd575: helper command: /sbin/drbdadm local-io-error minor-575
>> [  332.790163] block drbd575: helper command: /sbin/drbdadm local-io-error minor-575 exit code 0 (0x0)
>> [  332.790189] block drbd575: disk( Failed -> Diskless )
>> [  332.803862] block drbd575: receiver updated UUIDs to effective data uuid: 2B81D15C3E0ADD80
>>
>> The peer node is also locked up; all operations report:
>>
>> r575: State change failed: (-10) State change was refused by peer node
>>
>> One question about 8.3.latest: one of the reasons I wanted to use 8.4 was
>> the support for more minor numbers. It's not that I necessarily need more
>> than 256 on one machine, but with my numbering scheme it is convenient
>> to be able to assign minor numbers greater than 255. Is there a quick
>> hack somewhere in the source that would let me raise this limit, or is
>> this a more complex change made for 8.4?
>>
>> Also the prefer-remote read balancing method is something that I was
>> interested in, but not super necessary.
>>
>> Thanks,
>>
>> Chris
>>
>> On Thu, Apr 12, 2012 at 9:24 AM, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
>>
>>> On Thu, Apr 12, 2012 at 09:14:38AM -0400, Chris Dickson wrote:
>>> > Thanks for the quick reply,
>>> >
>>> > My test handler currently isn't doing anything interesting. I just had
>>> > it echo 'hello world' to a file which is located on a different drive
>>> > than the LVM volume. The echo seems to have completed successfully, as
>>> > the file is written.
>>> >
>>> > The end goal for the handler is simply to insert a row into a remote
>>> > DB; other than that, the default behavior on io-error of detaching is
>>> > exactly what I would like to have happen.
>>> >
>>> > I just tried filtering out drbd in lvm.conf and that doesn't seem to
>>> > be the issue. After another try I ran a quick ps auxf, and this showed up:
>>> >
>>> > root       340  0.0  0.0  21392  1284 ?        Ss   12:59   0:00 udevd --daemon
>>> > root       415  0.0  0.0  21384   896 ?        S    12:59   0:00  \_ udevd --daemon
>>> > root      1775  0.0  0.0   8448   724 ?        D    13:04   0:00  |   \_ /sbin/blkid -o udev -p /dev/drbd575
>>> >
>>> > So it seems that udev is initiating the blkid call. Could it be doing
>>> > this before drbd has finished executing the handler?
>>>
>>> If the handler finished
>>> (drbd prints "... helper command .... exit code ..." to the kernel log),
>>> there is no reason for anything to hang.
>>>
>>> DRBD is supposed to retry failed local requests on the peer, and if that
>>> is not possible (no connection, or no good remote disk either), either
>>> freeze IO (if so configured) or report IO errors back up the stack.
>>>
>>> "Supposed to just work".
>>>
>>> Maybe rather downgrade to 8.3.latest; I know we fixed some issues
>>> in the retry logic on the way to 8.4.not-yet-but-"soon"-to-be-released.2
>>>
>>> --
>>> : Lars Ellenberg
>>> : LINBIT | Your Way to High Availability
>>> : DRBD/HA support and consulting http://www.linbit.com
>>> _______________________________________________
>>> drbd-user mailing list
>>> drbd-user at lists.linbit.com
>>> http://lists.linbit.com/mailman/listinfo/drbd-user
>>>
>>
>>
>
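
The check Lars describes above (whether drbd logged a "helper command ...
exit code ..." line) can be automated when testing a handler repeatedly.
What follows is only a minimal sketch in Python: the regular expression is
inferred from the two helper lines quoted in this thread, not from the DRBD
source, and the names helper_exits and SAMPLE are made up for illustration.
Other DRBD versions may format these messages differently.

```python
import re

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not taken from the DRBD source). It only matches the line that carries an
# exit code, i.e. a helper that has actually finished.
HELPER_EXIT = re.compile(
    r"block (?P<dev>drbd\d+): helper command: (?P<cmd>.+?) "
    r"exit code (?P<code>\d+) \(0x[0-9a-fA-F]+\)"
)


def helper_exits(log_text):
    """Return (device, command, exit_code) for each finished helper call."""
    return [
        (m.group("dev"), m.group("cmd"), int(m.group("code")))
        for m in HELPER_EXIT.finditer(log_text)
    ]


# Sample input copied from the kernel log quoted earlier in the thread.
SAMPLE = """\
[  332.772925] block drbd575: helper command: /sbin/drbdadm local-io-error minor-575
[  332.790163] block drbd575: helper command: /sbin/drbdadm local-io-error minor-575 exit code 0 (0x0)
[  332.790189] block drbd575: disk( Failed -> Diskless )
"""

if __name__ == "__main__":
    for dev, cmd, code in helper_exits(SAMPLE):
        print(dev, cmd, code)
```

In practice the text would come from dmesg or the kernel log file rather
than from SAMPLE. A helper that was started but never shows up in the
result would point at a handler that is still running; in the case
reported here the handler did finish, so the hang has to come from
somewhere else.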