[DRBD-user] frequent wrong magic value with kernel >4.9

Andreas Pflug pgadmin at pse-consulting.de
Tue Jan 23 19:14:13 CET 2018


Am 15.01.18 um 16:37 schrieb Andreas Pflug:
> Am 09.01.18 um 16:24 schrieb Lars Ellenberg:
>> On Tue, Jan 09, 2018 at 03:36:34PM +0100, Lars Ellenberg wrote:
>>> On Mon, Dec 25, 2017 at 03:19:42PM +0100, Andreas Pflug wrote:
>>>> Running two Debian 9.3 machines, directly connected via on-board
>>>> X540 10GBit NICs, with 15 drbd devices.
>>>>
>>>> When running a 4.14.2 kernel (from sid) or a 4.13.13 kernel (from
>>>> stretch-backports), I see several "Wrong magic value 0x4c414245 in
>>>> protocol version 101" errors per day, issued by the secondary, with
>>>> subsequent termination of the connection, reconnect and resync. The
>>>> logged magic value varies; quite often it is 0x00.
>>>>
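As an aside: those bogus magic values decode to printable ASCII when the
four bytes are read in wire (big-endian) order, which at least suggests
that text or payload data is being parsed where a packet header was
expected. A quick Python sketch, using the values quoted in this thread:

    import struct

    # Wrong magic values quoted in this thread's kernel log lines.
    observed = [0x4c414245, 0x64656772]

    for magic in observed:
        # DRBD sends headers in network byte order, so pack big-endian.
        text = struct.pack(">I", magic).decode("ascii", errors="replace")
        print("0x%08x -> %r" % (magic, text))

This prints 'LABE' and 'degr', which look like fragments of text rather
than protocol constants, consistent with the receiver losing sync on the
TCP stream.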
>>>> Using the current 4.9.65 kernel (or older) from stretch didn't show
>>>> these aborts in the past, and after going back they're gone again. It
>>>> seems to be some problem introduced after 4.9 kernels, since both 4.9
>>>> and 4.13 include drbd 8.4.7. Maybe some interference with the nic driver?
>>>>
>>>> Kernel   drbd    ixgbe    errors
>>>> 4.9.65   8.4.7   4.4.0-k  no
>>>> 4.13.13  8.4.7   5.1.0-k  yes
>>>> 4.14.2   8.4.10  5.1.0-k  yes
>>> "strange".
>>>
>>> What does "lsblk -D" and "lsblk -t" say?
>>>
>>> Do you have a scratch volume you can play with?
>>> As a data point, can you try "blkdiscard /dev/drbdX" on it?
>>> dd if=/dev/zero of=/dev/drbdX bs=1G oflag=direct count=1?
>>>
>>> Something like that?
>>> Any "easy" reproducer?
>> Maybe while preparing the pull requests for upstream,
>> we missed/mangled/broke something.
>>
>> Can you also reproduce with "out-of-tree" drbd 8.4.10?
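One quick way to verify which drbd module is actually loaded is the first
line of /proc/drbd, which reports the running version and its protocol
range. A minimal sketch:

    # Print the loaded drbd module version line from /proc/drbd,
    # e.g. "version: 8.4.10 (api:1/proto:86-101)".
    with open("/proc/drbd") as f:
        print(f.readline().strip())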
>>
> So I currently have kernel 4.9.65 with drbd 8.4.7 on the primary server,
> with the second server (4.14.7 with drbd 8.4.11-rc1) holding all drbd
> devices as secondary.
>
> Logged in kern.log on the secondary:
> Jan 15 15:13:22 xen2 kernel: [451977.741177] drbd monitor.opt: Wrong
> magic value 0x64656772 in protocol version 101
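(Running that value through the decode sketch above yields ASCII "degr",
again printable text rather than a protocol constant.)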

Any news on this issue, or anything to test?
Still getting that message 20 times a day, even though the system is not
really busy.

Regards,
Andreas


