On Tue, Jan 09, 2018 at 03:36:34PM +0100, Lars Ellenberg wrote:
> On Mon, Dec 25, 2017 at 03:19:42PM +0100, Andreas Pflug wrote:
> > Running two Debian 9.3 machines, directly connected via 10GBit on-board
> > X540 10GBit, with 15 drbd devices.
> >
> > When running a 4.14.2 kernel (from sid) or a 4.13.13 kernel (from
> > stretch-backports), I see several "Wrong magic value 0x4c414245 in
> > protocol version 101" per day issued by the secondary, with subsequent
> > termination of the connection, reconnect and resync. The magic value
> > logged differs, quite often 0x00.
> >
> > Using the current 4.9.65 kernel (or older) from stretch didn't show
> > these aborts in the past, and after going back they're gone again. It
> > seems to be some problem introduced after 4.9 kernels, since both 4.9
> > and 4.13 include drbd 8.4.7. Maybe some interference with the nic driver?
> >
> > Kernel    drbd     ixgbe     errors
> > 4.9.65    8.4.7    4.4.0-k   no
> > 4.13.13   8.4.7    5.1.0-k   yes
> > 4.14.2    8.4.10   5.1.0-k   yes
>
> "strange".
>
> What does "lsblk -D" and "lsblk -t" say?
>
> Do you have a scratch volume you can play with?
> As a datapoint, you try to "blkdiscard /dev/drbdX" it?
> dd if=/dev/zero of=/dev/drbdX bs=1G oflag=direct count=1?
>
> Something like that?
> Any "easy" reproducer?

Maybe while preparing the pull requests for upstream,
we missed/mangled/broke something.

Can you also reproduce with "out-of-tree" drbd 8.4.10?

--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD® and LINBIT® are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed