On 09.01.18 at 16:24, Lars Ellenberg wrote:
> On Tue, Jan 09, 2018 at 03:36:34PM +0100, Lars Ellenberg wrote:
>> On Mon, Dec 25, 2017 at 03:19:42PM +0100, Andreas Pflug wrote:
>>> Running two Debian 9.3 machines, directly connected via 10GBit
>>> on-board X540 10GBit, with 15 drbd devices.
>>>
>>> When running a 4.14.2 kernel (from sid) or a 4.13.13 kernel
>>> (from stretch-backports), I see several "Wrong magic value
>>> 0x4c414245 in protocol version 101" per day issued by the
>>> secondary, with subsequent termination of the connection,
>>> reconnect and resync. The magic value logged differs, quite often
>>> 0x00.
>>>
>>> Using the current 4.9.65 kernel (or older) from stretch didn't
>>> show these aborts in the past, and after going back they're gone
>>> again. It seems to be some problem introduced after 4.9 kernels,
>>> since both 4.9 and 4.13 include drbd 8.4.7. Maybe some
>>> interference with the nic driver?
>>>
>>> Kernel    drbd    ixgbe    errors
>>> 4.9.65    8.4.7   4.4.0-k  no
>>> 4.13.13   8.4.7   5.1.0-k  yes
>>> 4.14.2    8.4.10  5.1.0-k  yes
>>
>> "strange".
>>
>> What does "lsblk -D" and "lsblk -t" say?

NAME                   ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
sda                            0 262144 262144     512     512    1 cfq       128 128    0B
└─sda1                         0 262144 262144     512     512    1 cfq       128 128    0B
  ├─local-stresstest           0 262144 262144     512     512    1           128 128    0B
  │ └─drbd16                   0 262144 262144     512     512    1           128 128    0B

NAME                   DISC-ALN DISC-GRAN DISC-MAX DISC-ZERO
sda                           0        0B       0B         0
└─sda1                        0        0B       0B         0
  ├─local-stresstest          0        0B       0B         0
  │ └─drbd16                  0        0B       0B         0

>> Do you have a scratch volume you can play with? As a datapoint, you
>> try to "blkdiscard /dev/drbdX" it?

blkdiscard: /dev/drbd16: BLKDISCARD ioctl failed: Operation not supported

It's hosted on LVM on a hardware raid6 disk.

>> dd if=/dev/zero of=/dev/drbdX bs=1G oflag=direct count=1?

Running

dd if=/dev/zero of=/dev/drbd16 bs=1M count=3072 oflag=direct

several times gives ~300MB/s and no problem.
This was executed with the primary server on 4.9.65 and the secondary
on 4.14.7 (stretch-backports). It seems that zeroes don't trigger the
problem.

> Maybe while preparing the pull requests for upstream, we
> missed/mangled/broke something.
>
> Can you also reproduce with "out-of-tree" drbd 8.4.10?

Since my post to drbd-user didn't make it to the list for two weeks, I
missed the week after Christmas when everybody was on holiday, so the
system is back in full production and I'm uncomfortable with doing too
much testing.

Regards,
Andreas