Hi,

we recently had an issue with a stacked DRBD device (8.3.16) that started to block IO after switching from Ahead to SyncSource. ko-count is set to 6.

Dec 8 03:35:08 node2 kernel: [668315.119697] block drbd20: helper command: /sbin/drbdadm before-resync-source minor-20 exit code 0 (0x0)
Dec 8 03:35:08 node2 kernel: [668315.119706] block drbd20: conn( Ahead -> SyncSource ) pdsk( Consistent -> Inconsistent )
Dec 8 03:35:08 node2 kernel: [668315.119716] block drbd20: ASSERT( !(remote && send_oos) ) in /var/lib/dkms/drbd/8.3.16/build/drbd/drbd_req.c:1001
Dec 8 03:35:08 node2 kernel: [668315.119729] block drbd20: Began resync as SyncSource (will sync 216 KB [54 bits set]).
Dec 8 03:35:08 node2 kernel: [668315.120419] block drbd20: updated sync UUID 024B346E4B84E12B:86C8E56E6CD2BBDC:9D97BCB66EBE838D:3E5876F017C7CDBD
Dec 8 03:35:49 node2 kernel: [668356.840611] block drbd20: cs:SyncSource rs_left=55 > rs_total=54 (rs_failed 0)
Dec 8 03:35:49 node2 kernel: [668356.865459] block drbd20: cs:SyncSource rs_left=55 > rs_total=54 (rs_failed 0)
Dec 8 03:35:49 node2 kernel: [668356.903126] block drbd20: cs:SyncSource rs_left=55 > rs_total=54 (rs_failed 0)
Dec 8 03:35:49 node2 kernel: [668356.930498] block drbd20: cs:SyncSource rs_left=55 > rs_total=54 (rs_failed 0)
Dec 8 03:36:00 node2 kernel: [668367.006241] block drbd20: cs:SyncSource rs_left=55 > rs_total=54 (rs_failed 0)
Dec 8 03:36:00 node2 kernel: [668367.030987] block drbd20: cs:SyncSource rs_left=55 > rs_total=54 (rs_failed 0)
Dec 8 03:36:10 node2 kernel: [668377.249395] block drbd20: cs:SyncSource rs_left=55 > rs_total=54 (rs_failed 0)
Dec 8 03:36:13 node2 kernel: [668380.608957] block drbd20: Remote failed to finish a request within ko-count * timeout
Dec 8 03:36:13 node2 kernel: [668380.632397] block drbd20: peer( Secondary -> Unknown ) conn( SyncSource -> Timeout )
Dec 8 03:36:13 node2 kernel: [668380.632440] block drbd20: error receiving CsumRSRequest, l: 44!
Dec 8 03:36:13 node2 kernel: [668380.645119] block drbd20: asender terminated
Dec 8 03:36:13 node2 kernel: [668380.645131] block drbd20: Terminating drbd20_asender
Dec 8 03:37:32 node2 kernel: [668459.482874] INFO: task jbd2/dm-4-8:9503 blocked for more than 120 seconds.
Dec 8 03:37:32 node2 kernel: [668459.494628] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 8 03:37:32 node2 kernel: [668459.518881] jbd2/dm-4-8 D ffffffff81806240 0 9503 2 0x00000000
Dec 8 03:37:32 node2 kernel: [668459.518888] ffff881017c57ac0 0000000000000046 ffff881017c57a60 ffffffff8103ec29
Dec 8 03:37:32 node2 kernel: [668459.542394] ffff881017c57fd8 ffff881017c57fd8 ffff881017c57fd8 00000000000137c0
Dec 8 03:37:32 node2 kernel: [668459.565602] ffff8810197b4500 ffff88100a612e00 ffff881017c57a90 ffff88207fcb4080
Dec 8 03:37:32 node2 kernel: [668459.588736] Call Trace:

Is this a known problem and fixed in DRBD 8.4?

Thank you,
Christoph