> > 16: cs:NetworkFailure ro:Primary/Unknown ds:UpToDate/DUnknown C r---d
> >     ns:24777340 nr:0 dw:87720268 dr:12029753 al:1082 bm:1582 lo:0 pe:23 ua:0 ap:23 ep:1 wo:b oos:0
> >
> > So "it" is probably "hanging" on this one.
> >
> > kernel logs of drbd16?

When I do "echo t > /proc/sysrq-trigger", I don't get a drbd16 trace; I can
only find it in the "runnable tasks":

Dec 8 07:24:17 z2-3 kernel: [8681874.086989] runnable tasks:
Dec 8 07:24:17 z2-3 kernel: [8681874.086989] task PID tree-key switches prio exec-runtime sum-exec sum-sleep
Dec 8 07:24:17 z2-3 kernel: [8681874.086990] ----------------------------------------------------------------------------------------------------------
Dec 8 07:24:17 z2-3 kernel: [8681874.086993] kvm 32275 388725280.270968 90612590587 120 0 0 0.000000 0.000000 0.000000 /
Dec 8 07:24:17 z2-3 kernel: [8681874.086998] drbd16_receiver 30799 388725240.270539 144240968909 120 0 0 0.000000 0.000000 0.000000 /
Dec 8 07:24:17 z2-3 kernel: [8681874.087003] lvchange 4969 388725240.272576 18560770061 120 0 0 0.000000 0.000000 0.000000 /
Dec 8 07:24:17 z2-3 kernel: [8681874.087008] R bash 20328 388725240.274621 2113 120 0 0 0.000000 0.000000 0.000000 /

> well, what is living on drbd16?

z2-3:~# drbdsetup /dev/drbd16 show
disk {
    size 20971520s; # bytes
    on-io-error detach;
    fencing dont-care _is_default;
    max-bio-bvecs 0 _is_default;
}
net {
    timeout 60 _is_default; # 1/10 seconds
    max-epoch-size 2048 _is_default;
    max-buffers 2048 _is_default;
    unplug-watermark 128 _is_default;
    connect-int 10 _is_default; # seconds
    ping-int 10 _is_default; # seconds
    sndbuf-size 131070 _is_default; # bytes
    rcvbuf-size 131070 _is_default; # bytes
    ko-count 0 _is_default;
    cram-hmac-alg "md5";
    shared-secret "eae879cc293277b6ac97089d2edf288d2e97f49e";
    after-sb-0pri discard-zero-changes;
    after-sb-1pri consensus;
    after-sb-2pri disconnect _is_default;
    rr-conflict disconnect _is_default;
    ping-timeout 5 _is_default; # 1/10 seconds
}
syncer {
    rate 61440k; # bytes/second
    after -1 _is_default;
    al-extents 257;
}
protocol C;
_this_host {
    device minor 16;
    disk "/dev/all/426965fa-291d-4f2b-8aa7-6d990d272376.disk0_data";
    meta-disk "/dev/all/426965fa-291d-4f2b-8aa7-6d990d272376.disk0_meta" [ 0 ];
    address ipv4 10.10.0.3:11221;
}
_remote_host {
    address ipv4 10.10.0.1:11221;

> > try to get those kernel logs of drbd16.
>

The last drbd logs I have about drbd16, repeated many times:

Dec 4 11:12:39 z2-3 kernel: [8349976.181008] block drbd16: Restarting receiver thread
Dec 4 11:12:39 z2-3 kernel: [8349976.181011] block drbd16: receiver (re)started
Dec 4 11:12:39 z2-3 kernel: [8349976.181014] block drbd16: conn( Unconnected -> WFConnection )
Dec 4 11:13:17 z2-3 kernel: [8350015.044022] block drbd16: Handshake successful: Agreed network protocol version 90
Dec 4 11:13:17 z2-3 kernel: [8350015.045118] block drbd16: Peer authenticated using 16 bytes of 'md5' HMAC
Dec 4 11:13:17 z2-3 kernel: [8350015.045124] block drbd16: conn( WFConnection -> WFReportParams )
Dec 4 11:13:17 z2-3 kernel: [8350015.045136] block drbd16: Starting asender thread (from drbd16_receiver [30799])
Dec 4 11:13:17 z2-3 kernel: [8350015.046092] block drbd16: data-integrity-alg: <not-used>
Dec 4 11:13:17 z2-3 kernel: [8350015.046215] block drbd16: drbd_sync_handshake:
Dec 4 11:13:17 z2-3 kernel: [8350015.046217] block drbd16: self 9922706A43335E97:E973E700CD85FF0F:8E697C7BA01FEA03:176B27FC60EE9351 bits:128 flags:0
Dec 4 11:13:17 z2-3 kernel: [8350015.046220] block drbd16: peer E973E700CD85FF0E:0000000000000000:8E697C7BA01FEA02:176B27FC60EE9351 bits:0 flags:0
Dec 4 11:13:17 z2-3 kernel: [8350015.046222] block drbd16: uuid_compare()=1 by rule 7
Dec 4 11:13:17 z2-3 kernel: [8350015.046397] block drbd16: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
Dec 4 11:13:18 z2-3 kernel: [8350015.091276] block drbd16: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
Dec 4 11:13:18 z2-3 kernel: [8350015.091283] block drbd16: Began resync as SyncSource (will sync 512 KB [128 bits set]).
Dec 4 11:13:18 z2-3 kernel: [8350015.156771] block drbd16: Resync done (total 1 sec; paused 0 sec; 512 K/sec)
Dec 4 11:13:18 z2-3 kernel: [8350015.156777] block drbd16: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Dec 4 12:32:39 z2-3 kernel: [8354776.164507] block drbd16: PingAck did not arrive in time.
Dec 4 12:32:39 z2-3 kernel: [8354776.164535] block drbd16: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec 4 12:32:39 z2-3 kernel: [8354776.164543] block drbd16: asender terminated
Dec 4 12:32:39 z2-3 kernel: [8354776.164546] block drbd16: short read expecting header on sock: r=-512
Dec 4 12:32:39 z2-3 kernel: [8354776.164548] block drbd16: Terminating asender thread
Dec 4 12:32:39 z2-3 kernel: [8354776.164560] block drbd16: Creating new current UUID
Dec 4 12:32:39 z2-3 kernel: [8354776.165268] block drbd16: Connection closed
Dec 4 12:32:39 z2-3 kernel: [8354776.165272] block drbd16: conn( NetworkFailure -> Unconnected )

Then finally:

Dec 4 12:34:19 z2-3 kernel: [8354876.960011] block drbd16: PingAck did not arrive in time.
Dec 4 12:34:19 z2-3 kernel: [8354876.960041] block drbd16: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec 4 12:34:19 z2-3 kernel: [8354876.960049] block drbd16: asender terminated
Dec 4 12:34:19 z2-3 kernel: [8354876.960052] block drbd16: Terminating asender thread
Dec 4 12:34:19 z2-3 kernel: [8354876.960063] block drbd16: short read expecting header on sock: r=-512
Dec 4 12:34:25 z2-3 kernel: [8354882.636014] block drbd16: md_sync_timer expired! Worker calls drbd_md_sync().
Dec 4 12:34:25 z2-3 kernel: [8354882.636437] block drbd16: md_sync_timer expired! Worker calls drbd_md_sync().
Dec 4 12:34:25 z2-3 kernel: [8354882.636439] block drbd16: md_sync_timer expired! Worker calls drbd_md_sync().
Dec 4 12:34:25 z2-3 kernel: [8354882.636441] block drbd16: md_sync_timer expired! Worker calls drbd_md_sync().
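
Since the same sequence repeats many times, a small filter that keeps only the state transitions can make the history easier to read. This is a minimal Python sketch, not a DRBD tool: the regular expressions and the names `transitions`, `TRANSITION` and `DEVICE` are assumptions based only on the log lines quoted in this mail.

```python
import re

# Assumed format, taken from the lines above: "conn( Connected -> NetworkFailure )".
TRANSITION = re.compile(r"\b(peer|conn|pdsk|disk|role)\( (\w+) -> (\w+) \)")
# Device name as it appears in "block drbd16: ...".
DEVICE = re.compile(r"block (drbd\d+):")
# Kernel timestamp in square brackets, e.g. "[8354776.164535]".
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")


def transitions(lines, device="drbd16"):
    """Yield (kernel_timestamp, field, old_state, new_state) for one device."""
    for line in lines:
        dev = DEVICE.search(line)
        if not dev or dev.group(1) != device:
            continue  # line belongs to another device or is not a drbd line
        stamp = STAMP.search(line)
        for field, old, new in TRANSITION.findall(line):
            yield (stamp.group(1) if stamp else None, field, old, new)


# Two lines copied from the log above.
sample = [
    "Dec 4 12:32:39 z2-3 kernel: [8354776.164535] block drbd16: "
    "peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) "
    "pdsk( UpToDate -> DUnknown )",
    "Dec 4 12:32:39 z2-3 kernel: [8354776.165272] block drbd16: "
    "conn( NetworkFailure -> Unconnected )",
]
result = list(transitions(sample))
for row in result:
    print(row)
```

Feeding it the full kernel log instead of `sample` would list every transition for drbd16 in order, which should show whether the last NetworkFailure was ever followed by Unconnected.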

Cheers,

Maxence

--
Maxence DUNNEWIND
Contact : maxence at dunnewind.net
Site : http://www.dunnewind.net
06 32 39 39 93
GPG : 18AE 61E4 D0B0 1C7C AAC9 E40D 4D39 68DB 0D2E B533