Hello,

I use drbd-0.7.7 with Redhat-7.3. The connection between the nodes of the cluster passes through a VPN link.

DRBD disconnects frequently, and sometimes it fails to reconnect automatically and transitions to the StandAlone state instead. It may be a bug. Here are the logs.

In particular, I'd like to know how I should interpret the 20-second delay between Unconnected --> WFConnection and WFConnection --> WFReportParams. Could a network outage cause this? If so, how?

Kind regards,

==============================================================
Log of master
==============================================================
Nov 3 19:57:19 kernel: drbd0: PingAck did not arrive in time.
Nov 3 19:57:19 kernel: drbd0: drbd0_asender [29742]: cstate Connected --> NetworkFailure
Nov 3 19:57:19 kernel: drbd0: asender terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate NetworkFailure --> BrokenPipe
Nov 3 19:57:19 kernel: drbd0: short read expecting header on sock: r=-512
Nov 3 19:57:19 kernel: drbd0: worker terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate BrokenPipe --> Unconnected
Nov 3 19:57:19 kernel: drbd0: Connection lost.
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate Unconnected --> WFConnection
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFConnection --> WFReportParams
Nov 3 19:57:39 kernel: drbd0: sock_sendmsg returned -104
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFReportParams --> BrokenPipe
Nov 3 19:57:39 kernel: drbd0: short sent HandShake size=80 sent=0
Nov 3 19:57:39 kernel: drbd0: Discarding network configuration.
Nov 3 19:57:39 kernel: drbd0: worker terminated
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate BrokenPipe --> Unconnected
Nov 3 19:57:39 kernel: drbd0: Connection lost.
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate Unconnected --> StandAlone

==============================================================
Log of slave
==============================================================
Nov 3 19:57:19 kernel: drbd0: meta connection shut down by peer.
Nov 3 19:57:19 kernel: drbd0: drbd0_asender [31505]: cstate Connected --> NetworkFailure
Nov 3 19:57:19 kernel: drbd0: asender terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [25347]: cstate NetworkFailure --> BrokenPipe
Nov 3 19:57:19 kernel: drbd0: short read expecting header on sock: r=-512
Nov 3 19:57:19 kernel: drbd0: worker terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [25347]: cstate BrokenPipe --> Unconnected
Nov 3 19:57:19 kernel: drbd0: Connection lost.
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [25347]: cstate Unconnected --> WFConnection

==============================================================
Configuration
==============================================================
global {
    minor-count 2;
    dialog-refresh 5; # 5 seconds
}

resource foo {
    protocol C;
    incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";

    startup {
        wfc-timeout 120;
        degr-wfc-timeout 0; # 2 minutes.
    }

    disk {
        on-io-error detach;
    }

    net {
        sndbuf-size 2M;
        timeout 100;
        connect-int 14;
        ping-int 14;
        on-disconnect reconnect;
    }

    syncer {
        rate 4M;
        group 1;
        al-extents 257;
    }

    on slave {
        device /dev/drbd0;
        disk /dev/cciss/c0d0p2;
        address 10.16.29.97:7788;
        meta-disk internal;
    }

    on master {
        device /dev/drbd0;
        disk /dev/cciss/c0d0p2;
        address 10.16.7.129:7788;
        meta-disk internal;
    }
}
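
For reference, the delay asked about can be measured directly from the kernel log. The following is only a rough sketch in Python, not part of DRBD: the regular expression, the function names parse_transitions and gaps, and the year (syslog lines carry none, so 2004 is an arbitrary assumption) are all made up for illustration. It reports how long the connection sat in each cstate before the next transition, using three lines taken from the master log.

```python
import re
from datetime import datetime

# Three cstate lines copied from the master log in this message.
LOG = """\
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate Unconnected --> WFConnection
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFConnection --> WFReportParams
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFReportParams --> BrokenPipe
"""

# Hypothetical pattern: syslog timestamp, then "cstate OLD --> NEW".
PATTERN = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) kernel: drbd\d+: .*cstate (\w+) --> (\w+)"
)

def parse_transitions(text, year=2004):
    """Return (timestamp, old_state, new_state) for every cstate line.

    The year is an assumption, since syslog timestamps do not include one.
    """
    transitions = []
    for line in text.splitlines():
        match = PATTERN.match(line)
        if match:
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            transitions.append((stamp, match.group(2), match.group(3)))
    return transitions

def gaps(transitions):
    """Return (state, next_state, seconds spent in state) for each step."""
    result = []
    for (t0, _, state), (t1, _, next_state) in zip(transitions, transitions[1:]):
        result.append((state, next_state, (t1 - t0).total_seconds()))
    return result

if __name__ == "__main__":
    for state, next_state, seconds in gaps(parse_transitions(LOG)):
        print(f"{state} -> {next_state}: {seconds:.0f} s")
```

On the excerpt above this shows 20 seconds spent in WFConnection before WFReportParams is reached, which is the delay in question. It only measures the gap and says nothing about its cause.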