Hello,
I use drbd-0.7.7 on Red Hat 7.3. The connection between the cluster
nodes passes through a VPN link.
I see frequent DRBD disconnections, and sometimes DRBD fails to
reconnect automatically and transitions to the StandAlone state instead.
This may be a bug.
Here are the logs. In particular, I'd like to know how to interpret
the 20-second delays between Unconnected --> WFConnection and
WFConnection --> WFReportParams. Could a network outage cause this? If
so, how?
Kind regards,
==============================================================
Log of master
==============================================================
Nov 3 19:57:19 kernel: drbd0: PingAck did not arrive in time.
Nov 3 19:57:19 kernel: drbd0: drbd0_asender [29742]: cstate Connected --> NetworkFailure
Nov 3 19:57:19 kernel: drbd0: asender terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate NetworkFailure --> BrokenPipe
Nov 3 19:57:19 kernel: drbd0: short read expecting header on sock: r=-512
Nov 3 19:57:19 kernel: drbd0: worker terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate BrokenPipe --> Unconnected
Nov 3 19:57:19 kernel: drbd0: Connection lost.
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate Unconnected --> WFConnection
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFConnection --> WFReportParams
Nov 3 19:57:39 kernel: drbd0: sock_sendmsg returned -104
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFReportParams --> BrokenPipe
Nov 3 19:57:39 kernel: drbd0: short sent HandShake size=80 sent=0
Nov 3 19:57:39 kernel: drbd0: Discarding network configuration.
Nov 3 19:57:39 kernel: drbd0: worker terminated
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate BrokenPipe --> Unconnected
Nov 3 19:57:39 kernel: drbd0: Connection lost.
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate Unconnected --> StandAlone
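
To make the delays in the master log above easier to see, here is a minimal
Python sketch that extracts the cstate transitions and reports how long each
state was held. The excerpt is copied from the log with wrapped lines joined.
The names LOG, TRANSITION, transitions and delays are illustrative only, and
the year is an assumption because syslog timestamps do not carry one.

```python
import re
from datetime import datetime

# Excerpt copied from the master log (wrapped lines joined).
LOG = """\
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [1594]: cstate Unconnected --> WFConnection
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFConnection --> WFReportParams
Nov 3 19:57:39 kernel: drbd0: sock_sendmsg returned -104
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate WFReportParams --> BrokenPipe
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate BrokenPipe --> Unconnected
Nov 3 19:57:39 kernel: drbd0: drbd0_receiver [1594]: cstate Unconnected --> StandAlone
"""

# Syslog timestamp, then "cstate OLD --> NEW" somewhere later on the same line.
TRANSITION = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+kernel:.*\bcstate\s+(\w+)\s+-->\s+(\w+)"
)


def transitions(log_text, year=2000):
    """Return (timestamp, old_state, new_state) for every cstate line.

    The year is an arbitrary placeholder (assumption): only differences
    between timestamps are used.
    """
    found = []
    for line in log_text.splitlines():
        match = TRANSITION.search(line)
        if match:
            stamp = datetime.strptime(
                f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S"
            )
            found.append((stamp, match.group(2), match.group(3)))
    return found


def delays(log_text):
    """Seconds between each transition and the one logged before it."""
    trans = transitions(log_text)
    return [
        (old, new, (stamp - prev_stamp).total_seconds())
        for (prev_stamp, _, _), (stamp, old, new) in zip(trans, trans[1:])
    ]


if __name__ == "__main__":
    for old, new, seconds in delays(LOG):
        print(f"{old} --> {new}: {seconds:.0f} s after the previous transition")
```

On this excerpt the only non-zero gap is the 20 seconds spent in WFConnection
before the move to WFReportParams; every later transition is logged within
the same second.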
==============================================================
Log of slave
==============================================================
Nov 3 19:57:19 kernel: drbd0: meta connection shut down by peer.
Nov 3 19:57:19 kernel: drbd0: drbd0_asender [31505]: cstate Connected --> NetworkFailure
Nov 3 19:57:19 kernel: drbd0: asender terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [25347]: cstate NetworkFailure --> BrokenPipe
Nov 3 19:57:19 kernel: drbd0: short read expecting header on sock: r=-512
Nov 3 19:57:19 kernel: drbd0: worker terminated
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [25347]: cstate BrokenPipe --> Unconnected
Nov 3 19:57:19 kernel: drbd0: Connection lost.
Nov 3 19:57:19 kernel: drbd0: drbd0_receiver [25347]: cstate Unconnected --> WFConnection
==============================================================
Configuration
==============================================================
global {
  minor-count 2;
  dialog-refresh 5; # 5 seconds
}
resource foo {
  protocol C;
  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
  startup {
    wfc-timeout 120;
    degr-wfc-timeout 0; # 2 minutes.
  }
  disk {
    on-io-error detach;
  }
  net {
    sndbuf-size 2M;
    timeout 100;
    connect-int 14;
    ping-int 14;
    on-disconnect reconnect;
  }
  syncer {
    rate 4M;
    group 1;
    al-extents 257;
  }
  on slave {
    device /dev/drbd0;
    disk /dev/cciss/c0d0p2;
    address 10.16.29.97:7788;
    meta-disk internal;
  }
  on master {
    device /dev/drbd0;
    disk /dev/cciss/c0d0p2;
    address 10.16.7.129:7788;
    meta-disk internal;
  }
}
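
For reference when reading the net section above, here is a small Python
sketch that converts the three timers to seconds. It assumes the units given
in the drbd.conf man page as I read it: timeout is in tenths of a second,
connect-int and ping-int are in seconds, and timeout must be lower than both.
The names NET, net_seconds and timeout_is_consistent are illustrative only.

```python
# Values copied from the net section of the configuration.
NET = {"timeout": 100, "connect-int": 14, "ping-int": 14}


def net_seconds(net):
    """Convert the net timers to seconds.

    Assumption (from the drbd.conf man page): 'timeout' is expressed in
    tenths of a second, 'connect-int' and 'ping-int' in whole seconds.
    """
    return {
        "timeout": net["timeout"] / 10.0,
        "connect-int": float(net["connect-int"]),
        "ping-int": float(net["ping-int"]),
    }


def timeout_is_consistent(net):
    """Check that timeout is lower than both connect-int and ping-int."""
    seconds = net_seconds(net)
    return (
        seconds["timeout"] < seconds["connect-int"]
        and seconds["timeout"] < seconds["ping-int"]
    )


if __name__ == "__main__":
    for name, value in net_seconds(NET).items():
        print(f"{name}: {value:g} s")
    print("timeout below both intervals:", timeout_is_consistent(NET))
```

With these values the timeout works out to 10 seconds against 14-second
connect and ping intervals, so the ordering constraint is met. This sketch
does not by itself explain the 20-second gap in the master log.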