Hi,
I'm a newbie flirting with DRBD.
If I unplug the ethernet cable and plug it in again (more than 10 seconds apart),
the connection springs back to life within a few seconds.
But if I do a "service network stop" and a "service network start" more than
10 seconds apart, it doesn't re-establish the connection. (The same
happens if I do an "ifdown eth0" and "ifup eth0".)
The connection parameters that I think control this behaviour are
ping-int and ping-timeout, which I've set to 10 and 5 respectively.
(The output of drbdadm dump from PC1 is given below. It is the same on PC2.)
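A side note on units, in case it matters: going by my reading of the drbd.conf man page (an assumption, please check it against the installed version), ping-int and connect-int are given in seconds, while ping-timeout and timeout are given in tenths of a second. A rough sketch of what the values from my net section would then mean; the helper name is made up for illustration:

```python
# Options that, as far as I understand the drbd.conf man page, use
# units of 0.1 s (assumption; the others are taken to be plain seconds).
TENTHS_OF_A_SECOND = {"timeout", "ping-timeout"}

# Values copied from the net section of the config in this post.
net_settings = {
    "timeout": 60,
    "connect-int": 10,
    "ping-int": 10,
    "ping-timeout": 5,
}


def to_seconds(settings):
    """Convert raw net-section values to seconds under the assumed units."""
    result = {}
    for name, value in settings.items():
        if name in TENTHS_OF_A_SECOND:
            result[name] = value / 10
        else:
            result[name] = float(value)
    return result


seconds = to_seconds(net_settings)
for name, value in seconds.items():
    print(f"{name}: {value:g} s")
```

If those units are right, ping-timeout 5 would be half a second rather than 5 seconds.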
The logs in /var/log/messages when I remove and connect ethernet:
Ethernet unplugged
==================
May 15 17:44:12 PC1 kernel: drbd0: PingAck did not arrive in time.
May 15 17:44:12 PC1 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 15 17:44:12 PC1 kernel: drbd0: Creating new current UUID
May 15 17:44:12 PC1 kernel: drbd0: asender terminated
May 15 17:44:12 PC1 kernel: drbd0: short read expecting header on sock: r=-512
May 15 17:44:12 PC1 kernel: drbd0: tl_clear()
May 15 17:44:12 PC1 kernel: drbd0: Connection closed
May 15 17:44:12 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:44:13 PC1 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 15 17:44:13 PC1 kernel: drbd0: receiver terminated
May 15 17:44:13 PC1 kernel: drbd0: receiver (re)started
Ethernet plugged
================
May 15 17:45:21 PC1 kernel: drbd0: conn( WFConnection -> WFReportParams )
May 15 17:45:21 PC1 kernel: drbd0: Handshake successful: DRBD Network Protocol version 86
May 15 17:45:21 PC1 kernel: drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
May 15 17:45:21 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:45:21 PC1 kernel: drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
May 15 17:45:21 PC1 kernel: drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
May 15 17:45:21 PC1 kernel: drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
May 15 17:45:21 PC1 kernel: drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
May 15 17:45:21 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:45:21 PC1 kernel: drbd0: aftr_isp( 0 -> 1 )
May 15 17:45:21 PC1 kernel: drbd0: aftr_isp( 1 -> 0 )
service network stop
====================
May 15 17:47:01 PC1 kernel: drbd0: PingAck did not arrive in time.
May 15 17:47:01 PC1 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 15 17:47:01 PC1 kernel: drbd0: Creating new current UUID
May 15 17:47:01 PC1 kernel: drbd0: asender terminated
May 15 17:47:01 PC1 kernel: drbd0: short read expecting header on sock: r=-512
May 15 17:47:01 PC1 kernel: drbd0: tl_clear()
May 15 17:47:01 PC1 kernel: drbd0: Connection closed
May 15 17:47:01 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:47:01 PC1 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 15 17:47:01 PC1 kernel: drbd0: receiver terminated
May 15 17:47:01 PC1 kernel: drbd0: receiver (re)started
May 15 17:47:01 PC1 kernel: drbd0: conn( Unconnected -> WFConnection )
service network start
=====================
No response in messages from DRBD.
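To make the state changes in the logs easier to follow, something like the script below could pull out just the conn( ... ) transitions. This is only a rough sketch: the sample lines are copied from the "service network stop" log in this post (with the wrapped lines joined), and the function name is made up for illustration.

```python
import re

# Matches the "conn( Old -> New )" part of a DRBD kernel log line.
CONN_RE = re.compile(r"conn\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def conn_transitions(lines):
    """Return (old, new) connection-state pairs in the order they were logged."""
    transitions = []
    for line in lines:
        match = CONN_RE.search(line)
        if match:
            transitions.append((match.group(1), match.group(2)))
    return transitions


# Sample lines from the "service network stop" section of this post.
sample = [
    "May 15 17:47:01 PC1 kernel: drbd0: peer( Secondary -> Unknown ) "
    "conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )",
    "May 15 17:47:01 PC1 kernel: drbd0: conn( NetworkFailure -> Unconnected )",
    "May 15 17:47:01 PC1 kernel: drbd0: conn( Unconnected -> WFConnection )",
]

transitions = conn_transitions(sample)
last_state = transitions[-1][1] if transitions else None
print(transitions)
print("last logged state:", last_state)
```

For the lines above, the last logged state is WFConnection, and nothing further shows up after "service network start".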
Output of drbdadm dump
======================
# /etc/drbd.conf
common {
    syncer {
        rate 10M;
    }
}
resource drbd0 {
    protocol C;
    on PC1 {
        device /dev/drbd0;
        disk /dev/mapper/vgroot-LogVol01;
        address 192.168.13.110:7788;
        meta-disk internal;
    }
    on PC2 {
        device /dev/drbd0;
        disk /dev/mapper/vgroot-LogVol01;
        address 192.168.13.222:7788;
        meta-disk internal;
    }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        ping-timeout 5;
        max-buffers 2048;
        max-epoch-size 2048;
        ko-count 2;
        after-sb-0pri discard-least-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    disk {
        on-io-error detach;
    }
    syncer {
        rate 10M;
        after drbd1;
        al-extents 257;
    }
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 15;
    }
    handlers {
        pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
        pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
        local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    }
}
resource drbd1 {
    protocol C;
    on PC1 {
        device /dev/drbd1;
        disk /dev/mapper/vgroot-LogVol00;
        address 192.168.13.110:7789;
        meta-disk internal;
    }
    on PC2 {
        device /dev/drbd1;
        disk /dev/mapper/vgroot-LogVol00;
        address 192.168.13.222:7789;
        meta-disk internal;
    }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        ping-timeout 5;
        max-buffers 2048;
        max-epoch-size 2048;
        after-sb-0pri discard-least-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
        ko-count 2;
    }
    disk {
        on-io-error detach;
    }
    syncer {
        rate 10M;
        al-extents 257;
    }
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 15;
    }
}
May Day!
--Regards
S.Balaji