I have a problem with the network cards for resource 0 (r0) failing. I first
suspected the cheap network cards in both nodes, so I replaced them with Intel
Pro/1000 gigabit cards. The connection worked at first and the sync finished
without a problem. Then, after a few days, the connection went back to
Primary/Unknown. I can't get a ping through on that interface either. When I
replaced the network cards I moved things around, so the cards for r0 are now
in a different PCI slot. Any ideas on what may be going on here? Is this a
hardware issue? If so, any suggestions on a PCI network card to use?
Thanks,
Tom
/var/log/syslog:
Aug 2 12:35:14 zan kernel: drbd0: PingAck did not arrive in time.
Aug 2 12:35:14 zan kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Aug 2 12:35:14 zan kernel: drbd0: Creating new current UUID
Aug 2 12:35:14 zan kernel: drbd0: asender terminated
Aug 2 12:35:14 zan kernel: drbd0: short read expecting header on sock: r=-512
Aug 2 12:35:14 zan kernel: drbd0: tl_clear()
Aug 2 12:35:14 zan kernel: drbd0: Connection closed
Aug 2 12:35:14 zan kernel: drbd0: Writing meta data super block now.
Aug 2 12:35:14 zan kernel: drbd0: conn( NetworkFailure -> Unconnected )
Aug 2 12:35:14 zan kernel: drbd0: receiver terminated
Aug 2 12:35:14 zan kernel: drbd0: receiver (re)started
Aug 2 12:35:14 zan kernel: drbd0: conn( Unconnected -> WFConnection )
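To make the sequence of state changes in the log above easier to follow, here is a minimal sketch that pulls the `field( old -> new )` transitions out of such lines. The regular expression and the function name `transitions` are my own assumptions, based only on the excerpt shown here; other DRBD versions may log these differently.

```python
import re

# Matches transitions such as "conn( Connected -> NetworkFailure )" as they
# appear in the log excerpt; the pattern is an assumption based on that excerpt.
TRANSITION = re.compile(r"(\w+)\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def transitions(log_text):
    """Return (field, old_state, new_state) tuples in order of appearance."""
    return [m.groups() for m in TRANSITION.finditer(log_text)]


# A few lines copied from the syslog excerpt.
sample = """\
Aug 2 12:35:14 zan kernel: drbd0: peer( Secondary -> Unknown )
conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Aug 2 12:35:14 zan kernel: drbd0: conn( NetworkFailure -> Unconnected )
Aug 2 12:35:14 zan kernel: drbd0: conn( Unconnected -> WFConnection )
"""

for field, old, new in transitions(sample):
    print(f"{field}: {old} -> {new}")
```

Read this way, the connection state goes Connected, NetworkFailure, Unconnected, WFConnection within the same second, right after the missed PingAck.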
/proc/drbd:
version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at zan, 2007-01-05 08:49:02
0: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
    ns:80300280 nr:0 dw:39778392 dr:162465666 al:20779 bm:6430 lo:0 pe:0 ua:0 ap:0
    resync: used:0/31 hits:2980618 misses:3679 starving:0 dirty:0 changed:3679
    act_log: used:0/257 hits:9923819 misses:22409 starving:0 dirty:1630 changed:20779
1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:500 nr:0 dw:280 dr:3844 al:1 bm:2 lo:0 pe:0 ua:0 ap:0
    resync: used:0/31 hits:18 misses:2 starving:0 dirty:0 changed:2
    act_log: used:0/257 hits:69 misses:1 starving:0 dirty:0 changed:1
2: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:18336216 nr:0 dw:18318300 dr:266646710 al:55977 bm:659 lo:0 pe:0 ua:0 ap:0
    resync: used:0/31 hits:6935 misses:659 starving:0 dirty:0 changed:659
    act_log: used:0/257 hits:4523598 misses:65168 starving:0 dirty:9191 changed:55977
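For spotting which device has lost its peer, a small sketch like the following could scan the `/proc/drbd` status lines and list every minor whose connection state is not `Connected`. The pattern assumes the `cs:`/`st:`/`ds:` layout shown in the output above (DRBD 8.0rc1); other versions may print different fields, and the function name `disconnected` is only illustrative.

```python
import re

# Status line layout as seen in the output above, e.g.
# "0: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---".
# This layout is an assumption tied to that sample.
STATUS = re.compile(
    r"^\s*(\d+):\s+cs:(\S+)\s+st:([^/\s]+)/([^/\s]+)\s+ds:([^/\s]+)/([^/\s]+)",
    re.MULTILINE,
)


def disconnected(proc_drbd_text):
    """Return the minor numbers whose connection state is not 'Connected'."""
    return [
        int(m.group(1))
        for m in STATUS.finditer(proc_drbd_text)
        if m.group(2) != "Connected"
    ]


# Status lines copied from the /proc/drbd output in this post.
sample = """\
0: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
2: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
"""

print(disconnected(sample))
```

On a live node the same function could be fed the contents of `/proc/drbd` instead of the pasted sample.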
/etc/drbd.conf:
global {
  usage-count yes;
}
common {
  syncer { rate 25M; }
}
resource r0 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 20;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    al-extents 257;
  }
  on zan {
    device /dev/drbd0;
    disk /dev/hdd1;
    address 192.168.1.3:7788;
    meta-disk /dev/hdc1 [0];
  }
  on jayna {
    device /dev/drbd0;
    disk /dev/hdd1;
    address 192.168.1.4:7788;
    meta-disk /dev/hdc1 [0];
  }
}
resource r1 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 20;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    after "r0";
    al-extents 257;
  }
  on zan {
    device /dev/drbd1;
    disk /dev/hdd2;
    address 192.168.2.3:7789;
    meta-disk /dev/hdc2 [0];
  }
  on jayna {
    device /dev/drbd1;
    disk /dev/hdd2;
    address 192.168.2.4:7789;
    meta-disk /dev/hdc2 [0];
  }
}
resource r2 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 20;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    al-extents 257;
  }
  on zan {
    device /dev/drbd2;
    disk /dev/hdc4;
    address 192.168.3.3:7790;
    meta-disk /dev/hdc3 [0];
  }
  on jayna {
    device /dev/drbd2;
    disk /dev/hdc4;
    address 192.168.3.4:7790;
    meta-disk /dev/hdc3 [0];
  }
}