I have a problem with network cards failing for resource 0 (r0). I thought it was the cheap network cards in both nodes, so I replaced them with Intel Pro/1000 Gb cards. The connection worked at first and the sync finished without a problem. Then, after a few days, the connection went back to Primary/Unknown. I can't get a ping through on that interface either.

When I replaced the network cards I moved things around, so the network cards for r0 are now in a different PCI slot.

Any ideas on what may be going on here? Is this a hardware issue? If so, any suggestions on a PCI network card to use?

Thanks,
Tom

/var/log/syslog:

Aug 2 12:35:14 zan kernel: drbd0: PingAck did not arrive in time.
Aug 2 12:35:14 zan kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Aug 2 12:35:14 zan kernel: drbd0: Creating new current UUID
Aug 2 12:35:14 zan kernel: drbd0: asender terminated
Aug 2 12:35:14 zan kernel: drbd0: short read expecting header on sock: r=-512
Aug 2 12:35:14 zan kernel: drbd0: tl_clear()
Aug 2 12:35:14 zan kernel: drbd0: Connection closed
Aug 2 12:35:14 zan kernel: drbd0: Writing meta data super block now.
Aug 2 12:35:14 zan kernel: drbd0: conn( NetworkFailure -> Unconnected )
Aug 2 12:35:14 zan kernel: drbd0: receiver terminated
Aug 2 12:35:14 zan kernel: drbd0: receiver (re)started
Aug 2 12:35:14 zan kernel: drbd0: conn( Unconnected -> WFConnection )

/proc/drbd:

version: 8.0rc1 (api:86/proto:85)
SVN Revision: 2644 build by tbrown at zan, 2007-01-05 08:49:02
 0: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
    ns:80300280 nr:0 dw:39778392 dr:162465666 al:20779 bm:6430 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:2980618 misses:3679 starving:0 dirty:0 changed:3679
        act_log: used:0/257 hits:9923819 misses:22409 starving:0 dirty:1630 changed:20779
 1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:500 nr:0 dw:280 dr:3844 al:1 bm:2 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:18 misses:2 starving:0 dirty:0 changed:2
        act_log: used:0/257 hits:69 misses:1 starving:0 dirty:0 changed:1
 2: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:18336216 nr:0 dw:18318300 dr:266646710 al:55977 bm:659 lo:0 pe:0 ua:0 ap:0
        resync: used:0/31 hits:6935 misses:659 starving:0 dirty:0 changed:659
        act_log: used:0/257 hits:4523598 misses:65168 starving:0 dirty:9191 changed:55977

/etc/drbd.conf:

global {
  usage-count yes;
}

common {
  syncer { rate 25M; }
}

resource r0 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 20;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    al-extents 257;
  }
  on zan {
    device /dev/drbd0;
    disk /dev/hdd1;
    address 192.168.1.3:7788;
    meta-disk /dev/hdc1 [0];
  }
  on jayna {
    device /dev/drbd0;
    disk /dev/hdd1;
    address 192.168.1.4:7788;
    meta-disk /dev/hdc1 [0];
  }
}

resource r1 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 20;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    after "r0";
    al-extents 257;
  }
  on zan {
    device /dev/drbd1;
    disk /dev/hdd2;
    address 192.168.2.3:7789;
    meta-disk /dev/hdc2 [0];
  }
  on jayna {
    device /dev/drbd1;
    disk /dev/hdd2;
    address 192.168.2.4:7789;
    meta-disk /dev/hdc2 [0];
  }
}

resource r2 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
    pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/sbin/drbd-peer-outdater";
  }
  startup {
    wfc-timeout 20;
    degr-wfc-timeout 120;
  }
  disk {
    on-io-error detach;
  }
  net {
    after-sb-0pri disconnect;
    after-sb-1pri disconnect;
    after-sb-2pri disconnect;
    rr-conflict disconnect;
  }
  syncer {
    al-extents 257;
  }
  on zan {
    device /dev/drbd2;
    disk /dev/hdc4;
    address 192.168.3.3:7790;
    meta-disk /dev/hdc3 [0];
  }
  on jayna {
    device /dev/drbd2;
    disk /dev/hdc4;
    address 192.168.3.4:7790;
    meta-disk /dev/hdc3 [0];
  }
}
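
For anyone wanting to catch this kind of drop sooner, here is a rough Python sketch that reads the per-device status lines from /proc/drbd text in the 8.0-style format pasted above and lists the devices whose connection state is not Connected. The names parse_drbd_status and not_connected are made up for illustration and are not part of DRBD; the sample string is just the three status lines from this post. Note that a device in the middle of a resync would also be reported, since its cs field is not literally "Connected".

```python
import re

# Status lines copied from the /proc/drbd output in this post.
SAMPLE = """\
 0: cs:WFConnection st:Primary/Unknown ds:UpToDate/DUnknown C r---
 1: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
 2: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
"""

# Matches "<minor>: cs:<state> st:<roles> ds:<disk states>" anywhere in the text.
# This pattern is an assumption based only on the output shown in this post.
STATUS_RE = re.compile(r"\b(\d+):\s+cs:(\S+)\s+st:(\S+)\s+ds:(\S+)")


def parse_drbd_status(text):
    """Return one dict per device status line found in the text."""
    devices = []
    for match in STATUS_RE.finditer(text):
        devices.append({
            "minor": int(match.group(1)),
            "cs": match.group(2),
            "st": match.group(3),
            "ds": match.group(4),
        })
    return devices


def not_connected(devices):
    """Return the minor numbers whose connection state is not 'Connected'."""
    return [d["minor"] for d in devices if d["cs"] != "Connected"]


if __name__ == "__main__":
    for minor in not_connected(parse_drbd_status(SAMPLE)):
        print("drbd%d is not connected" % minor)
```

On a live node the same functions could be fed the contents of /proc/drbd instead of SAMPLE, for example from a cron job.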