Hi,

I'm a newbie flirting with DRBD. If I unplug the Ethernet cable and plug it in again (more than 10 seconds apart), the connection springs back to life within a few seconds. But if I do a "service network stop" and a "service network start" more than 10 seconds apart, it doesn't re-establish the connection. (The same happens if I do "ifdown eth0" and "ifup eth0".)

The connection parameters that I think control this behaviour are ping-int and ping-timeout, which I've set to 10 and 5 respectively. (The output of drbdadm dump from PC1 is given below. It is the same on PC2.)

The logs in /var/log/messages when I remove and reconnect Ethernet:

Ethernet unplugged
==================
May 15 17:44:12 PC1 kernel: drbd0: PingAck did not arrive in time.
May 15 17:44:12 PC1 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 15 17:44:12 PC1 kernel: drbd0: Creating new current UUID
May 15 17:44:12 PC1 kernel: drbd0: asender terminated
May 15 17:44:12 PC1 kernel: drbd0: short read expecting header on sock: r=-512
May 15 17:44:12 PC1 kernel: drbd0: tl_clear()
May 15 17:44:12 PC1 kernel: drbd0: Connection closed
May 15 17:44:12 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:44:13 PC1 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 15 17:44:13 PC1 kernel: drbd0: receiver terminated
May 15 17:44:13 PC1 kernel: drbd0: receiver (re)started

Ethernet plugged
================
May 15 17:45:21 PC1 kernel: drbd0: conn( WFConnection -> WFReportParams )
May 15 17:45:21 PC1 kernel: drbd0: Handshake successful: DRBD Network Protocol version 86
May 15 17:45:21 PC1 kernel: drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> UpToDate )
May 15 17:45:21 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:45:21 PC1 kernel: drbd0: conn( WFBitMapS -> SyncSource ) pdsk( UpToDate -> Inconsistent )
May 15 17:45:21 PC1 kernel: drbd0: Began resync as SyncSource (will sync 0 KB [0 bits set]).
May 15 17:45:21 PC1 kernel: drbd0: Resync done (total 1 sec; paused 0 sec; 0 K/sec)
May 15 17:45:21 PC1 kernel: drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
May 15 17:45:21 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:45:21 PC1 kernel: drbd0: aftr_isp( 0 -> 1 )
May 15 17:45:21 PC1 kernel: drbd0: aftr_isp( 1 -> 0 )

service network stop
====================
May 15 17:47:01 PC1 kernel: drbd0: PingAck did not arrive in time.
May 15 17:47:01 PC1 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 15 17:47:01 PC1 kernel: drbd0: Creating new current UUID
May 15 17:47:01 PC1 kernel: drbd0: asender terminated
May 15 17:47:01 PC1 kernel: drbd0: short read expecting header on sock: r=-512
May 15 17:47:01 PC1 kernel: drbd0: tl_clear()
May 15 17:47:01 PC1 kernel: drbd0: Connection closed
May 15 17:47:01 PC1 kernel: drbd0: Writing meta data super block now.
May 15 17:47:01 PC1 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 15 17:47:01 PC1 kernel: drbd0: receiver terminated
May 15 17:47:01 PC1 kernel: drbd0: receiver (re)started
May 15 17:47:01 PC1 kernel: drbd0: conn( Unconnected -> WFConnection )

service network start
=====================
No response in messages from DRBD.
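
To compare the two cases side by side, the conn( ... ) transitions can be pulled out of /var/log/messages. The following is only a minimal Python sketch, assuming the kernel log format shown above; the function names conn_transitions and last_conn_state are hypothetical helpers for illustration and are not part of DRBD.

```python
import re

# Matches connection-state changes as they appear in the logs above,
# e.g. "conn( Connected -> NetworkFailure )". Assumes this exact log style.
CONN_RE = re.compile(r"conn\(\s*(\w+)\s*->\s*(\w+)\s*\)")


def conn_transitions(log_text):
    """Return all (old_state, new_state) pairs in the order they were logged."""
    return [(m.group(1), m.group(2)) for m in CONN_RE.finditer(log_text)]


def last_conn_state(log_text):
    """Return the most recent connection state, or None if none was logged."""
    transitions = conn_transitions(log_text)
    return transitions[-1][1] if transitions else None


# Three lines taken from the "service network stop" log above.
sample = """\
May 15 17:47:01 PC1 kernel: drbd0: peer( Secondary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
May 15 17:47:01 PC1 kernel: drbd0: conn( NetworkFailure -> Unconnected )
May 15 17:47:01 PC1 kernel: drbd0: conn( Unconnected -> WFConnection )
"""

for old, new in conn_transitions(sample):
    print(old, "->", new)
print("last state:", last_conn_state(sample))
```

Run against the "service network stop" log, this reports WFConnection as the last logged state, so the node appears to be waiting for a connection when the network is brought back up.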

Dump of drbdadm dump
====================
# /etc/drbd.conf
common {
    syncer {
        rate 10M;
    }
}

resource drbd0 {
    protocol C;
    on PC1 {
        device /dev/drbd0;
        disk /dev/mapper/vgroot-LogVol01;
        address 192.168.13.110:7788;
        meta-disk internal;
    }
    on PC2 {
        device /dev/drbd0;
        disk /dev/mapper/vgroot-LogVol01;
        address 192.168.13.222:7788;
        meta-disk internal;
    }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        ping-timeout 5;
        max-buffers 2048;
        max-epoch-size 2048;
        ko-count 2;
        after-sb-0pri discard-least-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    disk {
        on-io-error detach;
    }
    syncer {
        rate 10M;
        after drbd1;
        al-extents 257;
    }
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 15;
    }
    handlers {
        pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
        pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
        local-io-error "echo O > /proc/sysrq-trigger ; halt -f";
    }
}

resource drbd1 {
    protocol C;
    on PC1 {
        device /dev/drbd1;
        disk /dev/mapper/vgroot-LogVol00;
        address 192.168.13.110:7789;
        meta-disk internal;
    }
    on PC2 {
        device /dev/drbd1;
        disk /dev/mapper/vgroot-LogVol00;
        address 192.168.13.222:7789;
        meta-disk internal;
    }
    net {
        timeout 60;
        connect-int 10;
        ping-int 10;
        ping-timeout 5;
        max-buffers 2048;
        max-epoch-size 2048;
        after-sb-0pri discard-least-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
        ko-count 2;
    }
    disk {
        on-io-error detach;
    }
    syncer {
        rate 10M;
        al-extents 257;
    }
    startup {
        wfc-timeout 15;
        degr-wfc-timeout 15;
    }
}

May Day!

--
Regards
S.Balaji