OK, I upgraded both my blades to the latest stable kernel, 2.6.24, rebuilt drbd 8.0.8, and restarted the sync. Offhand, I noticed that the connection reached the sync state more quickly than before:

Apr 4 15:49:35 mailserv1 drbd0: conn( Connected -> WFBitMapS )
Apr 4 15:50:18 mailserv1 drbd0: conn( WFBitMapS -> SyncSource )

Normally it used to take 5 or more minutes. But just as that was quick, so was the "network failure"; see below:

Apr 4 15:49:35 mailserv1 drbd0: Writing meta data super block now.
Apr 4 15:49:35 mailserv1 drbd0: Becoming sync source due to disk states.
Apr 4 15:49:35 mailserv1 drbd0: Writing meta data super block now.
Apr 4 15:49:35 mailserv1 drbd0: writing of bitmap took 7 jiffies
Apr 4 15:49:35 mailserv1 drbd0: 476 GB (124997941 bits) marked out-of-sync by on disk bit-map.
Apr 4 15:49:35 mailserv1 drbd0: Writing meta data super block now.
Apr 4 15:49:35 mailserv1 drbd0: conn( Connected -> WFBitMapS )
Apr 4 15:50:18 mailserv1 drbd0: conn( WFBitMapS -> SyncSource )
Apr 4 15:50:18 mailserv1 drbd0: Began resync as SyncSource (will sync 499991764 KB [124997941 bits set]).
Apr 4 15:50:18 mailserv1 drbd0: Writing meta data super block now.
Apr 4 16:03:26 mailserv1 drbd0: PingAck did not arrive in time.
Apr 4 16:03:26 mailserv1 drbd0: peer( Secondary -> Unknown ) conn( SyncSource -> NetworkFailure )
Apr 4 16:03:26 mailserv1 drbd0: asender terminated
Apr 4 16:03:26 mailserv1 drbd0: drbd_pp_alloc interrupted!
Apr 4 16:03:26 mailserv1 drbd0: alloc_ee: Allocation of a page failed
Apr 4 16:03:26 mailserv1 drbd0: error receiving RSDataRequest, l: 24!
Apr 4 16:03:26 mailserv1 drbd0: tl_clear()
Apr 4 16:03:26 mailserv1 drbd0: Connection closed
Apr 4 16:03:26 mailserv1 drbd0: Writing meta data super block now.
Apr 4 16:03:26 mailserv1 drbd0: conn( NetworkFailure -> Unconnected )
Apr 4 16:03:26 mailserv1 drbd0: receiver terminated
Apr 4 16:03:26 mailserv1 drbd0: receiver (re)started
Apr 4 16:03:26 mailserv1 drbd0: conn( Unconnected -> WFConnection )
Apr 4 16:03:26 mailserv1 drbd0: Handshake successful: DRBD Network Protocol version 86
Apr 4 16:03:26 mailserv1 drbd0: Peer authenticated using 32 bytes of 'sha256' HMAC
Apr 4 16:03:26 mailserv1 drbd0: conn( WFConnection -> WFReportParams )
Apr 4 16:03:26 mailserv1 drbd0: Becoming sync source due to disk states.
Apr 4 16:03:26 mailserv1 drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS )
Apr 4 16:03:30 mailserv1 drbd0: Writing meta data super block now.
Apr 4 16:04:08 mailserv1 drbd0: conn( WFBitMapS -> SyncSource )
Apr 4 16:04:08 mailserv1 drbd0: Began resync as SyncSource (will sync 497736404 KB [124434101 bits set]).
Apr 4 16:04:08 mailserv1 drbd0: Writing meta data super block now.

---

I don't know if this will help, but this is the 'dmesg | grep eth0' output:

eth0: Broadcom NetXtreme II BCM5708 1000Base-SX (B2) PCI-X 64-bit 133MHz found at mem dc000000, IRQ 17, node addr 00:1a:64:8c:91:e6
bnx2: eth0: using MSI
ADDRCONF(NETDEV_UP): eth0: link is not ready
bnx2: eth0 NIC SerDes Link is Up, 1000 Mbps full duplex
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
eth0: no IPv6 routers present

I have "ping-timeout 20" and "rate 10M" in my drbd.conf file.

Thanks
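For what it's worth, a rough back-of-the-envelope check of how much actually got synced before the failure can be made from the two "Began resync" lines and the PingAck timestamp in the log above. This is only a sketch: the variable and function names are made up for illustration, and it assumes no new writes marked additional blocks out-of-sync in between, so the real figure may differ somewhat.

```python
from datetime import datetime

TIME_FMT = "%H:%M:%S"  # only the time of day is needed; both events are on Apr 4


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two same-day HH:MM:SS timestamps."""
    delta = datetime.strptime(end, TIME_FMT) - datetime.strptime(start, TIME_FMT)
    return delta.total_seconds()


# Figures copied from the two "Began resync as SyncSource" log lines.
kb_to_sync_first = 499991764   # at 15:50:18
kb_to_sync_second = 497736404  # after the reconnect, at 16:04:08

# Resync started at 15:50:18; PingAck was missed at 16:03:26.
synced_kb = kb_to_sync_first - kb_to_sync_second
seconds = elapsed_seconds("15:50:18", "16:03:26")
rate_kb_s = synced_kb / seconds

print(f"synced {synced_kb} KB in {seconds:.0f} s, about {rate_kb_s:.0f} KB/s")
```

Under those assumptions this comes out to about 2.2 GB in roughly 13 minutes, i.e. somewhat under 3 MB/s, which is well below the "rate 10M" I have configured.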