[DRBD-user] DRBD ping-timeout values

George H george.dma at gmail.com
Fri Apr 4 10:26:34 CEST 2008


Hi,

I'm using DRBD v8.0.8 with kernel 2.6.23 and I am doing an initial
sync of around 500GB.
It consistently fails with "PingAck did not arrive in time." after it
reaches 2% of progress. I re-ran the sync and pinged the other node
separately the whole time. The ping replies all came back at about 0.200 ms.

Below are the DRBD logs. It took around 45 minutes before a PingAck
was missed.

Apr  4 09:02:05 mailserv1 drbd0: Peer authenticated using 32 bytes of
'sha256' HMAC
Apr  4 09:02:05 mailserv1 drbd0: conn( WFConnection -> WFReportParams )
Apr  4 09:02:05 mailserv1 drbd0: Becoming sync source due to disk states.
Apr  4 09:02:05 mailserv1 drbd0: peer( Unknown -> Secondary ) conn(
WFReportParams -> WFBitMapS ) pdsk( Outdated -> Inconsistent )
Apr  4 09:02:09 mailserv1 drbd0: Writing meta data super block now.
Apr  4 09:08:16 mailserv1 drbd0: role( Secondary -> Primary )
Apr  4 09:08:16 mailserv1 drbd0: Writing meta data super block now.
Apr  4 09:08:18 mailserv1 drbd0: conn( WFBitMapS -> SyncSource )
Apr  4 09:08:18 mailserv1 drbd0: Began resync as SyncSource (will sync
453330132 KB [113332533 bits set]).
Apr  4 09:08:18 mailserv1 drbd0: Writing meta data super block now.
Apr  4 09:53:56 mailserv1 drbd0: PingAck did not arrive in time.
Apr  4 09:53:56 mailserv1 drbd0: peer( Secondary -> Unknown ) conn(
SyncSource -> NetworkFailure )
Apr  4 09:53:56 mailserv1 drbd0: asender terminated
Apr  4 09:53:56 mailserv1 drbd0: drbd_pp_alloc interrupted!
Apr  4 09:53:56 mailserv1 drbd0: alloc_ee: Allocation of a page failed
Apr  4 09:53:56 mailserv1 drbd0: error receiving RSDataRequest, l: 24!
Apr  4 09:53:56 mailserv1 drbd0: tl_clear()
Apr  4 09:53:56 mailserv1 drbd0: Connection closed
Apr  4 09:53:56 mailserv1 drbd0: Writing meta data super block now.
Apr  4 09:53:56 mailserv1 drbd0: conn( NetworkFailure -> Unconnected )
Apr  4 09:53:56 mailserv1 drbd0: receiver terminated
Apr  4 09:53:56 mailserv1 drbd0: receiver (re)started
Apr  4 09:53:56 mailserv1 drbd0: conn( Unconnected -> WFConnection )
Apr  4 09:53:59 mailserv1 drbd0: Handshake successful: DRBD Network
Protocol version 86
Apr  4 09:53:59 mailserv1 drbd0: Peer authenticated using 32 bytes of
'sha256' HMAC
Apr  4 09:53:59 mailserv1 drbd0: conn( WFConnection -> WFReportParams )
Apr  4 09:53:59 mailserv1 drbd0: Becoming sync source due to disk states.
Apr  4 09:53:59 mailserv1 drbd0: peer( Unknown -> Secondary ) conn(
WFReportParams -> WFBitMapS )
Apr  4 09:54:03 mailserv1 drbd0: Writing meta data super block now.
Apr  4 10:00:14 mailserv1 drbd0: conn( WFBitMapS -> SyncSource )
Apr  4 10:00:14 mailserv1 drbd0: Began resync as SyncSource (will sync
445001204 KB [111250301 bits set]).
Apr  4 10:00:14 mailserv1 drbd0: Writing meta data super block now.
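
As a rough cross-check of the "around 45 minutes" and "2%" figures, the
numbers in the log above can be worked through in a few lines of Python.
This is only a sketch: the variable names are made up for illustration,
and it assumes that the drop in the "will sync ... KB" value between the
two resync attempts equals the amount actually synced during the first
attempt.

```python
from datetime import datetime

# Values copied by hand from the log above (first resync attempt).
resync_start = datetime.strptime("09:08:18", "%H:%M:%S")  # "Began resync as SyncSource"
ping_failure = datetime.strptime("09:53:56", "%H:%M:%S")  # "PingAck did not arrive in time."

# "will sync N KB" as reported at the start of each attempt.
kb_first_attempt = 453330132   # reported at 09:08:18
kb_second_attempt = 445001204  # reported at 10:00:14

# Time from resync start until the PingAck was missed.
elapsed_s = (ping_failure - resync_start).total_seconds()

# Assumption: the difference between the two reports is what got synced.
synced_kb = kb_first_attempt - kb_second_attempt
percent_done = 100.0 * synced_kb / kb_first_attempt
rate_kb_s = synced_kb / elapsed_s

print(f"elapsed before failure: {elapsed_s / 60:.1f} min")
print(f"synced: {synced_kb} KB ({percent_done:.2f}% of the first attempt)")
print(f"average resync rate: {rate_kb_s:.0f} KB/s")
```

Under that assumption the first attempt ran for a little under 46
minutes and covered a bit less than 2% of the data, which matches what
I saw, at an average of roughly 3 MB/s.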

I am using the default values for ping-int and ping-timeout, which are
10 and 500 respectively. To me this looks like the DRBD software is
lagging in replying with the PingAck; am I right about this? If I
increase ping-timeout to something bigger, like 1000 or 2000, will it
solve this problem?

My eth0 settings are (ethtool output):

Settings for eth0:
        Supported ports: [ FIBRE ]
        Supported link modes:   1000baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  1000baseT/Full
        Advertised auto-negotiation: Yes
        Speed: 1000Mb/s
        Duplex: Full
        Port: FIBRE
        PHYAD: 2
        Transceiver: internal
        Auto-negotiation: on
        Supports Wake-on: d
        Wake-on: d
        Link detected: yes

Thanks.


