[DRBD-user] Unexpected behaviour with respect to authenticating peers

Bas van Schaik bas at tuxes.nl
Sun Jun 22 15:01:48 CEST 2008


Hi list,

I'm using DRBD for a nifty offsite-backup configuration, of course
using simple authentication. Once in a while the link between the two
servers fails for a few hours; there is nothing I can do about that.
However, when the connection becomes available again, DRBD fails to
reconnect. After a while I tried a manual "disconnect" and "connect" on
one of the servers, which yields the following error:
> Jun 22 14:18:34 mordor kernel: drbd1: Handshake successful: Agreed network protocol version 88
> Jun 22 14:18:34 mordor kernel: drbd1: sock was shut down by peer
> Jun 22 14:18:34 mordor kernel: drbd1: conn( WFConnection -> BrokenPipe )
> Jun 22 14:18:34 mordor kernel: drbd1: short read expecting header on sock: r=0
> Jun 22 14:18:34 mordor kernel: drbd1: Authentication of peer failed
> Jun 22 14:18:34 mordor kernel: drbd1: Discarding network configuration.
> Jun 22 14:18:34 mordor kernel: drbd1: conn( BrokenPipe -> Disconnecting )
> Jun 22 14:18:34 mordor kernel: drbd1: tl_clear()
> Jun 22 14:18:34 mordor kernel: drbd1: Connection closed
> Jun 22 14:18:34 mordor kernel: drbd1: conn( Disconnecting -> StandAlone )
> Jun 22 14:18:34 mordor kernel: drbd1: receiver terminated

A few seconds earlier on the other server:
> Jun 22 14:18:26 guust kernel: drbd1: sock_recvmsg returned -11
> Jun 22 14:18:26 guust kernel: drbd1: conn( WFConnection -> BrokenPipe )
> Jun 22 14:18:26 guust kernel: drbd1: short read expecting header on sock: r=-11
> Jun 22 14:18:26 guust kernel: drbd1: tl_clear()
> Jun 22 14:18:26 guust kernel: drbd1: Connection closed
> Jun 22 14:18:26 guust kernel: drbd1: conn( BrokenPipe -> Unconnected )
> Jun 22 14:18:26 guust kernel: drbd1: conn( Unconnected -> WFConnection )
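
To make the asymmetry between the two nodes easier to see, here is a small sketch that pulls the "conn( A -> B )" transitions out of such kernel log lines and reports the last state per host. The function name and the regular expression are my own; the sample lines are copied from the two logs above.

```python
import re

# Matches lines such as:
#   "Jun 22 14:18:34 mordor kernel: drbd1: conn( WFConnection -> BrokenPipe )"
# The pattern is an assumption based on the log lines quoted in this mail.
CONN_RE = re.compile(
    r"(?P<host>\S+) kernel: (?P<dev>drbd\d+): .*?"
    r"conn\( (?P<old>\w+) -> (?P<new>\w+) \)"
)


def final_conn_states(lines):
    """Return {(host, device): last connection state seen in the log}."""
    states = {}
    for line in lines:
        m = CONN_RE.search(line)
        if m:
            # Later lines overwrite earlier ones, so the last transition wins.
            states[(m["host"], m["dev"])] = m["new"]
    return states


sample = [
    "Jun 22 14:18:26 guust kernel: drbd1: conn( WFConnection -> BrokenPipe )",
    "Jun 22 14:18:26 guust kernel: drbd1: conn( BrokenPipe -> Unconnected )",
    "Jun 22 14:18:26 guust kernel: drbd1: conn( Unconnected -> WFConnection )",
    "Jun 22 14:18:34 mordor kernel: drbd1: conn( WFConnection -> BrokenPipe )",
    "Jun 22 14:18:34 mordor kernel: drbd1: Authentication of peer failed",
    "Jun 22 14:18:34 mordor kernel: drbd1: conn( BrokenPipe -> Disconnecting )",
    "Jun 22 14:18:34 mordor kernel: drbd1: conn( Disconnecting -> StandAlone )",
]

for (host, dev), state in sorted(final_conn_states(sample).items()):
    print(host, dev, state)
```

On these lines it reports WFConnection for guust and StandAlone for mordor, i.e. only mordor gave up on the connection entirely.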

The initial "disconnect" and "connect" were executed on server "mordor"
and obviously didn't restore the DRBD connection. Only after I had
performed the exact same procedure on server "guust" did both servers
accept authentication, and the connection was restored:
> Jun 22 14:18:43 guust kernel: drbd1: conn( StandAlone -> Unconnected )
> Jun 22 14:18:43 guust kernel: drbd1: receiver (re)started
> Jun 22 14:18:43 guust kernel: drbd1: conn( Unconnected -> WFConnection )
> Jun 22 14:19:06 guust kernel: drbd1: Handshake successful: Agreed network protocol version 88
> Jun 22 14:19:06 guust kernel: drbd1: Peer authenticated using 32 bytes of 'sha256' HMAC
> Jun 22 14:19:06 guust kernel: drbd1: conn( WFConnection -> WFReportParams )
> Jun 22 14:19:06 guust kernel: drbd1: data-integrity-alg: <not-used>
> Jun 22 14:19:06 guust kernel: drbd1: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) pdsk( DUnknown -> UpToDate )
> Jun 22 14:19:06 guust kernel: drbd1: Writing meta data super block now.
> Jun 22 14:19:07 guust kernel: drbd1: conn( WFBitMapT -> WFSyncUUID )
> Jun 22 14:19:07 guust kernel: drbd1: conn( WFSyncUUID -> SyncTarget ) disk( UpToDate -> Inconsistent )
> Jun 22 14:19:07 guust kernel: drbd1: Began resync as SyncTarget (will sync 5300936 KB [1325234 bits set]).
> Jun 22 14:19:07 guust kernel: drbd1: Writing meta data super block now.
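
For reference, the manual procedure I ran on each node boils down to something like the sketch below. The resource name "r1" is a placeholder (it is not taken from my configuration), and the helper functions are hypothetical; only "drbdadm disconnect" and "drbdadm connect" are the real commands. By default it only prints what it would run.

```python
import subprocess


def reconnect_commands(resource):
    """Build the drbdadm calls for a manual disconnect followed by connect."""
    return [
        ["drbdadm", "disconnect", resource],
        ["drbdadm", "connect", resource],
    ]


def reconnect(resource, dry_run=True):
    """Print the commands, or run them (needs root) when dry_run is False."""
    for cmd in reconnect_commands(resource):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if drbdadm fails.
            subprocess.run(cmd, check=True)


# "r1" is an assumed resource name, used for illustration only.
reconnect("r1")
```

In my case this had to be done on both "mordor" and "guust" before the peers authenticated again.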

Note that the DRBD configuration hasn't changed in the past few months,
so no configuration change or configuration reload was involved
whatsoever.

DRBD version information:
> cat /proc/drbd
> version: 8.2.4 (api:88/proto:86-88)
> GIT-hash: fc00c6e00a1b6039bfcebe37afa3e7e28dbd92fa build by phil at mescal, 2008-01-11 13:40:26
(both servers running Debian Etch, kernel 2.6.18)
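
Since the failure goes unnoticed until I look, a check along the following lines could flag a device stuck in StandAlone. This is only a sketch: it relies on the "cs:" field of the per-device lines in /proc/drbd, and the sample text below is made up for illustration, not copied from my servers.

```python
import re

# Per-device lines in /proc/drbd start with the minor number followed by
# "cs:<connection state>". The rest of the line is ignored here.
CS_RE = re.compile(r"^[ \t]*(\d+):[ \t]+cs:(\S+)", re.MULTILINE)


def connection_states(proc_text):
    """Map each DRBD minor number to its connection state."""
    return {int(minor): cs for minor, cs in CS_RE.findall(proc_text)}


def stuck_devices(proc_text, bad=("StandAlone",)):
    """Return the minors whose connection state is in `bad`."""
    states = connection_states(proc_text)
    return sorted(minor for minor, cs in states.items() if cs in bad)


# Illustrative sample only (assumed layout, not real output from my nodes).
SAMPLE = """version: 8.2.4 (api:88/proto:86-88)
 0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
 1: cs:StandAlone st:Primary/Unknown ds:UpToDate/DUnknown   r---
"""

print(stuck_devices(SAMPLE))
```

On a real node one would pass the contents of /proc/drbd instead of SAMPLE, for example from a cron job that sends a mail when the list is not empty.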

This problem has occurred a couple of times now; I suspect it is the
result of a small bug in DRBD. Attached is my DRBD configuration file,
of course without the secrets.

Regards,

  Bas.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: drbd.conf
URL: <http://lists.linbit.com/pipermail/drbd-user/attachments/20080622/ff4ff310/attachment.txt>
