[DRBD-user] drbd-0.7.0 left dead, dazed and confused after seeing -preX peer

Andreas Schultz aschultz at tpip.net
Thu Jul 22 14:30:07 CEST 2004


Hi Lars,

I should have explained it in a bit more detail ...

On Thursday 22 July 2004 14:13, Lars Ellenberg wrote:
> / 2004-07-22 13:42:49 +0200
>
> \ Andreas Schultz:
> > Hi,
> >
> > While upgrading my system, i have encountered a situation where both
> > peers are alive and well, but drbd can still not establish a connection
> > without removing and reinserting the drbd modules first.
> >
> > This happened after i rebooted the second system with an older kernel
> > which prompted the first system to terminate its drbd receiver threads.
>
> that older kernel obviously uses an older, incompatible, drbd (module)
> version.

Sure, it did.

[...]

> > ...  here all receivers are gone and drbd needs to be reloaded to start
> > working again.
>
> are you sure? ... cat /proc/drbd ...

A 'ps -xaw | grep drbd' only shows the drbd_worker thread; all receivers are
missing. A 'drbdadm connect' might have resolved the situation, but i feel
that it should automagically recover once both peers are running the same
version.
I would need to reboot the second peer again to reproduce the exact same
situation and get the /proc/drbd output. Do you need it?

> it just goes into "StandAlone", because it recognized that the peer talks
> some incompatible drbd protocol version, and that won't change when we
> try a reconnect. So we still are "operational", i.e. you can access it
> locally. but we won't connect again until operator tells us (drbdadm
> connect) that the problem is resolved, typically by bringing the peers
> drbd up to date.

That's what i did. I rebooted the second peer with a new kernel and it was then 
that the first peer did not reestablish the connection.
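
The behaviour described in the quoted paragraph can be sketched as a tiny state model. This is only an illustration of the description above, not DRBD code: the class, the method names and every state name other than "StandAlone" are made up for the example, and compatibility is simplified to "same version".

```python
from enum import Enum


class State(Enum):
    CONNECTING = "Connecting"   # illustrative name, not taken from DRBD
    CONNECTED = "Connected"     # illustrative name, not taken from DRBD
    STANDALONE = "StandAlone"   # the state named in the quoted text


class Node:
    """Hypothetical model of one peer's connection handling."""

    def __init__(self, version: int) -> None:
        self.version = version
        self.state = State.CONNECTING

    def on_peer_seen(self, peer_version: int) -> None:
        # Automatic connection attempts only happen while connecting.
        # A StandAlone node ignores the peer, even a compatible one.
        if self.state is not State.CONNECTING:
            return
        if peer_version == self.version:  # simplification: equal means compatible
            self.state = State.CONNECTED
        else:
            self.state = State.STANDALONE

    def operator_connect(self) -> None:
        # Models the operator running 'drbdadm connect' after fixing the peer.
        if self.state is State.STANDALONE:
            self.state = State.CONNECTING
```

In this model a node that has seen an incompatible peer stays StandAlone when the upgraded peer comes back, and only connects after operator_connect() is called, which matches what i observed.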

> > I have to admit that this is a situation that should not happen in
> > production, but i would also argue that _nothing_ should leave a drbd
> > peer in a state from which it can not recover automatically.
>
> as the log says: peer runs incompatible version. so we can not talk to him.
> how do you think we can automatically recover from that?

Of course not when both peers are on different versions, but once the second 
peer comes up with a compatible drbd version it should IMHO.

> btw, with 0.7.0, we introduced mentioned HandShake packet.
> from now on, drbd 0.7 (and probably all further versions) will be able
> to talk to drbd of protocol version (PRO_VERSION + [-1;0;1]), so you
> will be able to do a rolling upgrade of the cluster when we feel the
> need to change the protocol again at some point.

nice.
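
As i read the (PRO_VERSION + [-1;0;1]) rule, two peers can talk when their protocol versions differ by at most one. A minimal sketch of that check, assuming plain integer version numbers; the function name and the PRO_VERSION value are hypothetical and not taken from the drbd source:

```python
# Hypothetical value for illustration only, not the constant of any real release.
PRO_VERSION = 10


def versions_compatible(local: int, peer: int) -> bool:
    """Return True if the peer's protocol version is within one step of ours."""
    return abs(local - peer) <= 1


def can_roll_upgrade(old: int, new: int) -> bool:
    """A rolling upgrade works if the mixed old/new cluster can still talk."""
    return versions_compatible(old, new)
```

With this rule a cluster can move from PRO_VERSION to PRO_VERSION + 1 one node at a time, while a jump of two versions would need both peers to be upgraded together.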

Have fun

Andreas