[DRBD-user] drbd resource fencing - 2nd try with more information

Dominik Klein dk at in-telegence.net
Wed Dec 5 14:45:26 CET 2007



> It would be nice to know how to build dopd and the drbd-peer-outdater 
> from a certain drbd version with a recent heartbeat version. The 
> instructions in the README file (in the drbd tools/ subdirectory) bring 
> up the errors I posted earlier.
> 
> dopd.c from Andrew's recent code (2.1.*2*-24) is different from 
> drbd-8.0.7's code and both are different from linux-ha.org's 2.1.2 release.
> So these questions come up:
> Are dopd and drbd-peer-outdater maintained in heartbeat or in drbd?
> Which version should we consider "recent"?
> Are dopd and drbd-peer-outdater bound to one heartbeat and/or drbd version?

Here's some information about this from Andrew Beekhof. He is not 
subscribed here and was okay with me posting this information here:

---
dopd is maintained exclusively by linbit - if it's broken, then it is up
to them to fix it.

dopd shouldn't be using anything from lib/crm or include/crm

I asked them a year or so ago to change this but nothing happened.  In the 
end I got fed up and removed the references myself.
---

That is probably why the version supplied with heartbeat compiles and the 
version supplied with drbd doesn't.

Again, here is the error message I am referring to:
dopd.c: In function ‘main’:
dopd.c:490: warning: implicit declaration of function ‘ha_strdup’
dopd.c:490: warning: assignment makes pointer from integer without a cast
dopd.c:491: error: too few arguments to function ‘crm_log_init’
dopd.c:556: warning: passing argument 1 of ‘init_server_ipc_comms’ makes
pointer from integer without a cast
make[1]: *** [dopd-dopd.o] Error 1
make[1]: Leaving directory `/root/src/CRM-Devel-obs-2.1.2-24/tools'
make: *** [all-recursive] Error 1

It does not, however, explain the "bad magic number" errors, and I am far 
from understanding what those are really about.

Andrew Beekhof stated:
---
magic number problems usually occur when there is a mixture of 
cl_free/cl_malloc and free/malloc
the interim builds are always built with --enable-libc-malloc, try 
passing that to ConfigureMe and rebuilding
---
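
To illustrate the allocator mix Andrew describes, here is a minimal
sketch in C. It is not heartbeat's cl_malloc implementation: demo_malloc,
demo_free, demo_header and DEMO_MAGIC are hypothetical names, and the only
assumption is that a checking allocator stamps a marker in front of each
block and verifies it again on free. A block that did not come from the
matching allocator carries no such marker, which is the kind of mismatch
that would be reported as a bad magic number.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical marker value, chosen only for this illustration. */
#define DEMO_MAGIC 0xC0FFEE42u

/* Header placed in front of every block handed out by demo_malloc().
 * The max_align_t member keeps the user data suitably aligned. */
typedef union {
    uint32_t    magic;
    max_align_t align;
} demo_header;

/* Allocate size bytes and stamp the header with the marker. */
void *demo_malloc(size_t size)
{
    demo_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->magic = DEMO_MAGIC;
    return h + 1;               /* user data starts after the header */
}

/* Free a block from demo_malloc().
 * Returns 0 on success, -1 if the marker in front of ptr is missing,
 * i.e. the block was not allocated by demo_malloc(). */
int demo_free(void *ptr)
{
    demo_header *h;

    if (ptr == NULL)
        return 0;
    h = (demo_header *)ptr - 1;
    if (h->magic != DEMO_MAGIC)
        return -1;              /* the "bad magic number" case */
    h->magic = 0;               /* so a double free is caught as well */
    free(h);
    return 0;
}
```

In this sketch demo_free() accepts a pointer obtained from demo_malloc()
and rejects memory that was never stamped. That is consistent with the
error disappearing once everything is built with --enable-libc-malloc,
since then only one allocator is in use.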

I tried that as well. The bad magic number error is gone, but the 
resource still does not get outdated:

Here's the log from the Primary node:
Dec  5 14:42:11 dktest1debian kernel: drbd2: PingAck did not arrive in time.
Dec  5 14:42:11 dktest1debian kernel: drbd2: peer( Secondary -> Unknown 
) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec  5 14:42:11 dktest1debian kernel: drbd2: Creating new current UUID
Dec  5 14:42:11 dktest1debian kernel: drbd2: asender terminated
Dec  5 14:42:11 dktest1debian kernel: drbd2: short read expecting header 
on sock: r=-512
Dec  5 14:42:11 dktest1debian kernel: drbd2: tl_clear()
Dec  5 14:42:11 dktest1debian kernel: drbd2: Connection closed
Dec  5 14:42:11 dktest1debian kernel: drbd2: Writing meta data super 
block now.
Dec  5 14:42:11 dktest1debian kernel: drbd2: helper command: 
/sbin/drbdadm outdate-peer
Dec  5 14:42:11 dktest1debian drbd-peer-outdater: [13656]: debug: drbd 
peer: dktest2debian
Dec  5 14:42:11 dktest1debian drbd-peer-outdater: [13656]: debug: drbd 
resource: drbd2
Dec  5 14:42:11 dktest1debian kernel: drbd2: State change failed: 
Refusing to be Primary without at least one UpToDate disk
Dec  5 14:42:11 dktest1debian kernel: drbd2:   state = { 
cs:NetworkFailure st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  5 14:42:11 dktest1debian kernel: drbd2:  wanted = { 
cs:NetworkFailure st:Primary/Unknown ds:Outdated/DUnknown r--- }
Dec  5 14:42:11 dktest1debian kernel: drbd2: outdate-peer helper broken, 
returned 255
Dec  5 14:42:11 dktest1debian kernel: drbd2: Forcing state change from 
bad state. Error would be: 'Refusing to be Primary while peer is not 
outdated'
Dec  5 14:42:11 dktest1debian kernel: drbd2:  old = { cs:NetworkFailure 
st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  5 14:42:11 dktest1debian kernel: drbd2:  new = { cs:Unconnected 
st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  5 14:42:11 dktest1debian kernel: drbd2: conn( NetworkFailure -> 
Unconnected )
Dec  5 14:42:11 dktest1debian kernel: drbd2: receiver terminated
Dec  5 14:42:11 dktest1debian kernel: drbd2: receiver (re)started
Dec  5 14:42:11 dktest1debian /usr/lib/heartbeat/dopd: [13640]: debug: 
Connecting channel
Dec  5 14:42:11 dktest1debian /usr/lib/heartbeat/dopd: [13640]: debug: 
Client outdater (0x80508e0) connected
Dec  5 14:42:11 dktest1debian /usr/lib/heartbeat/dopd: [13640]: debug: 
invoked: outdater
Dec  5 14:42:11 dktest1debian kernel: drbd2: Forcing state change from 
bad state. Error would be: 'Refusing to be Primary while peer is not 
outdated'
Dec  5 14:42:11 dktest1debian kernel: drbd2:  old = { cs:Unconnected 
st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  5 14:42:11 dktest1debian kernel: drbd2:  new = { cs:WFConnection 
st:Primary/Unknown ds:UpToDate/DUnknown r--- }
Dec  5 14:42:11 dktest1debian kernel: drbd2: conn( Unconnected -> 
WFConnection )
Dec  5 14:42:11 dktest1debian /usr/lib/heartbeat/dopd: [13640]: debug: 
Processed 0 messages
Dec  5 14:42:11 dktest1debian /usr/lib/heartbeat/dopd: [13640]: debug: 
Deleting outdater (0x80508e0) from mainloop
Dec  5 14:42:11 dktest1debian /usr/lib/heartbeat/dopd: [13640]: debug: 
connection from client closed
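
The "outdate-peer helper broken, returned 255" line above suggests that
the kernel did not recognise the helper's exit code. As a rough sketch,
assuming the exit codes 3 to 7 that the drbd.conf man page lists for the
outdate-peer handler (please verify them against your DRBD version), the
following C fragment shows how such a status could be decoded.
helper_exit_code and describe_exit_code are hypothetical names and are
not part of DRBD or heartbeat.

```c
#include <sys/wait.h>

/* Extract the exit code from a wait status as returned by waitpid()
 * or system(). Returns -1 if the helper did not exit normally. */
int helper_exit_code(int wait_status)
{
    if (!WIFEXITED(wait_status))
        return -1;
    return WEXITSTATUS(wait_status);
}

/* Map an exit code to a description. The meanings of 3..7 are an
 * assumption taken from the drbd.conf man page, not from this thread. */
const char *describe_exit_code(int code)
{
    switch (code) {
    case 3:  return "peer disk was already Inconsistent";
    case 4:  return "peer disk is Outdated";
    case 5:  return "peer could not be reached";
    case 6:  return "peer refused, it is Primary";
    case 7:  return "peer was fenced off the cluster";
    default: return "unexpected exit code, helper considered broken";
    }
}
```

With these assumptions, an exit code of 255 falls through to the default
branch, which would match the "helper broken" message in the log.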

The Secondary logs nothing but:
Dec  5 14:42:11 dktest2debian kernel: drbd2: PingAck did not arrive in time.
Dec  5 14:42:11 dktest2debian kernel: drbd2: peer( Primary -> Unknown ) 
conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )
Dec  5 14:42:11 dktest2debian kernel: drbd2: asender terminated
Dec  5 14:42:11 dktest2debian kernel: drbd2: short read expecting header 
on sock: r=-512
Dec  5 14:42:11 dktest2debian kernel: drbd2: tl_clear()
Dec  5 14:42:11 dktest2debian kernel: drbd2: Connection closed
Dec  5 14:42:11 dktest2debian kernel: drbd2: Writing meta data super 
block now.
Dec  5 14:42:11 dktest2debian kernel: drbd2: conn( NetworkFailure -> 
Unconnected )
Dec  5 14:42:11 dktest2debian kernel: drbd2: receiver terminated
Dec  5 14:42:11 dktest2debian kernel: drbd2: receiver (re)started
Dec  5 14:42:11 dktest2debian kernel: drbd2: conn( Unconnected -> 
WFConnection )

Regards
Dominik


