[DRBD-user] dopd breaks bonding on drbd connection

Chad Phillips -- Apartment Lines chad at apartmentlines.com
Mon Sep 15 14:54:33 CEST 2008


i originally posted this on the linux-ha list, but received no replies.
i was hoping somebody here might be able to shed some light on
the issue:

i've run into a very weird problem involving heartbeat, dopd, drbd,
and the linux bonding driver. first, my setup:

OS: centos 5.2 running kernel 2.6.18-92.1.10.el5PAE
heartbeat: 2.1.3 w/ dopd patch applied
drbd: 8.2.6 for installed kernel

the 'w/ dopd patch applied' refers to this patch that fixes a critical
breakage in dopd:
http://hg.linux-ha.org/dev/rev/47f60bebe7b2

note that the problem i describe below happens with or without the
patch.

basically, when i turn on dopd in my ha.cf file with the following
directives:
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster

the bonding setup that i have for my drbd connection tanks with
messages like these:
Aug 31 19:53:25 beast kernel: bonding: bond1: link status definitely up for interface eth3.
Aug 31 19:53:25 beast kernel: bonding: bond1: making interface eth3 the new active one.
Aug 31 19:53:25 beast kernel: bonding: bond1: first active interface up!
Aug 31 19:53:25 beast kernel: bonding: bond1: link status definitely up for interface eth4.
Aug 31 19:53:25 beast kernel: bonding: bond1: link status definitely up for interface eth5.
Aug 31 20:18:27 beast kernel: bonding: bond1: link status definitely down for interface eth5, disabling it
Aug 31 20:18:28 beast kernel: bonding: bond1: link status definitely down for interface eth3, disabling it
Aug 31 20:18:28 beast kernel: bonding: bond1: link status definitely down for interface eth4, disabling it
Aug 31 20:18:28 beast kernel: bonding: bond1: now running without any active interface !
Aug 31 20:18:28 beast kernel: bonding: bond1: Error: found a client with no channel in the client's hash table
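
for anyone trying to line these messages up against heartbeat's own log, here is a minimal python sketch that pulls the link up/down events out of kernel log lines like the ones above. the function names (link_events, final_state) and the SAMPLE list are made up for illustration; the regex only assumes the "link status definitely up/down for interface ..." wording shown in the log, with each message on a single line.

```python
import re

# Matches the bonding driver's "link status definitely up/down" messages,
# using the wording seen in the log excerpt above.
LINK_RE = re.compile(
    r"bonding: (?P<bond>\w+): link status definitely "
    r"(?P<state>up|down) for interface (?P<iface>\w+)"
)


def link_events(lines):
    """Return (timestamp, bond, iface, state) for each link change found."""
    events = []
    for line in lines:
        m = LINK_RE.search(line)
        if m:
            # The syslog timestamp is the first 15 characters,
            # e.g. "Aug 31 19:53:25".
            events.append(
                (line[:15], m.group("bond"), m.group("iface"), m.group("state"))
            )
    return events


def final_state(events):
    """Map each slave interface to the last link state reported for it."""
    state = {}
    for _ts, _bond, iface, st in events:
        state[iface] = st
    return state


# Hypothetical variable holding some of the messages quoted above.
SAMPLE = [
    "Aug 31 19:53:25 beast kernel: bonding: bond1: link status definitely up for interface eth3.",
    "Aug 31 19:53:25 beast kernel: bonding: bond1: making interface eth3 the new active one.",
    "Aug 31 19:53:25 beast kernel: bonding: bond1: link status definitely up for interface eth4.",
    "Aug 31 19:53:25 beast kernel: bonding: bond1: link status definitely up for interface eth5.",
    "Aug 31 20:18:27 beast kernel: bonding: bond1: link status definitely down for interface eth5, disabling it",
    "Aug 31 20:18:28 beast kernel: bonding: bond1: link status definitely down for interface eth3, disabling it",
    "Aug 31 20:18:28 beast kernel: bonding: bond1: link status definitely down for interface eth4, disabling it",
]

for event in link_events(SAMPLE):
    print(event)
print(final_state(link_events(SAMPLE)))
```

the timestamps it prints can then be compared by hand with the time heartbeat reports starting dopd.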

the bond has three slave interfaces, using three crossover cables
that run directly from one machine to the other. the bond works
perfectly with heartbeat and drbd _until_ i add those directives
listed above to turn on dopd, at which point it tanks with those error
messages.
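
a rough way to watch the three slaves while reproducing this is to read the bonding driver's status file, /proc/net/bonding/bond1, before and after dopd is enabled. the python sketch below parses the "Slave Interface:" and "MII Status:" lines from that file's text. the function name slave_status and the EXAMPLE text are invented for illustration (the example is not output captured from these machines), and the sketch assumes each slave's "MII Status:" line follows its "Slave Interface:" line.

```python
def slave_status(proc_text):
    """Return {slave_name: mii_status} from /proc/net/bonding/<bond> text."""
    status = {}
    current = None  # slave section currently being read, if any
    for line in proc_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank lines and lines without a "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # An "MII Status" line seen before any slave section describes
            # the bond itself, so it is skipped by the check above.
            status[current] = value
    return status


# Illustrative input only; NOT captured from the setup described in this post.
EXAMPLE = """\
MII Status: down

Slave Interface: eth3
MII Status: down

Slave Interface: eth4
MII Status: up
"""

print(slave_status(EXAMPLE))
```

on a live system the input would come from something like open("/proc/net/bonding/bond1").read(), run once with the dopd directives commented out and once with them enabled.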

there's nothing special or interesting in the heartbeat logs that i
can see -- the problem simply occurs once heartbeat starts dopd.

in the meantime i've simply gotten rid of the bond, and heartbeat/dopd/
drbd are all running happily together. i sure would like to have my
cake and eat it, too, though -- so if anybody has any suggestions
about how i can fix this, i'd surely appreciate it!


