On Tue, Dec 04, 2007 at 10:48:43AM +0100, Dominik Klein wrote:
> Hi Florian, drbd-users
>
> I see I have been very short on info here. Sorry for that.
>
> So I want to learn about resource fencing in DRBD. I read the recent
> thread about it and read about the different modes DRBD offers for
> fencing. As I don't have a STONITH device, I went for resource-only.
>
> Here's my configuration, what I did and what I got.
>
> Nodes: dktest1debian, dktest2debian
> OS: Debian Etch 32 bit
> DRBD: 8.0.7
> Heartbeat: 2.1.12-24
> Kernel 2.6.18-4-686
> Network: eth0 10.250.250.0/24 for drbd and heartbeat
> eth1 10.2.50.0/24 for normal networking and heartbeat
>
> ha.cf:
> keepalive 2
> deadtime 30
> warntime 10
> ucast eth1 10.2.50.100
> ucast eth0 10.250.250.100
> node dktest1debian
> node dktest2debian
> ping 10.2.50.32
> ping 10.2.50.2
> ping 10.2.50.34
> ping 10.2.50.250
> ping 10.2.50.11
> respawn root /usr/lib/heartbeat/pingd -p /var/run/pingd.pid -d 5s -m 100
> respawn hacluster /usr/lib/heartbeat/dopd
> apiauth dopd gid=haclient uid=hacluster
> use_logd yes
> crm on
>
> drbd.conf:
> global {
>     usage-count no;
> }
> common {
>     handlers {
>         outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
>     }
> }
> resource drbd2 {
>     protocol C;
>     startup {
>         wfc-timeout 15;
>         degr-wfc-timeout 120;
>     }
>     disk {
>         on-io-error detach;
>         fencing resource-only;
>     }
>     net {
>         after-sb-0pri disconnect;
>         after-sb-1pri disconnect;
>         after-sb-2pri disconnect;
>         rr-conflict disconnect;
>         max-buffers 20480;
>         max-epoch-size 16384;
>         unplug-watermark 20480;
>     }
>     syncer {
>         rate 140M;
>     }
>     on dktest1debian {
>         device /dev/drbd2;
>         disk /dev/sda3;
>         address 10.250.250.100:7790;
>         meta-disk internal;
>     }
>     on dktest2debian {
>         device /dev/drbd2;
>         disk /dev/sda3;
>         address 10.250.250.101:7790;
>         meta-disk internal;
>     }
> }
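[with this handler configured, DRBD learns whether the peer could be
outdated from the handler's exit code. roughly, per the drbd.conf(5)
conventions for DRBD 8 -- the helper below is a hypothetical sketch for
reading those codes, not something shipped with DRBD; verify the
meanings against your own man page:]

```shell
# Hypothetical helper mapping outdate-peer handler exit codes to their
# documented meanings in DRBD 8 (a sketch; check drbd.conf(5) on your
# version -- the function name is made up).
describe_outdate_rc() {
  case "$1" in
    3) echo "peer disk already Inconsistent" ;;
    4) echo "peer disk set to Outdated (fencing succeeded)" ;;
    5) echo "peer unreachable" ;;
    6) echo "peer refused: resource is primary there" ;;
    7) echo "peer was fenced off (STONITH)" ;;
    *) echo "unexpected exit code: $1" ;;
  esac
}

describe_outdate_rc 4   # prints "peer disk set to Outdated (fencing succeeded)"
```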
>
> Now I do:
> reboot both nodes
> rm /var/lib/heartbeat/crm/* on both nodes
>
> So we start off real clean.
>
> /etc/init.d/heartbeat start on both nodes
>
> Wait until I see online/online, a DC has been chosen, and dopd is started.
>
> At this point, I have no resources configured and Linux-HA is running
> with all defaults (no STONITH).
>
> Now I promote drbd2 on dktest1debian.
>
> After that I unplug the DRBD link (eth0)
>
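[if resource-only fencing works here, the surviving primary should see
the peer's disk go Outdated. DRBD 8 prints that as the second half of
the "ds:" field in /proc/drbd; a small sketch to pull it out -- the
function name is made up:]

```shell
# peer_disk_state: print the peer's half of the "ds:" field from a
# /proc/drbd status line (DRBD 8 "ds:<local>/<peer>" format assumed;
# the helper name is made up).
peer_disk_state() { sed -n 's/.*ds:[A-Za-z]*\/\([A-Za-z]*\).*/\1/p'; }

# e.g. on the primary, after the replication link went away:
echo '0: cs:WFConnection st:Primary/Unknown ds:UpToDate/Outdated C r---' \
  | peer_disk_state   # prints "Outdated"
```

[on a live node this would be something like
`grep 'ds:' /proc/drbd | peer_disk_state`.]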
> Then in the logs I see:
>
> Dec 4 10:27:39 dktest1debian drbd-peer-outdater: [2674]: debug: drbd
> peer: dktest2debian
> Dec 4 10:27:39 dktest1debian drbd-peer-outdater: [2674]: debug: drbd
> resource: drbd2
> Dec 4 10:27:39 dktest1debian drbd-peer-outdater: [2674]: ERROR:
> cl_free: Bad magic number in object at 0xbfc405e8
this is coming from the heartbeat messaging layer. something in there
is broken, or you somehow got a broken build, or, most likely, the
versions of "drbd-peer-outdater" and "dopd" do not match.
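[one way to check the version-mismatch theory on Debian is to ask dpkg
which package owns each binary and compare the package versions. a
sketch -- the function name is made up, the paths are the ones from
this thread:]

```shell
# check_outdater_pair: report which package owns dopd and
# drbd-peer-outdater, to spot a mismatched pair (Debian; a sketch).
check_outdater_pair() {
  for f in /usr/lib/heartbeat/dopd /usr/lib/heartbeat/drbd-peer-outdater; do
    if [ -e "$f" ]; then
      dpkg -S "$f" 2>/dev/null || echo "unowned: $f"
    else
      echo "missing: $f"
    fi
  done
}

check_outdater_pair
```

[if the two binaries come from different packages, or from different
versions of the same package, reinstalling a matching pair would be the
first thing to try.]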
> What does this mean (bad magic number)?
something in your heartbeat communication channels and the way
"dopd" works with them is screwed up.
as long as I cannot reproduce this, I cannot help you.
is there anything "special" about your architectures and distributions?
--
: commercial DRBD/HA support and consulting: sales at linbit.com :
: Lars Ellenberg Tel +43-1-8178292-0 :
: LINBIT Information Technologies GmbH Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe http://www.linbit.com :