Note: "permalinks" may not be as permanent as we would like;
direct links to old posts may well be a few messages off.
On Tue, Dec 04, 2007 at 10:48:43AM +0100, Dominik Klein wrote:
> Hi Florian, drbd-users
>
> I see I have been very short on info here. Sorry for that.
>
> So I want to learn about resource fencing in DRBD. I read the recent
> thread about it and read about the different modes DRBD offers for
> fencing. As I don't have a STONITH device, I went for resource-only.
>
> Here's my configuration, what I did and what I got.
>
> Nodes: dktest1debian, dktest2debian
> OS: Debian Etch 32 bit
> DRBD: 8.0.7
> Heartbeat: 2.1.12-24
> Kernel: 2.6.18-4-686
> Network: eth0 10.250.250.0/24 for drbd and heartbeat
>          eth1 10.2.50.0/24 for normal networking and heartbeat
>
> ha.cf:
> keepalive 2
> deadtime 30
> warntime 10
> ucast eth1 10.2.50.100
> ucast eth0 10.250.250.100
> node dktest1debian
> node dktest2debian
> ping 10.2.50.32
> ping 10.2.50.2
> ping 10.2.50.34
> ping 10.2.50.250
> ping 10.2.50.11
> respawn root /usr/lib/heartbeat/pingd -p /var/run/pingd.pid -d 5s -m 100
> respawn hacluster /usr/lib/heartbeat/dopd
> apiauth dopd gid=haclient uid=hacluster
> use_logd yes
> crm on
>
> drbd.conf:
> global {
>     usage-count no;
> }
> common {
>     handlers {
>         outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
>     }
> }
> resource drbd2 {
>     protocol C;
>     startup {
>         wfc-timeout 15;
>         degr-wfc-timeout 120;
>     }
>     disk {
>         on-io-error detach;
>         fencing resource-only;
>     }
>     net {
>         after-sb-0pri disconnect;
>         after-sb-1pri disconnect;
>         after-sb-2pri disconnect;
>         rr-conflict disconnect;
>         max-buffers 20480;
>         max-epoch-size 16384;
>         unplug-watermark 20480;
>     }
>     syncer {
>         rate 140M;
>     }
>     on dktest1debian {
>         device /dev/drbd2;
>         disk /dev/sda3;
>         address 10.250.250.100:7790;
>         meta-disk internal;
>     }
>     on dktest2debian {
>         device /dev/drbd2;
>         disk /dev/sda3;
>         address 10.250.250.101:7790;
>         meta-disk internal;
>     }
> }
>
> Now I do:
> reboot both nodes
> rm /var/lib/heartbeat/crm/* on both nodes
>
> So we start off real clean.
> /etc/init.d/heartbeat start on both nodes
>
> Wait to see online/online and that a DC has been chosen; dopd is started.
>
> At this point, I have no resources configured and Linux-HA is running
> with all defaults (no STONITH).
>
> Now I promote drbd2 on dktest1debian.
>
> After that I unplug the DRBD link (eth0).
>
> Then in the logs I see:
>
> Dec 4 10:27:39 dktest1debian drbd-peer-outdater: [2674]: debug: drbd
> peer: dktest2debian
> Dec 4 10:27:39 dktest1debian drbd-peer-outdater: [2674]: debug: drbd
> resource: drbd2
> Dec 4 10:27:39 dktest1debian drbd-peer-outdater: [2674]: ERROR:
> cl_free: Bad magic number in object at 0xbfc405e8

This is coming from the heartbeat messaging layer. Something in there is
broken, or you somehow got a broken build, or, most likely, the versions
of "drbd-peer-outdater" and "dopd" do not match.

> What does this mean (bad magic number)?

Something in your heartbeat communication channels, and the way "dopd"
works with them, is screwed up. As long as I cannot reproduce this, I
cannot help you. Is there anything "special" about your architectures
and distributions?

--
: commercial DRBD/HA support and consulting:  sales at linbit.com :
: Lars Ellenberg                             Tel +43-1-8178292-0  :
: LINBIT Information Technologies GmbH       Fax +43-1-8178292-82 :
: Vivenotgasse 48, A-1120 Vienna/Europe   http://www.linbit.com   :
__
please use the "List-Reply" function of your email client.
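[Editor's note] The version-mismatch diagnosis above can be sanity-checked by comparing the package versions each node reports. The sketch below is illustrative only: the `warn_if_mismatch` helper and the version strings are hypothetical, not taken from this thread. Since dopd and drbd-peer-outdater talk over heartbeat's IPC layer, mismatched builds can garble messages and produce errors like the "Bad magic number" seen here.

```shell
#!/bin/sh
# Hypothetical helper: warn when two nodes report different versions of
# the same package. On a real cluster the versions would come from each
# node, e.g.: ssh $node "dpkg-query -W -f='\${Version}' heartbeat"
warn_if_mismatch() {
    pkg=$1; node1_ver=$2; node2_ver=$3
    if [ "$node1_ver" != "$node2_ver" ]; then
        echo "WARNING: $pkg differs: dktest1debian=$node1_ver dktest2debian=$node2_ver"
    else
        echo "OK: $pkg $node1_ver on both nodes"
    fi
}

# Placeholder versions for illustration only:
warn_if_mismatch heartbeat   2.1.12-24 2.1.12-24
warn_if_mismatch drbd8-utils 8.0.7     8.0.6
```

Checking heartbeat and the DRBD userland tools (which ship drbd-peer-outdater on Debian) on both nodes would quickly confirm or rule out the mismatch theory.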