[DRBD-user] How to become primary when the other node gets dead?

Richard NAGY r.nagy at nameshield.net
Wed Oct 13 16:23:49 CEST 2004


Hello all,

I'm facing a problem with drbd and heartbeat. When the master node is 
powered off, the secondary node stays in the secondary state and refuses 
to become primary. It says that the other peer is primary, but that peer 
is dead!

I'm using drbd (0.7.2, built from source) and heartbeat (1.2.0, from RPM) 
on a 2.6.8.1 kernel (also built from source) on Mandrake Linux 10.0. 
Failover works well when triggered manually (heartbeat restart) or with 
the hb_standby command: services move to the other node (it is a 2-node 
cluster), and DRBD works well in that context. But when the power goes 
off on the primary node, the drbddisk script, launched by heartbeat to 
switch the node from secondary to primary state, cannot switch DRBD to 
the primary state. The surviving node stays in the secondary state, so 
my data partition cannot be mounted and the services that rely on that 
partition cannot come up.

It seems that the remaining node "thinks" that the primary node (which 
is dead!) is still in the primary state, and therefore refuses to become 
primary. Could you please explain this behaviour?
When a node crashes completely, how can the other node become primary 
so that it can mount my partition read-write?
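
In case it helps with the diagnosis, here is a minimal sketch of how the 
state DRBD reports on the surviving node could be captured right after 
the failure. The function name drbd_status is made up for illustration, 
and the line layout it parses (" 0: cs:... st:.../... ld:...") is an 
assumption about what /proc/drbd looks like on DRBD 0.7; it may need 
adjusting for other versions.

```shell
# Print "cs=<connection state> st=<local>/<peer>" for one DRBD minor.
# Reads /proc/drbd-style text on stdin. The per-device line layout
# " 0: cs:... st:.../... ld:..." is an assumption (DRBD 0.7 style).
drbd_status() {
    minor="$1"
    awk -v m="$minor:" '
        $1 == m {
            # Scan the remaining fields for the cs: and st: entries.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                if ($i ~ /^st:/) st = substr($i, 4)
            }
            print "cs=" cs " st=" st
        }'
}

# Intended use on a live node (not run here):
#   drbd_status 0 < /proc/drbd
```

The second half of the st value is what the local node believes about 
its peer, which is the belief that seems to block the promotion here.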

Thanks for any help.

-------------- snip ha-debug of the remaining node
heartbeat: 2004/10/13_15:45:43 debug: Starting /etc/rc.d/init.d/drbddisk 
r0 start
ioctl(,SET_STATE,) failed: Permission denied
Partner is already primary
Command '/sbin/drbdsetup /dev/drbd/0 primary' terminated with exit code 20
drbdadm aborting
heartbeat: 2004/10/13_15:45:43 debug: /etc/rc.d/init.d/drbddisk r0 start 
done. RC=0
--------------- snip
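
Note that in the excerpt above drbddisk reports RC=0 even though 
drbdsetup exited with code 20, so heartbeat goes on to the Filesystem 
resource as if the promotion had worked. A minimal sketch of a wrapper 
that passes the failure on instead is shown here; promote_or_fail is a 
hypothetical helper name, not part of drbddisk or heartbeat, and whether 
heartbeat would then stop acquiring the rest of the group is not 
something this sketch shows.

```shell
# Hypothetical wrapper: run the promotion command given as arguments
# and hand its exit status back to the caller, instead of always
# reporting success.
promote_or_fail() {
    "$@"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "promotion failed with exit code $rc" >&2
    fi
    return "$rc"
}

# Intended use (not run here):
#   promote_or_fail drbdadm primary r0
```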

--------------- snip ha-log of the remaining node
heartbeat: 2004/10/13_15:45:43 WARN: node truc1: is dead
heartbeat: 2004/10/13_15:45:43 WARN: No STONITH device configured.
heartbeat: 2004/10/13_15:45:43 WARN: Shared disks are not protected.
heartbeat: 2004/10/13_15:45:43 info: Resources being acquired from truc1.
heartbeat: 2004/10/13_15:45:43 info: Link truc1:eth1 dead.
heartbeat: 2004/10/13_15:45:43 info: Running /etc/ha.d/rc.d/status status
heartbeat: 2004/10/13_15:45:43 info: No local resources 
[/usr/lib/heartbeat/ResourceManager listkeys truc2] to acquire.
heartbeat: 2004/10/13_15:45:43 info: Taking over resource group drbddisk::r0
heartbeat: 2004/10/13_15:45:43 info: Acquiring resource group: truc1 
drbddisk::r0 Filesystem::/dev/drbd/0::/data::ext3 192.168.2.211 named 
inst_dns_server
heartbeat: 2004/10/13_15:45:43 info: Running /etc/rc.d/init.d/drbddisk 
r0 start
heartbeat: 2004/10/13_15:45:43 info: Running 
/etc/ha.d/resource.d/Filesystem /dev/drbd/0 /data ext3 start
heartbeat: 2004/10/13_15:45:43 ERROR: Couldn't mount filesystem 
/dev/drbd/0 on /data
heartbeat: 2004/10/13_15:45:43 ERROR: Return code 1 from 
/etc/ha.d/resource.d/Filesystem
heartbeat: 2004/10/13_15:45:43 info: Running /etc/ha.d/resource.d/IPaddr 
192.168.2.211 start
heartbeat: 2004/10/13_15:45:44 info: /sbin/ifconfig eth0:0 192.168.2.211 
netmask 255.255.255.0  broadcast 192.168.2.255
heartbeat: 2004/10/13_15:45:44 info: Sending Gratuitous Arp for 
192.168.2.211 on eth0:0 [eth0]
heartbeat: 2004/10/13_15:45:44 /usr/lib/heartbeat/send_arp -i 500 -r 10 
-p /var/lib/heartbeat/rsctmp/send_arp/send_arp-192.168.2.211 eth0 
192.168.2.211 auto 192.168.2.211 ffffffffffff
-------------------- snip



-- 
*****************************
Richard NAGY
Nameshield
*****************************



