Hi all,
I'm still struggling with this problem. Since my last mail, I've
simplified my setup: one DRBD resource with a single file system
resource on top. I normally have STONITH in place and working, but it is
also removed for simplicity.
Things that work as expected:
- Pulling the dedicated DRBD network cable. A location constraint is
created as expected (preventing promotion of the now disconnected slave
node). The constraint is removed after re-plugging the cable.
- Rebooting the slave node / putting the slave node in standby mode. No
constraints (as expected), no problems.
- Migrating the file system resource. File system unmounts, slave node
becomes master, file system mounts, no problems.
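For context: the automatic creation/removal of the constraint comes from
the DRBD fence-peer handlers. The relevant part of my drbd.conf looks
roughly like this (paraphrased from memory; the full config is in the
pastebin below):

    resource r0 {
      disk {
        fencing resource-only;  # place/remove a constraint instead of real STONITH
      }
      handlers {
        # creates the location constraint when the peer is unreachable
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # removes the constraint again after resync completes
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
      }
    }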
Things that do not work as expected:
- Rebooting the master node / putting the master node in standby mode.
The location constraint is created, which prevents the slave from
becoming master. To recover, I have to bring the old master node online
again and remove the constraint by hand.
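"By hand" currently means something like the following on the surviving
node (the constraint id below is an example of what crm-fence-peer.sh
generates; the exact id depends on the resource name, so I grep for it
first):

    # find the fencing constraint left behind by crm-fence-peer.sh
    crm configure show | grep drbd-fence-by-handler
    # remove it so the slave can be promoted
    # (replace the id with the one found above)
    crm configure delete drbd-fence-by-handler-r0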
My setup:
Ubuntu 10.04 running 2.6.32-41-generic / x86_64
DRBD 8.3.13 (self compiled)
Pacemaker 1.1.6 (from HA maintainers PPA)
Corosync 1.4.2 (from HA maintainers PPA)
Network:
10.0.0.0/24 on eth0: network for 'normal' connectivity
172.16.0.1 <-> 172.16.0.2 on eth1: dedicated network for DRBD
corosync-cfgtool -s output:
Printing ring status.
Local node ID 16781484
RING ID 0
id = 172.16.0.1
status = ring 0 active with no faults
RING ID 1
id = 10.0.0.71
status = ring 1 active with no faults
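The two rings come from the redundant-ring (rrp) setup in corosync.conf,
which is roughly the following (abbreviated; see the pastebin below for
the actual file):

    totem {
      rrp_mode: passive       # assumption from memory; actual mode is in the pastebin
      interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0   # dedicated DRBD network (ring 0)
        ...
      }
      interface {
        ringnumber: 1
        bindnetaddr: 10.0.0.0     # 'normal' network (ring 1)
        ...
      }
    }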
Configuration files:
http://pastebin.com/VUgHcuQ0
Log of a failed failover (master node):
http://pastebin.com/f5amFMzY
Log of a failed failover (slave node):
http://pastebin.com/QHBPnHFQ
I hope somebody can shed some light on this for me...
Thank you in advance, kind regards,
Dirk