Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi Jake,

Looks like it is that bug... I've patched & recompiled DRBD and did some quick tests. Seems to work as it should now. Will do some more extensive tests next week and report back.

Thanks a lot, I've been breaking my head over this one for quite a while...

Cheers,

Dirk

On 3-8-2012 16:09, Jake Smith wrote:
> ----- Original Message -----
>> From: "Dirk Bonenkamp - ProActive" <dirk at proactive.nl>
>> To: drbd-user at lists.linbit.com
>> Sent: Friday, August 3, 2012 4:17:46 AM
>> Subject: Re: [DRBD-user] crm-fence-peer.sh & maintenance / reboots
>>
>> Hi all,
>>
>> I'm still struggling with this problem. Since my last mail, I've
>> simplified my setup: 1 DRBD resource with only 1 file system resource.
>> I normally have stonith in place & working, but this is also removed
>> for simplicity.
>>
>> Things that work as expected:
>> - Pulling the dedicated DRBD network cable. The location constraint is
>>   created as expected (preventing promotion of the now unconnected
>>   slave node). The constraint gets removed after re-plugging the cable.
>> - Rebooting the slave node / putting the slave node in standby mode.
>>   No constraints (as expected), no problems.
>> - Migrating the file system resource. The file system unmounts, the
>>   slave node becomes master, the file system mounts, no problems.
>>
>> Things that do not work as expected:
>> - Rebooting the master node / putting the master node in standby mode.
>>   The location constraint is created, which prevents the slave from
>>   becoming master... To correct this, I have to bring the old master
>>   node online again and remove the constraint by hand.
>>
>> My setup:
>> Ubuntu 10.04 running 2.6.32-41-generic / x86_64
>> DRBD 8.3.13 (self compiled)
>
> Hi Dirk!
>
> This might be the bug affecting fencing that I found when using the -41
> kernel in Ubuntu with DRBD 8.3.13:
>
> https://bugs.launchpad.net/ubuntu/+source/drbd8/+bug/1000355
>
>> Pacemaker 1.1.6 (from HA maintainers PPA)
>> Corosync 1.4.2 (from HA maintainers PPA)
>>
>> Network:
>> 10.0.0.0/24 on eth0: network for 'normal' connectivity
>> 172.16.0.1 <-> 172.16.0.2 on eth1: dedicated network for DRBD
>>
>> corosync-cfgtool -s output:
>>
>> Printing ring status.
>> Local node ID 16781484
>> RING ID 0
>>         id      = 172.16.0.1
>>         status  = ring 0 active with no faults
>> RING ID 1
>>         id      = 10.0.0.71
>>         status  = ring 1 active with no faults
>
> Look here for a second step required to verify the corosync rings are
> actually OK when it's only a two-node cluster:
> http://www.hastexo.com/resources/hints-and-kinks/checking-corosync-cluster-membership
>
>> Configuration files:
>> http://pastebin.com/VUgHcuQ0
>>
>> Log of a failed failover (master node):
>> http://pastebin.com/f5amFMzY
>>
>> Log of a failed failover (slave node):
>> http://pastebin.com/QHBPnHFQ
>
> How about the output of /proc/drbd and crm configure show when the
> master node is in standby?
>
> HTH
> Jake
>
> _______________________________________________
> drbd-user mailing list
> drbd-user at lists.linbit.com
> http://lists.linbit.com/mailman/listinfo/drbd-user
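
For context, the crm-fence-peer.sh behaviour discussed above is normally wired up in the DRBD 8.3 resource configuration roughly as sketched below. This is only a sketch under assumptions: the thread's actual configuration is only available via the pastebin link, and the resource name (r0), node names, device, and backing disk here are placeholders; the addresses are taken from the dedicated DRBD network mentioned in the thread.

  resource r0 {
    disk {
      fencing resource-only;    # fence via a cluster constraint, not node-level stonith
    }
    handlers {
      # Called when the peer becomes unreachable; adds a location constraint
      # (typically named drbd-fence-by-handler-<resource>) that blocks promotion
      # of the disconnected/outdated peer.
      fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
      # Called after resync to the target completes; removes that constraint again.
      after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }
    on node-a {                           # placeholder hostname
      device    /dev/drbd0;
      disk      /dev/sdb1;                # placeholder backing device
      address   172.16.0.1:7788;
      meta-disk internal;
    }
    on node-b {                           # placeholder hostname
      device    /dev/drbd0;
      disk      /dev/sdb1;
      address   172.16.0.2:7788;
      meta-disk internal;
    }
  }

With fencing resource-only, DRBD fences through the cluster manager rather than powering off the peer, which is why a stale constraint shows up as a refused promotion instead of a stonith action.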
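Jake's two follow-ups (checking real corosync membership on a two-node cluster, and looking at /proc/drbd plus crm configure show with the master in standby), as well as Dirk's manual constraint removal, can be done roughly as below. This assumes corosync 1.x and the crm shell as used in the thread; the constraint name drbd-fence-by-handler-r0 is an assumption based on the placeholder resource name above, so check crm configure show for the actual ID first.

  # DRBD's own view of connection state and roles:
  cat /proc/drbd

  # Look for any fencing constraint the handler left behind:
  crm configure show | grep -A2 drbd-fence-by-handler

  # On a two-node cluster "no faults" per ring is not enough; check that both
  # node IDs are actually listed as joined members (corosync 1.x object database):
  corosync-objctl | grep members

  # If a stale constraint is blocking promotion, it can be removed by hand:
  crm configure delete drbd-fence-by-handler-r0   # replace with the real constraint ID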