I have been having this really odd issue and I can't seem to figure it out. I have tried everything I can think of, compared it to all my other working DRBD setups, and just cannot get this thing to work.

Scenario 1:

  - node-1 is primary, /dev/drbd1 is mounted at /opt
  - node-2 is secondary
  - both are UpToDate

I shut down node-1, try to make node-2 primary, and receive this error:

    1: State change failed: (-7) Refusing to be Primary while peer is not outdated
    Command 'drbdsetup primary 1' terminated with exit code 11

Also check out this one:

  - node-1 is primary, /dev/drbd1 is mounted at /opt
  - node-2 is secondary
  - both are UpToDate (same as before)

This time, I shut down node-2 (the secondary). Everything is fine and continues to run normally on node-1. I then unmount /dev/drbd1 on node-1, put it into secondary, and immediately put it back into primary:

    umount /dev/drbd1
    drbdadm secondary all; drbdadm primary all
    1: State change failed: (-7) Refusing to be Primary while peer is not outdated
    Command 'drbdsetup primary 1' terminated with exit code 11

iptables is off, SELinux is off. I ran drbdadm secondary and drbdadm primary on one line so that it switches as quickly as possible. It was just running fine as a primary, so why can't I make it a secondary and then make it primary again? Out of the 30+ times I have set this up, I have never encountered this problem.
When either of the peers goes offline, cat /proc/drbd on the surviving node shows:

    # cat /proc/drbd
    version: 8.4.4 (api:1/proto:86-101)
    GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil at Build64R6, 2013-10-14 15:33:06
     1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r-----
        ns:0 nr:0 dw:0 dr:664 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

If I restart DRBD and abort the timeout on the surviving node, it changes to this:

    # cat /proc/drbd
    version: 8.4.4 (api:1/proto:86-101)
    GIT-hash: 599f286440bd633d15d5ff985204aff4bccffadd build by phil at Build64R6, 2013-10-14 15:33:06
     1: cs:WFConnection ro:Secondary/Unknown ds:Consistent/DUnknown C r-----
        ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Here is my config:

##########
resource r0 {
    protocol C;

    net {
        cram-hmac-alg sha1;
        shared-secret "pazzwurd1";
        max-epoch-size 512;
        sndbuf-size 0;
    }

    startup {
        wfc-timeout 30;
        outdated-wfc-timeout 20;
        degr-wfc-timeout 30;
    }

    disk {
        on-io-error detach;
        fencing resource-only;
    }

    syncer {
        rate 100M;
    }

    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
    }

    volume 0 {
        device /dev/drbd1;
        disk /dev/mapper/vg_ottppencrzdb1-lv_pgsql;
        meta-disk internal;
    }

    on db-node-1.myco.com {
        address 172.16.99.1:7789;
    }

    on db-node-2.myco.com {
        address 172.16.99.2:7789;
    }
}
##########

I have tried removing the fencing handlers and it did not help. I haven't even gotten to the Pacemaker stage yet anyway.

I can send logs if needed, just tell me which ones you need.

Thanks for any help.

Mike
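
For anyone comparing the two /proc/drbd dumps above, the difference is in the local disk state (ds:UpToDate versus ds:Consistent). The following is a minimal sketch for pulling those fields out of the status line. It assumes the 8.4-style layout quoted in this message; the function name parse_drbd_status and the SAMPLE text are hypothetical and not part of the DRBD tools.

```python
import re

# Matches a per-device status line such as:
#  1: cs:WFConnection ro:Secondary/Unknown ds:Consistent/DUnknown C r-----
# Assumption: the 8.4-style /proc/drbd layout shown in the message above.
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+"
    r"cs:(?P<cs>\S+)\s+"
    r"ro:(?P<ro_local>[^/\s]+)/(?P<ro_peer>\S+)\s+"
    r"ds:(?P<ds_local>[^/\s]+)/(?P<ds_peer>\S+)"
)


def parse_drbd_status(text):
    """Return {minor: {cs, ro_local, ro_peer, ds_local, ds_peer}} (hypothetical helper)."""
    devices = {}
    for line in text.splitlines():
        match = STATUS_RE.match(line)
        if match:
            fields = match.groupdict()
            minor = int(fields.pop("minor"))
            devices[minor] = fields
    return devices


# Sample text copied from the second dump in the message; on a live node the
# same text could be read from /proc/drbd instead.
SAMPLE = """version: 8.4.4 (api:1/proto:86-101)
 1: cs:WFConnection ro:Secondary/Unknown ds:Consistent/DUnknown C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
"""

if __name__ == "__main__":
    for minor, state in parse_drbd_status(SAMPLE).items():
        print(minor, state["cs"], state["ro_local"], state["ds_local"], state["ds_peer"])
```

Running the sketch against both dumps makes it easy to see that only ds_local changes between them, while cs, ro and ds_peer stay the same.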