Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hello. I'm reading tons of information about Pacemaker/Corosync/DRBD and fence devices, and right now I'm feeling really lost.

My environment is a 2-node Corosync/Pacemaker cluster for KVM virtualization, using DRBD as shared storage. It doesn't have a true fence device (IPMI, APC or any other physical device/hardware). I could use my managed switch, but that would cut only the node's external link (the one where the services/VMs are available) and not DRBD's internal link (which runs over a direct crossover cable). And that's it! That is all I have for a production environment.

So... how do I do the fencing/stonith? Should I use the managed switch? Is it enough? In fact I believe I'm facing a lot of limitations in my understanding of fencing/stonith.

If I lose the external link on one node, I might be able to handle that scenario with the "ping" agent and migrate the services to the other node. As far as I can understand, this is not fencing/stonith but just a failover/migration, right? If I lose the external link on both nodes, then the "ping" agent should be smart enough to do nothing!

If I lose the internal link, DRBD replication stops at once. In that case the primary resource should be able to keep running on its current node with no problem. Once DRBD's link is up again, a resync will start and both nodes will be UpToDate at the end of the process. Again, this is not a fencing/stonith situation, right?

So let's look at some fencing/stonith situations that I believe are possible:

- One node crashes (power off, hangs, etc.): node "A" (for example) is crashed, node "B" detects the broken communication and should start a stonith, and only after the stonith's response is it free to restart the stopped services that were previously allocated to node "A". ---> In this scenario, won't my cluster end up with a primary/primary race condition and a split-brain anyway? I mean, node "A" was the primary when it crashed... node "B" takes over as primary and restarts the services, but when node "A" comes back, will it try to become primary again, or will it accept being secondary and start a resync?

- Both nodes crash: no idea what should happen... if only the network failed, both nodes will try a stonith... when communication is back, the stonith will take place and I might see at least one reboot on each node... in this case my "smart" ping agent should do nothing, because it cannot confirm the other node's link!

- Node "B" loses its external link (so my "smart" ping agent starts migrating services to node "A"), but then node "A" crashes... oh my blessed God... I have no idea.

At this point in my (lack of) understanding, I would use the following parameters in drbd.conf:

global {
    usage-count yes;
}
common {
    protocol C;
    handlers {
        fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
    }
    disk {
        on-io-error detach;
        fencing resource-only;
    }
    net {
        data-integrity-alg md5;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    syncer {
        rate 42M;
        al-extents 3389;
        verify-alg md5;
        csums-alg md5;
    }
}

Is that enough, or do I need to enable stonith in Pacemaker? If I need stonith, may I use the stonith:null method on my cluster and put all my faith in the DRBD fence scripts?
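To make it easier to comment on, this is roughly what I have in mind for the "ping" agent and for the stonith:null idea (crm shell syntax; the gateway IP, the node names and the g_vms group are just placeholders for my real setup, and of course I may be getting it wrong, which is part of the question):

    # "ping" agent: keep the VMs on a node that can still reach the gateway
    primitive p_ping ocf:pacemaker:ping \
        params host_list="192.168.0.1" multiplier="1000" \
        op monitor interval="15s" timeout="60s"
    clone cl_ping p_ping
    location loc_vms_on_connected_node g_vms \
        rule -inf: not_defined pingd or pingd lte 0

    # stonith:null: dummy fencing device that always reports success (testing only?)
    primitive st_null stonith:null \
        params hostlist="nodeA nodeB"
    clone cl_st_null st_null
    property stonith-enabled="true"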
As an alternative to stonith:null I was thinking about:

- suicide as a stonith device (I have this on my openSUSE 12.3 openais/pacemaker/corosync box), but I have read a lot of people recommending against it;

- "sbd"... but this software-based fence device should not be used on top of a DRBD resource. And to make me crazier, the same guide gives me hope again by saying that I might be able to use iSCSI over DRBD for sbd fencing (or perhaps that's a mistranslation)! Below are the exact words:

"The SBD devices must not reside on a DRBD instance. Why? Because DRBD is not shared, but replicated storage. If your cluster communication breaks down, and you finally need to actually use stonith, chances are that the DRBD replication link broke down as well, whatever you write to your local instance of DRBD cannot reach the peer, and the peer's sbd daemon has no way to know about that poison pill it is supposed to commit suicide uppon. The SBD device may of course be an iSCSI LU, which in turn may be exported from a DRBD based iSCSI target cluster."

Is iSCSI over DRBD a real possibility for sbd fencing?

If you have recommendations or additional comments, please feel free to send them.

Regards,
Gilsberty

P.S.: I'm sorry for the size of the message.
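P.P.S.: If the iSCSI-over-DRBD approach for sbd is viable, this is roughly how I imagine the stonith resource would look (crm shell syntax again; the device path is only a placeholder for whatever the iSCSI LU shows up as on my nodes):

    primitive stonith_sbd stonith:external/sbd \
        params sbd_device="/dev/disk/by-id/MY-ISCSI-SBD-LU"
    # (plus the sbd daemon running on both nodes against the same device,
    # and stonith-enabled="true" as in the earlier snippet)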