----- Original Message -----
> From: "Digimer" <linux at alteeve.com>
> To: "William Seligman" <seligman at nevis.columbia.edu>
> Cc: drbd-user at lists.linbit.com
> Sent: Tuesday, August 30, 2011 12:24:16 PM
> Subject: Re: [DRBD-user] Corosync and DRBD fencing: one or both?
>
> On 08/30/2011 11:25 AM, William Seligman wrote:
> > On 8/29/11 4:42 PM, Digimer wrote:
> >> On 08/29/2011 03:36 PM, William Seligman wrote:
> >>> A general question: I have a Corosync+Pacemaker with DRBD setup
> >>> on Linux; I'll give the details if it's relevant.
> >>> Corosync+Pacemaker controls DRBD start, stop, and promotion.
> >>> I've implemented fencing via STONITH as Corosync resources.
> >>>
> >>> I have not put fencing in the drbd.conf file; I was under the
> >>> impression that Corosync+Pacemaker would take care of STONITHing
> >>> a node if there's a DRBD problem. Is this correct? Or should I
> >>> have fencing/STONITH configured in both Corosync and drbd.conf?
> >>>
> >>> Does the answer change between a primary/secondary versus
> >>> dual-primary setup?
> >>
> >> You still want to configure fencing, but you can use the
> >> 'crm-fence-peer.sh' handler. Using this with
> >> 'resource-and-stonith' will tell DRBD to block I/O until the
> >> fence succeeds, preventing it from going dual-primary (even if
> >> just for the brief moment between fault and fence).
> >
> > I may be dense, but I find the answer ambiguous; perhaps I didn't
> > ask the question the right way.
> >
> > Let me ask in a different way: If I have fencing set up in
> > corosync, and corosync controls drbd, do I also need fencing in
> > drbd.conf?
>
> Yes you do.
>
> There is the potential for a period of time between the fault and
> its detection by Pacemaker. During this time, if DRBD is not
> appropriately configured, both sides could go StandAlone/Primary.
> Once that happens, you've got a split-brain.
>
> The 'crm-fence-peer.sh' is used in drbd.conf to let DRBD block IO and
> call a fence via pacemaker. This will result in two fence calls,
> which is obviously overkill, but that isn't what we're after. The
> corresponding "resource-and-stonith" argument is what matters. That
> is what will block IO at the DRBD level until the fence call succeeds.
>

My 2 cents: If you have more than just DRBD under Pacemaker, you *may* not want to fence a node just because the DRBD connection has failed while other services are still functioning properly... *but* you would still want to prevent DRBD on each node from thinking all is well and going its own separate way.

So I personally let Pacemaker handle STONITH: if comms between nodes fail, then STONITH is necessary. At the DRBD level, however, I use crm-fence-peer with resource-only fencing. This way, if replication/comms within DRBD break, one node is fenced, which prevents it from promoting the DRBD resource, but the node is not STONITH'd.

I personally don't want all my services and other DRBD resources migrating just because something may have gone haywire with a single DRBD resource, as opposed to the whole node.

HTH

Jake
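
For reference, the DRBD side of the setup described above might look roughly like the fragment below. Treat it as a sketch rather than a tested config: the resource name "r0" and the /usr/lib/drbd script path are assumptions, and it assumes the DRBD 8.x layout, where the fencing policy lives in the disk section.

```
# Sketch only: "r0" and the /usr/lib/drbd path are assumptions.
resource r0 {
  disk {
    # resource-only: call the fence-peer handler, but do not freeze I/O.
    # resource-and-stonith: freeze I/O until the fence-peer handler returns.
    fencing resource-only;
  }
  handlers {
    # Asks Pacemaker to keep the peer from being promoted.
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Lifts that restriction once the peer has resynced.
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
  # Devices, addresses, and other settings are left as they are.
}
```

Changing the fencing line to resource-and-stonith would give the behaviour Digimer describes, with I/O blocked until the fence call succeeds.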