Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On Tue, Sep 7, 2010 at 3:37 AM, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
> On Mon, Sep 06, 2010 at 10:02:51PM +0800, jan gestre wrote:
>> On Mon, Sep 6, 2010 at 8:48 PM, Lars Ellenberg
>> <lars.ellenberg at linbit.com> wrote:
>> > On Mon, Sep 06, 2010 at 08:34:40PM +0800, jan gestre wrote:
>> >> Hi Everyone,
>> >>
>> >> I've found a drbddisk modification that blocks takeover when the
>> >> local resource is not in a safe state. However, it only works with a
>> >> single resource; since I have two resources, r0 and r1, it does not
>> >> work for me.
>> >>
>> >> case "$CMD" in
>> >>     start)
>> >>         # forbid becoming primary if the resource is not clean
>> >>         # (counts the /proc/drbd lines that are Connected and locally
>> >>         #  UpToDate; the "-ne 1" test assumes exactly one resource)
>> >>         DRBDSTATEOK=`cat /proc/drbd | grep ' cs:Connected ' |
>> >>                      grep ' ds:UpToDate/' | wc -l`
>> >>         if [ $DRBDSTATEOK -ne 1 ]; then
>> >>             echo >&2 "drbd is not in Connected/UpToDate state. refusing to start resource"
>> >>             exit 1
>> >>         fi
>> >>
>> >> I would be truly grateful if anyone could show me how to adapt this
>> >> modification.
>> >>
>> >> I'm trying to prevent a split-brain scenario here, and I'm still
>> >> testing my setup. I was in a predicament earlier where r1 was in a
>> >> healthy state while r0 was standalone in Primary/Unknown state, and
>> >> I had to issue "drbdadm -- --discard-my-data r0" to resolve the
>> >> split brain.
>> >
>> > No Sir.
>> >
>> > What if the Primary dies? Hard?
>> > You now want your Secondary to take over, no?
>> > Well, you cannot anymore, because it is not Connected.
>> > How could it be? You just lost the peer ;-)
>> >
>> > Don't focus only on one specific scenario.
>> > Because if you just "fix" that specific scenario,
>> > you break a truckload of others.
>> >
>> > Maybe it helps a bit to read
>> > http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg04312.html
>>
>> Thanks Lars, but now I am confused; maybe you can enlighten me. You're
>> saying that I would be better off without the modification. What then
>> would you recommend to prevent split brain? Adding a STONITH device,
>> e.g. an IBM RSA? Adding handlers like dopd?
>>
>> BTW, I got the modification from this URL -->
>> http://lemonnier.se/erwan/blog/item/53/
>
> Which is misleading.
> And it is not an attempt to avoid split brain,
> but to avoid diverging data sets,
> one of the ill consequences a split brain can lead to.
>
> What the presented patch does is disable takeover in case the Primary node dies.
> So why have heartbeat in the first place?
>
> I'll partially quote that blog:
>
> | Let's take an example: two nodes, N0 and N1. N0 is primary, N1 is secondary.
> | Both have redundant heartbeat links and at least one dedicated drbd
> | replication link. Let's consider the (highly) hypothetical case when the drbd
> | link goes down, soon followed by a power outage for N0. What will happen in a
> | standard heartbeat/drbd setup is that when the drbd link goes down, the drbd
> | daemon will set the local resources on both nodes in state 'cs:WFConnection'
> | (Waiting For Connection) and mark the peer data as outdated.
>
> So far that is correct. Where "the drbd daemon" would be dopd. Or, in a
> pacemaker cluster, you could also use the crm-fence-peer script to achieve
> a similar effect.
>
> | Then when N0
> | disappears due to the power outage, heartbeat on N1 will take over resources
> | and become the primary node.
>
> Which is wrong.
>
> First, drbd will refuse to be promoted if it is outdated.
> So this outdating seems to have not worked in the above setup.
> Fix it.
>
> | What we may want is to forbid a node to become primary in case its drbd
> | resources are not in a connected and up-to-date state.
>
> Which you already have: if it is Outdated, it cannot be promoted.
>
> Second, in a properly configured Pacemaker setup,
> Pacemaker (or rather, the drbd OCF resource agent) would already know,
> and not even try to promote it on the outdated node.
>
> Besides, it should be a very unlikely event that a just rebooted, isolated
> node decides to take over resources.
>
> Maybe you should increase your initdead time.
>
> Or wait for the connection before even starting heartbeat/pacemaker.
> In haresources-mode heartbeat clusters using drbddisk, the drbd
> wfc-timeout parameter is used for this, and its default is "unlimited",
> so by default the drbd init script would in most cases wait forever for drbd
> to establish a connection to its peer, thereby blocking the boot process on
> purpose. Heartbeat would only start once DRBD was able to establish its
> connection.
>
> Additionally, maybe add a third node, so you have real quorum?
>
> But it depends on you, and what you want to achieve, of course.
> There is no one single best way.
>
> The pacemaker list post about whether or not a DRBD setup needs STONITH
> (I put the link here again)
> http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg04312.html
> explains why, with DRBD (and basically any resource that is replicated
> rather than shared), STONITH alone is NOT SUFFICIENT at all, and
> resource-level fencing alone is not sufficient if you lose all communication
> paths at the same time (or in too quick succession for the chosen outdate
> mechanism to work).
> So if you are paranoid enough, you need both, and real quorum,
> and maybe on boot start nothing but sshd.
>
> And even then, I'm sure someone can come up with a multiple-failure scenario,
> possibly involving operator failure, to still get diverging data sets ;-)
>
> And, btw, this part does not make sense to me either:
>
> | if you are using a stonith device, you may want to modify the stonith script
> | to forbid stonithing the peer if the local resources are not in
> | connected/up-to-date state. There might indeed be a chance that the peer node
> | still is functional while the local node definitely is not.
>
> You need STONITH to make sure that a node you think is dead (i.e. one you can
> no longer communicate with -- but you still have the doubt that it may only be
> the communication that is broken, not the node) really is dead.
> Now you forbid the STONITH operation in case DRBD is not connected,
> i.e. not communicating with its peer.
> Wait.
> Wasn't communication failure the only reason you wanted to use STONITH in the
> first place?
>
> --

Many thanks Lars for your very informative response. Yes, that is the only
reason I wanted to use STONITH. I'm just using an R1-style configuration, so
I'm not sure whether all of the above still applies.
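
A few sketches to make the configuration points discussed above concrete; all
of them are illustrative, not tested drop-in snippets.

First, the per-resource check the opening question asks for could be written
with drbdadm instead of parsing /proc/drbd. The sketch below is only a
mechanical answer to that question (the resource names r0 and r1 come from the
thread, everything else is assumed), and it inherits exactly the flaw Lars
points out: a crashed peer is never Connected, so this also blocks legitimate
takeover.

    # Hypothetical per-resource variant of the check from the blog post.
    # WARNING: as explained in the thread, gating promotion on cs:Connected
    # defeats failover, because a dead peer is by definition not Connected.
    for RES in r0 r1; do
        CSTATE=$(drbdadm cstate "$RES")     # e.g. "Connected"
        DSTATE=$(drbdadm dstate "$RES")     # e.g. "UpToDate/UpToDate"
        if [ "$CSTATE" != "Connected" ] || [ "${DSTATE%%/*}" != "UpToDate" ]; then
            echo >&2 "$RES is $CSTATE/$DSTATE; refusing to start resource"
            exit 1
        fi
    done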
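
Lars's actual recommendation is resource-level fencing: dopd in a heartbeat
R1/haresources cluster, or the crm-fence-peer script under Pacemaker. A
minimal sketch of the pieces dopd needs, roughly as documented for the DRBD
8.3 era; the binary paths and the "-t 5" timeout are illustrative and vary by
distribution (shown for r0, the same applies to r1):

    # /etc/drbd.conf (per resource, or in the "common" section)
    resource r0 {
        disk {
            fencing resource-only;   # outdate the peer's data instead of
                                     # fencing the whole node
        }
        handlers {
            # invoked by a disconnected Primary to mark the peer's data as
            # Outdated, via dopd over the remaining heartbeat links
            fence-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
        }
    }

    # /etc/ha.d/ha.cf -- run dopd under heartbeat
    respawn hacluster /usr/lib/heartbeat/dopd
    apiauth dopd gid=haclient uid=hacluster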
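
"Increase your initdead time" refers to heartbeat's ha.cf: a just-booted node
then waits longer before it starts making takeover decisions. The values below
are examples, not a recommendation:

    # /etc/ha.d/ha.cf
    keepalive 2       # heartbeat interval, in seconds
    deadtime  30      # after this long without heartbeats a node is declared dead
    initdead  120     # used instead of deadtime right after boot; should be at
                      # least twice deadtime, and generous enough for slow startups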
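
The "wait for connection before even starting heartbeat" behaviour Lars
describes is controlled by the startup section of drbd.conf; a sketch with the
default spelled out (the degr-wfc-timeout value is only an example):

    # /etc/drbd.conf
    common {
        startup {
            wfc-timeout       0;   # init script waits for the peer connection
                                   # forever (0 = unlimited, the default Lars
                                   # refers to)
            degr-wfc-timeout 120;  # shorter wait if the cluster was already
                                   # degraded before the reboot
        }
    }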