On Sat, Sep 13, 2008 at 04:00:34AM +0200, Lars Marowsky-Bree wrote:
> I'm sorry, I wasn't aware that that was what you were looking for, and
> the web page describes all scenarios when pacemaker delivers a
> notification to an RA (basically, whenever the peer changes state).

and none of them is useful for _outdating_.
some of them may be useful for "unfreezing".

> > Iff I'd get a signal in the RA with the appropriate meaning
> > at the appropriate time, I'd just say "drbdadm outdate resource".
> > that is what dopd does now.
>
> I think that "outdate" mechanism as it stands today might need some
> minor changes, yes. Just as the logic in the RA surely needs to, and
> possibly we even need to improve m/s if we find a lack there.

so. what you are suggesting is:

when drbd loses the replication link, the primary freezes and calls out
to userland, telling heartbeat that the peer has "failed", in which case
heartbeat would stop drbd on the secondary. the primary then either
 * receives "secondary was stopped", maybe stores to meta data
   "_I_ am ahead of peer" (useful for a cluster-wide crash/reboot later),
   and unfreezes, or
 * is being stopped itself (which would result in the node being
   self-fenced, as the fs on top of drbd cannot be unmounted while drbd
   is frozen, ...), or
 * is even being shot as the result of a cluster partition.
so either the primary continues to write, or it will soon look like a
crashed primary.

the secondary sets a flag "primary may be ahead of me", then waits for
either
 * being stopped, in which case it would save to meta data
   "primary _IS_ ahead of me", or
 * being told that the primary was stopped, when it would clear that
   flag again, maybe store to meta data "_I_ am ahead of peer", and then
   most likely soon after be promoted.

while drbd has the "peer may be ahead of me" flag set, i.e. basically
while drbd is not connected and no "certain" flag is set yet, it will
refuse to be promoted.

Did I get that right?
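the flag logic above can be sketched as a tiny state model. to be clear:
this is only an illustration of the rule as I described it, not DRBD's
actual implementation, and all names in it are made up:

```python
# Illustrative sketch (NOT DRBD source code) of the promotion-gating
# rule: while the uncertain "peer may be ahead of me" flag is set and
# no "certain" flag has replaced it, refuse to be promoted.
# All flag and function names here are hypothetical.

from dataclasses import dataclass

@dataclass
class DrbdFlags:
    connected: bool = True
    peer_may_be_ahead: bool = False   # uncertain, set when the link drops
    i_am_outdated: bool = False       # certain: peer IS ahead of me
    peer_is_outdated: bool = False    # certain: _I_ am ahead of peer

def on_replication_link_lost(f: DrbdFlags) -> None:
    # primary freezes / secondary marks "primary may be ahead of me"
    f.connected = False
    f.peer_may_be_ahead = True

def on_peer_confirmed_stopped(f: DrbdFlags) -> None:
    # "peer was stopped": clear the uncertain flag and record the
    # certain "_I_ am ahead of peer" flag in meta data
    f.peer_may_be_ahead = False
    f.peer_is_outdated = True

def may_promote(f: DrbdFlags) -> bool:
    if f.i_am_outdated:
        return False          # peer is certainly ahead of us
    if f.peer_is_outdated:
        return True           # we are certainly ahead of the peer
    # no certain flag yet: promote only if we never lost track of the peer
    return not f.peer_may_be_ahead

f = DrbdFlags()
on_replication_link_lost(f)
print(may_promote(f))         # False: uncertain, refuse promotion
on_peer_confirmed_stopped(f)
print(may_promote(f))         # True: peer is known to be outdated
```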
[note that drbd has both "certain" flags already implemented, namely
"I am outdated" = peer IS ahead of me, and
"peer is outdated" = _I_ am ahead of peer]

some questions:

wouldn't that "peer has failed" first trigger a monitor?
wouldn't that mean that, on monitor, a not-connected secondary would
have to report "failed", as otherwise it would not get stopped?
wouldn't that prevent normal failover?
if not, wouldn't heartbeat try to restart the "failed" secondary?
what would happen?

what does a secondary do when started, and it finds the
"primary IS ahead of me" flag in meta data?
refuse to start even as slave?
(that would prevent it from ever being resync'ed!)
start as slave, but refuse to be promoted?

[note that a typical DRBD cluster deployment is still two-node, in case
that matters]

problem: secondary crash. the secondary reboots, heartbeat rejoins the
cluster. the replication link is still broken. the secondary does not
have the "primary IS ahead of me" flag in meta data, because due to the
crash there was no way to store it.
would heartbeat try to start drbd (slave) here?
what would trigger the "IS ahead of me" flag to get stored on disk?
if for some reason the policy engine now figures the master should
rather run on the just-rejoined node, how can that migration be
prevented?

and so on and on. there are many scenarios.
I'm still not convinced that this method covers as many of them, as
well, as dopd does. but, at least, it is getting closer...

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed