On Saturday 09 June 2007 12:59:08 Joost van den Broek wrote:
> Hi,
>
> I'm currently building a system with two primary nodes with ocfs2 on top
> of drbd.
[...]
> E.g. the cable on eth1 gets disconnected, both hosts will still be
> available through eth0, thus load-balancing continues to happen. Of
> course, this is very bad behaviour and both hosts will get almost
> inconsistent data immediately.
>
> I would think that there should be some ping check through the other
> interface to ensure the other host has died completely, and if it's
> still reachable, one host should get the inconsistent status (or even
> panic). Or are there other ways to do what I want?

ocfs2 will "fence" a host if it loses its connection to that host or to
the storage. In earlier versions it does this by panicking, and I believe
later versions offer the option of rebooting instead.

The downside is, both hosts will probably panic. The upside is, if it's a
temporary problem, rebooting might solve it -- the boot scripts will wait
for both hosts to come up.

It looks like drbd itself likes to rely on heartbeat to handle this kind
of situation. There are plenty of options for how to proceed when
connectivity is restored, even to the point of panicking one host, but I
don't see any options for what happens at the disconnect.

It does look like heartbeat could be scripted to do what you want,
though, assuming ocfs2 doesn't just panic everything. I would hardcode
one host to automatically assume it's the primary (and take over), and
the other to automatically die, assuming they can still find each other.
On the secondary, you'd do:

    <insert commands to kill -9 apache or whatever. Also may want to
     kill anything you find holding the device open (fuser -m).>
    ifdown eth0    # (on Debian-like systems.)
    umount /dev/drbd0
    drbdadm secondary r0
    drbdadm invalidate r0

Because you've brought down one interface on purpose, and the other is
down anyway, the primary node should figure out that it's alone, and
could be configured to take over the secondary's IP address. If that does
happen, you may not even have to reconfigure your load balancing; it'll
just "load balance" over the same box.

Then, when connectivity is restored (via eth1), you'd just:

    drbdadm primary r0
    mount /dev/drbd0
    ifup eth0
    <insert commands to start apache or whatever>

Disclaimer: I've never used heartbeat, just a poor-man's hack with ping
and cron. I may have no clue what I'm talking about. However, the docs
look pretty thorough over at http://www.linux-ha.org/
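
For what it's worth, the ping check Joost asks about, combined with the hardcoded primary/secondary roles described above, might be sketched roughly like this. It is only an illustration of the poor-man's ping-and-cron idea, not heartbeat configuration: the peer addresses, the ROLE variable, the --check argument and the function names reachable and decide are all made up for the example, and ping -W (reply timeout in seconds) assumes the Linux iputils ping. The script only prints a decision; the actual step-down commands are the ones listed above.

```shell
#!/bin/sh
# Hypothetical settings, not from the original post; adjust per host.
PEER_ETH0="${PEER_ETH0:-192.168.0.2}"  # peer address on the public link (assumed)
PEER_ETH1="${PEER_ETH1:-10.0.0.2}"     # peer address on the drbd link (assumed)
ROLE="${ROLE:-secondary}"              # hardcoded per host: primary or secondary

# reachable ADDR: succeeds if a single ping is answered within 2 seconds.
reachable() {
    ping -c 1 -W 2 "$1" >/dev/null 2>&1
}

# decide ROLE ETH0_UP ETH1_UP: print what this host should do.
# ETH0_UP / ETH1_UP are "yes" or "no" (is the peer reachable over that link).
decide() {
    role=$1
    eth0_up=$2
    eth1_up=$3
    if [ "$eth1_up" = yes ]; then
        # Replication link is fine: nothing to do.
        echo ok
    elif [ "$eth0_up" = yes ]; then
        # Replication link is down but the peer is still alive:
        # the hardcoded secondary gives up, the primary carries on.
        if [ "$role" = primary ]; then
            echo stay-primary
        else
            echo step-down
        fi
    else
        # Peer not reachable at all: leave it to fencing / manual recovery.
        echo peer-unreachable
    fi
}

# Only probe the network when explicitly asked to, e.g. from cron.
if [ "${1:-}" = "--check" ]; then
    if reachable "$PEER_ETH1"; then e1=yes; else e1=no; fi
    if reachable "$PEER_ETH0"; then e0=yes; else e0=no; fi
    decide "$ROLE" "$e0" "$e1"
fi
```

Run from cron with --check, a wrapper on the secondary could react to "step-down" by running the kill/ifdown/umount/drbdadm sequence above. Keeping the decision in its own function means it can be tried out by hand (for example: decide secondary yes no) without unplugging any cables.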