On Fri, Sep 05, 2008 at 10:15:49PM +0100, Henri Cook wrote:
> Hi all, I have a bizarre problem I'm hoping you can help me with.
>
> Node A and Node B have /dev/drbd0 mounted in Primary-Primary on /shared
>
> If Node B reboots, Node A stays online with the drive mounted and
> resyncs normally upon its return.
>
> If, however, there is an FTP transfer in progress to /shared on Node A
> when Node B gets rebooted, as soon as Node A loses the DRBD connection
> (Primary/Unknown) it chooses to reboot itself also.
>
> This obviously means my HA setup is going down in a sort of chain
> reaction when under load - have I missed some obvious on-net-loss reboot
> type option?
>
> UPDATE: It appears that when I just reboot Node B with no active
> transfer, it registers as 'WFConnection' whereas if I reboot with an
> active transfer it registers as a 'NetworkFailure' - safe to assume
> then that default NetworkFailure behaviour is to reboot - can anyone
> tell me how to change this??

No. DRBD does no such thing.

"NetworkFailure" is just one of the normal transitional states DRBD goes
through if the replication link "goes away" unexpectedly for whatever
reason, and it is expected to settle "quickly" into one of the less
transient states like StandAlone or WFConnection.

DRBD on occasion calls "user space helpers" called handlers.
Verify whether you have any halt/reboot/switch-off commands configured
there (though I don't see a reason for any of those helpers to be called
in this scenario).

OCFS2 does "ping to disk" and "self fencing" if this "ping to disk"
cannot be served within a (configurable) timeout.
Maybe that is your problem?

You could also update to DRBD 8.0.13 and see if that helps.

Regarding debugging messages of DRBD:
you can make DRBD very noisy by echoing some values into
/sys/module/drbd/parameters/trace*
For the values, see drbd_int.h in the DRBD sources and search for
TraceLevel and TraceType.
I don't think that will help you, but you asked for it (off-list).
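
As a rough aid for the handler check suggested above, here is a minimal
sketch that scans the text of a drbd.conf for handler commands that look
like they would halt or reboot the node. It is an illustration, not a DRBD
tool: the function name find_risky_handlers, the RISKY pattern, and the
SAMPLE configuration are all assumptions made for this example, and the
simple regex parsing assumes handlers blocks contain no nested braces and
no commented-out lines with quotes.

```python
import re

# Commands that would take the node down if a handler ran them
# (an illustrative list, not an exhaustive one).
RISKY = re.compile(r"\b(halt|reboot|poweroff|shutdown)\b|sysrq-trigger")


def find_risky_handlers(conf_text):
    """Return (handler_name, command) pairs whose command looks like a halt/reboot."""
    hits = []
    # Assumption: a handlers block has no nested braces, so a non-greedy
    # match up to the next "}" captures the whole block.
    for block in re.findall(r"\bhandlers\s*\{(.*?)\}", conf_text, flags=re.S):
        # Each entry has the shape:  name "command";
        for name, command in re.findall(r'([\w-]+)\s+"([^"]*)"\s*;', block):
            if RISKY.search(command):
                hits.append((name, command))
    return hits


# Hypothetical configuration fragment used only to exercise the function.
SAMPLE = '''
resource r0 {
  handlers {
    pri-on-incon-degr "echo b > /proc/sysrq-trigger ; reboot -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
  }
}
'''

if __name__ == "__main__":
    for name, command in find_risky_handlers(SAMPLE):
        print(name, "->", command)
```

To check a real node, one would pass the contents of its own configuration
file instead of SAMPLE. An empty result only means that no handler matched
the pattern; it says nothing about OCFS2 self-fencing, which is configured
elsewhere.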
-- 
: Lars Ellenberg
: LINBIT HA-Solutions GmbH
: DRBD®/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks
of LINBIT Information Technologies GmbH
__
please don't Cc me, but send to list -- I'm subscribed