I would very much like to try a newer DRBD version. However, compiling a
custom kernel on these machines, which have already been placed in a DC,
seems risky and makes future maintenance an issue (harder to upgrade).

The reboot is a hard reboot (I added some shutdown actions to /etc/rc6.d
to determine this). It could possibly be OCFS2, although I cannot find any
documented instances of any such action on its part (or DRBD's, of course).
I assume it's DRBD only because, when I'm running 'watch' on /proc/drbd,
the crash occurs within the same second as DRBD switches from
Primary/Primary to Primary/Unknown and registers the cstate NetworkFailure.

It is certainly something within DRBD or OCFS2. I can totally appreciate
this may have been fixed in a more recent version; I only wish I could try
it out. If it doesn't ring a bell with you, Lars, it may be something new
(although I can't imagine how no one else could have come across it in a
similar situation).

I'm likely switching to an active-passive configuration, as at this
version that's presumably the more tried and tested solution, and it
resolves this quickly for my client.

reboot-sane writes to a log file, sends an email, waits a few seconds,
then issues a standard 'reboot'.

I will request a KVM serial console to see if there are any messages as
you describe, a most helpful suggestion.

Thanks again for your response, and apologies for the frantic nature of my
posts, I'm under a lot of pressure to get it sorted! :-)

Henri

Lars Ellenberg wrote:
> On Sat, Sep 06, 2008 at 11:37:54PM +0100, Henri Cook wrote:
>
>> Sorry for another post, it's something i'm working on quite actively.
>>
>> So the problem then appears to be when a DRBD peer gets rebooted when
>> the mount is in use i.e. having a file transferred to it - the system
>> gets hard-rebooted (no shutdown actions are run
>>
>
> what makes you so sure about that?
> just because you have an "echo >> log" before a reboot does not mean
> that echo would make it to disk before the reboot, no?
>
> what does your "reboot-sane" do?
>
> do you have a logging serial console hooked up,
> so you would see any last second "sorry, I'm fencing myself" message
> from OCFS2?
>
>
>> ). Shall I assume this is
>> a kernel error or something that's been dealt with and raise a bug with
>> the Ubuntu-server team to port a version > 8.0.11?
>>
>
> well, you can, but how about first verify that a newer drbd version
> actually fixes it for you?
>

-- 
*Henri Cook*
Orion Internet Services Ltd.
T: +44 (0) 845 8621431
E: henri at orion-hosting.co.uk
Company registration number: 6365589 (registered in England & Wales)
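
On the point above that an "echo >> log" may never reach the disk before
a hard reboot: a minimal sketch of a logger that forces each line to
stable storage, plus a parser for the connection state seen in
/proc/drbd. The function names, the log path, and the exact layout of the
/proc/drbd device line (cs: followed by st: or ro:) are assumptions made
for illustration, not taken from the thread.

```python
import os
import re
import time


def durable_log(path, message):
    """Append one timestamped line and fsync it before returning.

    A plain append can sit in the page cache and be lost on a hard
    reboot; os.fsync asks the kernel to push it to the device first.
    """
    line = "%s %s\n" % (time.strftime("%Y-%m-%d %H:%M:%S"), message)
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, line.encode("utf-8"))
        os.fsync(fd)
    finally:
        os.close(fd)


# Assumed layout of a device line, e.g.
#   " 0: cs:Connected st:Primary/Primary ds:UpToDate/UpToDate C r---"
# Some versions are assumed to label the roles "ro:" instead of "st:".
STATUS_RE = re.compile(r"cs:(\S+)\s+(?:st|ro):(\S+)")


def parse_status(text):
    """Return (cstate, roles) from /proc/drbd-style text, or None."""
    match = STATUS_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A shutdown hook could call durable_log("/var/log/reboot-trace.log",
"rc6 reached") with a path on a local filesystem rather than on the OCFS2
mount, so the trace does not depend on the device under suspicion.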