[DRBD-user] Tales of woe - possible Debian Squeeze kernel bugs, possible DRBD bugs, possible Xen bugs...

Florian Haas florian at hastexo.com
Wed Jan 11 22:14:43 CET 2012



On Wed, Jan 11, 2012 at 5:56 PM, Adam Wilbraham
<adam.wilbraham at technophobia.com> wrote:
> I've spent the past couple of days trying to get a pair of servers
> into a stable state and thought I would write my issues up as a post
> and possible bug report whilst they are fresh in my head. I'm probably
> going to miss some bits of information out, but I'll try to get down
> what I remember. It's been a busy couple of days, and none of this may
> be useful for debugging: I tried many combinations and didn't log as
> much as I should have during troubleshooting, because I was blocking
> colleagues and therefore working against the clock.
>
> I started out with a pair of HP DL360 G6s which were built
> approximately a year ago, running Debian Squeeze (before it became
> stable) with Xen 4.0 on top, all from Apt, and with DRBD 8.3.10 built
> from source (I believe this was stable at the time). The pair of
> servers have only been used as internal development hosts and were
> never patched up when Debian went stable, so the kernel version was a
> little out of date
> (xen-linux-system-2.6.32-5-xen-amd64_2.6.32-30_amd64.deb), as were
> other packages. Over the last year the pair have been stable almost
> all of the time, but we did have a couple of incidents where both
> would reboot in tandem. Because they weren't business critical,
> resolving this was never a priority.
>
> Anyway, we moved the servers to a new location over the weekend and
> almost from the point of power up this parallel reboot issue reared
> its head. If the servers were sat there idling they would be fine, but
> the minute I started to boot up domUs I ran the risk of it happening.
> Normally, the more domUs were running, the more likely a reboot was.
> It seemed most likely to happen when starting another domU rather
> than while simply running VMs that were already online. Anyway, I
> eventually narrowed this down: if I unplugged the network cable used
> for DRBD replication, the server would spew output to the screen and
> reboot instantly, with the other one in the pair following about a
> second later.

You didn't have the chance to hook up a serial terminal and capture
the log messages that way, I suppose?

Any idea whether you were getting a kernel panic, or an oops?

And, just checking, you did follow
http://www.drbd.org/users-guide-8.3/s-xen-drbd-mod-params.html?

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now


