[DRBD-user] System lockup with DRBD

chambal 2iow-li6l at dea.spamcon.org
Sun Nov 7 07:04:45 CET 2010


Digimer <linux at alteeve.com> wrote:

>On 10-11-07 12:46 AM, chambal wrote:
>> Thanks for the help.  At the "modprobe drbd" step, the watch
>> window began showing:
>> 
>> version: 8.3.7 (api:88/proto:86-92)
>> srcversion: 582E47DEE6FD9EC45926ECF
>> 
>> And syslog showed:
>> 
>> Nov  6 21:31:35 f13-1 kernel: drbd: initialized. Version: 8.3.7
>> (api:88/proto:86-92)
>> Nov  6 21:31:35 f13-1 kernel: drbd: srcversion: 582E47DEE6FD9EC45926ECF
>> Nov  6 21:31:35 f13-1 kernel: drbd: registered as block device major 147
>> Nov  6 21:31:35 f13-1 kernel: drbd: minor_table @ 0xf6280d80
>> 
>> Then at the "drbdadm attach r0" step, it crashed.  Nothing
>> showed in the syslog or watch windows (it crashed instantly).
>> After reboot, nothing in syslog after the above lines, until the
>> new boot messages.
>
>So that rules out network issues. That's the step where it should start
>talking to that hardware. What is the underlying hardware, exactly? I
>know it's an Epia board... Can I guess these are very old models?

No, these are brand new current-model Mini-ITX from VIA.

>I've used DRBD extensively in the last several months on a wide array of
>hardware (~$300 test nodes to $6k+ servers) and have had problems, but
>never crashes like that. Does the crash happen on both nodes? What else
>is using the underlying disks?

I used an older version of DRBD without problems on older hardware.

I can't tell if the crash happens on both nodes since I can't
even get them talking.  It happens on any unit I try to set up
with DRBD.

It's a plain system with nothing special using the disk.  It's an
SSD with some partitions and no RAID or anything.  The 1GB
/dev/sda2 partition is created and formatted by the OS installer,
but then I remove it from fstab so it's only used for DRBD.

>On a hunch, I am going to guess it's a problem with the driver for the
>storage controller. To help confirm or rule out, how much can you
>simplify your nodes? I don't have a lot of experience with 32-bit these
>days, doubly so with PAE. Can you drop to <3GB of RAM and not use PAE as
>a test? Is there any odd or exotic hardware you can remove?

There's only 2GB RAM.  The only hardware is the Mini-ITX board
and the SATA SSD.

I don't know why the PAE kernel is installed; I didn't notice any
installer choice for it.  I installed the non-PAE kernel via yum,
made it the default in grub.conf, rebooted, rebuilt DRBD, and
tried again.  It showed the same info and crashed in the same way.

>I'd also be a bit curious to see if there is a difference should you
>take two drives, make a software RAID 1 array and then use the /dev/mdX
>device as the backing device. That makes it admittedly more complex, but
>perhaps might act as a buffer between DRBD and the storage.

Probably won't try this.

>Can you copy your setup to a different set of hardware, again to test?

Yes, I will try to do this, just as a sanity check, but I'm pretty
sure it is the hardware or the drivers.

>I'm throwing mud at a wall to see what sticks...

I'll take mud over nothing!




