[DRBD-user] System lockup with DRBD
chambal
2iow-li6l at dea.spamcon.org
Sun Nov 7 07:04:45 CET 2010
Digimer <linux at alteeve.com> wrote:
>On 10-11-07 12:46 AM, chambal wrote:
>> Thanks for the help. At the "modprobe drbd" step, the watch
>> window began showing:
>>
>> version: 8.3.7 (api:88/proto:86-92)
>> srcversion: 582E47DEE6FD9EC45926ECF
>>
>> And syslog showed:
>>
>> Nov 6 21:31:35 f13-1 kernel: drbd: initialized. Version: 8.3.7 (api:88/proto:86-92)
>> Nov 6 21:31:35 f13-1 kernel: drbd: srcversion: 582E47DEE6FD9EC45926ECF
>> Nov 6 21:31:35 f13-1 kernel: drbd: registered as block device major 147
>> Nov 6 21:31:35 f13-1 kernel: drbd: minor_table @ 0xf6280d80
>>
>> Then at the "drbdadm attach r0" step, it crashed. Nothing
>> showed in the syslog or watch windows (it crashed instantly).
>> After reboot, nothing in syslog after the above lines, until the
>> new boot messages.
>
>So that rules out network issues. That's the step where it should start
>talking to that hardware. What is the underlying hardware, exactly? I
>know it's an Epia board... Can I guess these are very old models?
No, these are brand new current-model Mini-ITX from VIA.
>I've used DRBD extensively in the last several months on a wide array of
>hardware (~$300 test nodes to $6k+ servers) and have had problems, but
>never crashes like that. Does the crash happen on both nodes? What else
>is using the underlying disks?
I used an older version of DRBD without problems on older hardware.
I can't tell if the crash happens on both nodes since I can't
even get them talking. It happens on any unit I try to set up
with DRBD.
It's a plain system with nothing special using the disk. It's an
SSD with a few partitions and no RAID or anything. The 1GB
/dev/sda2 partition is created and formatted by the OS installer,
but I then remove it from fstab so it's used only for DRBD.
>On a hunch, I am going to guess it's a problem with the driver for the
>storage controller. To help confirm or rule out, how much can you
>simplify your nodes? I don't have a lot of experience with 32-bit these
>days, doubly so with PAE. Can you drop to <3GB of RAM and not use PAE as
>a test? Is there any odd or exotic hardware you can remove?
There's only 2GB RAM. The only hardware is the Mini-ITX board
and the SATA SSD.
I don't know why the PAE kernel was installed; I didn't notice
any installer choice for it. I installed the non-PAE kernel via
yum, made it the default in grub.conf, rebooted, rebuilt DRBD,
and tried again. It showed the same info and crashed in the same
way.
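
For anyone repeating the non-PAE test, it may be worth confirming
which kernel actually booted before rebuilding DRBD. The following
is only a minimal sketch: the kernel_flavor function is a
hypothetical helper, and it assumes Fedora-style naming where PAE
kernel releases end in ".PAE".

```shell
#!/bin/sh
# Classify a kernel release string as PAE or non-PAE.
# Assumption: Fedora-style naming, where PAE builds end in ".PAE".
# kernel_flavor is a hypothetical helper, not part of DRBD or Fedora.
kernel_flavor() {
    case "$1" in
        *PAE) echo "PAE" ;;
        *)    echo "non-PAE" ;;
    esac
}

# Report the flavor of the currently running kernel.
kernel_flavor "$(uname -r)"
```

If this prints "PAE" after the grub.conf change, the old kernel is
still the one being booted and the test has not yet been run.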
>I'd also be a bit curious to see if there is a difference should you
>take two drives, make a software RAID 1 array and then use the /dev/mdX
>device as the backing device. That makes it admittedly more complex, but
>perhaps might act as a buffer between DRBD and the storage.
Probably won't try this.
>Can you copy your setup to a different set of hardware, again to test?
Yes, I will try to do this, just as a sanity check, but I'm pretty
sure it is the hardware or the drivers.
>I'm throwing mud at a wall to see what sticks...
I'll take mud over nothing!