Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.

/ 2004-09-24 16:28:24 +0100
\ David Goodwin:
> Hi,
>
> When installing DRBD on a pair of shuttle computers (sk41s), I find that
> it functions fine apart from after I simulate a failover.
>
> Powering off the primary machine, the secondary takes over ok. But when
> the failed machine rejoins the cluster it undertakes the partial resync
> and then hangs hard (part of the way through the resync).
>
> Any suggestions on how I can debug this further? Keyboard is locked up
> (caps lock doesn't work etc), and I've tried alt+sysrq+[srbp] etc. No
> messages are written out to the console (aside from normal boot up
> messages).
>
> I'm guessing it's hardware related, but the same systems only appear to
> have problems when running drbd. Other mirroring software (e.g. under
> Windows) hasn't behaved in a similar manner.

which mirroring software?
did you do a memtest?
does your via southbridge flip random bits during high DMA load?

no further idea at this point. sorry.

> Interestingly enough, this behaviour is repeatable (i.e. if I power
> cycle the hung machine, and let it boot normally it will hang again, and
> again; 1 in 4 times it seems to succeed ok, and things go back to normal).
>
> I have noticed that if I reboot the hung machine into single user mode,
> and then start the network and drbd, it doesn't hang. I've tried
> disabling some random init scripts (e.g. hotplug, shmfs etc) but it
> doesn't help.
>
> </sigh>
>
> OS is SLES8 with the 2.4.21 kernel. DRBD version 0.7.4. (The same
> problems were experienced with drbd-0.6.12, although it didn't seem to
> be as consistent with regards to failing to undertake the resync after
> the failover).
>
> Suggestions welcome :)
>
> thanks
> David.

	Lars Ellenberg

-- 
please use the "List-Reply" function of your email client.