[DRBD-user] DRBD failed - went to 'stale'.
Florian Haas
florian.haas at linbit.com
Mon Dec 10 14:10:30 CET 2007
Ben,
from the little information you have given, I can only guess which issue you
really ran into. But what you have shared sounds a bit like a resource
starvation deadlock that was fixed in 8.0.8. To work around it in an 8.0.6
cluster, see whether raising max-buffers to something considerably higher
than the default makes the stall disappear. Try 40000 (four zeroes).
Upgrading to 8.0.8 is recommended, though.
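
As a rough sketch of that workaround, the setting would go into the net
section of the affected resource in /etc/drbd.conf. The resource name "r0" is
only a placeholder, since I haven't seen your configuration; keep whatever
other options your net section already has.

```
resource r0 {              # "r0" is a placeholder resource name
  net {
    max-buffers 40000;     # the value suggested above
  }
}
```

Make the same change on both nodes. Running "drbdadm adjust r0" (again with
your actual resource name) should then apply it, but check the drbd.conf and
drbdadm man pages for your version to be sure.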
But as I said, you didn't give much information, so I'm reduced to guessing.
If the above suggestion doesn't work, please post a full description of your
issue, including your /etc/drbd.conf, and pertinent log snippets.
There also appear to be some misconceptions about DRBD here that I'd like to
clarify.
> >> I am running 8.0.6. I had a complete failure of a server. On reboot
> >> both my drbd nodes started a re-sync, and then jumped to 'stale', where
> >> they stuck indefinitely.
> This is correct, it was 'stalled'. (In my panic to get our servers
> running, I didn't take a copy of /proc/drbd at the time :)
"Both nodes started a re-sync" is the wrong wording here. _DRBD_ started a
resync, which means one node became SyncSource and the other SyncTarget.
Which in turn means that only the data on the SyncTarget is considered
Inconsistent.
The SyncSource which has the UpToDate disk is perfectly usable at this time.
You can make it Primary, mount it, run your application as normal. Thus,
there is no reason to panic "to get servers running" again. You can run your
DRBD-enabled services while the resync is in progress -- even if it's in fact
not progressing. :-)
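
To illustrate the point, here is a minimal Python sketch that reads the disk
states from a /proc/drbd status line and reports whether the local disk is
UpToDate, and so usable as Primary. The sample line is hypothetical, written
in the style of DRBD 8.0 output rather than taken from your system, and the
function name is my own invention.

```python
import re

# Hypothetical status line in the style of /proc/drbd on DRBD 8.0.
# It is illustrative only and not taken from the poster's cluster.
SAMPLE = " 0: cs:SyncSource st:Secondary/Secondary ds:UpToDate/Inconsistent C r---"


def local_disk_usable(status_line):
    """Return True if the local disk state (left of the slash in ds:) is UpToDate."""
    match = re.search(r"ds:(\w+)/(\w+)", status_line)
    if match is None:
        raise ValueError("no ds: field found in status line")
    local, _peer = match.groups()
    return local == "UpToDate"


if __name__ == "__main__":
    print(local_disk_usable(SAMPLE))
```

On the SyncSource the ds: field starts with UpToDate, so the check passes
whether or not the resync is making progress. On the SyncTarget the local
state is Inconsistent and the check fails.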
Hope this helps.
Cheers,
Florian
--
: Florian G. Haas
: LINBIT Information Technologies GmbH
: Vivenotgasse 48, A-1120 Vienna, Austria