Ben,

from what little information you have given, I can only guess what
issue you really ran into. But what you have shared sounds a bit like a
resource starvation deadlock issue that was fixed in 8.0.8.

To work around it in an 8.0.6 cluster, see if changing max-buffers to
something considerably higher than the default causes the stall to
disappear. Try 40000 (four zeroes). Upgrading to 8.0.8 is recommended,
though.

But as I said, you didn't give much information, so I'm reduced to
guessing. If the above suggestion doesn't work, please post a full
description of your issue, including your /etc/drbd.conf and pertinent
log snippets.

There also appear to be some misconceptions about DRBD in place here
that I'd like to clarify.

> >> I am running 8.0.6. I had a complete failure of a server. On reboot
> >> both my drbd nodes started a re-sync, and then jumped to 'stale', where
> >> they stuck indefinitely.

> This is correct is was 'stalled'. (In my panic to get our servers
> running, I didn't take a copy of /proc/drbd at the time :)

"Both nodes started a re-sync" is the wrong wording here. _DRBD_
started a resync, which means one node became SyncSource and the other
SyncTarget. That in turn means that only the data on the SyncTarget is
considered Inconsistent. The SyncSource, which has the UpToDate disk, is
perfectly usable at this time. You can make it Primary, mount it, and
run your application as normal.

Thus, there is no reason to panic "to get servers running" again. You
can run your DRBD-enabled services while the resync is in progress --
even if it's in fact not progressing. :-)

Hope this helps.

Cheers,
Florian

-- 
: Florian G. Haas
: LINBIT Information Technologies GmbH
: Vivenotgasse 48, A-1120 Vienna, Austria
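
For reference, the max-buffers workaround suggested above might look roughly like this in /etc/drbd.conf. This is only a sketch: the resource name r0 is a placeholder that does not come from this thread, and the rest of the resource definition (devices, disks, addresses) is omitted, so it should be merged into the existing configuration rather than copied as-is.

```
# /etc/drbd.conf (fragment; "r0" is a hypothetical resource name)
resource r0 {
  net {
    # Raise the buffer limit well above the default, as suggested above.
    max-buffers 40000;
  }
  # ... existing "on <host> { ... }" sections stay unchanged ...
}
```

After editing the file on both nodes, something like `drbdadm adjust r0` should apply the change to the running resource, and `cat /proc/drbd` shows whether the resync is progressing again.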