[DRBD-user] "PingAck not received" messages

Matthew Bloch matthew at bytemark.co.uk
Thu May 24 13:53:04 CEST 2012


On 24/05/12 12:30, Florian Haas wrote:
> On Wed, May 23, 2012 at 9:57 PM, Matthew Bloch <matthew at bytemark.co.uk> wrote:
>>> "drbdsetup /dev/drbdX show" to a pastebin please?
>>
>> It's not that long, here's one:
>>
>> $ sudo drbdsetup /dev/drbd0 show
>> disk {
>>         size                    0s _is_default; # bytes
>>         on-io-error             pass_on _is_default;
>
> You want to change that to "detach". Unrelated to your PingAck problem, though.
>
>>         fencing                 dont-care _is_default;
>
> This is a bad idea too, but since you're evidently not using a cluster
> manager at all (which happens to be a bad idea as well), it probably
> doesn't make that much of a difference. Again, unrelated to PingAck
> issues.

Hmm, thanks.  Unrelated to any of this, the v3a kernel (Debian 2.6.32-4)
crashed pretty badly 48 hours ago.  Since the reboot there have been no
"PingAck not received" messages.

So assuming we get a week free of these messages, I'm guessing there was
a DRBD bug of some kind that the reboot cleared up.

We are preparing to jump to a 2.6.32 sourced from CentOS because this 
Debian kernel seems to crash with one bug or another every few months.

The reason we're using external meta-devices is for backup: without the 
metadata at the end, the underlying disk image represents exactly what 
the VMs see.  We can then snapshot this and take a reasonably consistent 
backup without bothering DRBD.  We later verify this backup by booting 
it back up, disconnected, and taking a snapshot of the VNC console!

The reason I picked protocol B is that LVM snapshots kill the local
DRBD performance if we snapshot the LVM device underlying the DRBD
Primary.  If we snapshot the Secondary instead and use protocol B, where
writes on the Primary don't wait on the Secondary's disk, my working
theory was that the performance hit wouldn't be as noticeable, and the
customer seemed to concur (previously we were using C).
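
For anyone following along, here is a rough sketch of the kind of
resource definition described above: external metadata on its own
volume, and protocol B.  It assumes DRBD 8.3-style drbd.conf syntax, and
the resource name, hostnames, LVM volume names and addresses are all
made up for illustration; it is not a copy of our real configuration.

```
# Sketch only.  Resource name, hostnames, volumes and addresses are
# hypothetical; syntax assumes a DRBD 8.3-style drbd.conf.
resource r0 {
    # B: a write completes once it is on the local disk and the peer
    # has received it, without waiting for the peer's disk write.
    protocol B;

    on alpha {
        device     /dev/drbd0;
        disk       /dev/vg0/vm-disk;       # holds only what the VM sees
        meta-disk  /dev/vg0/drbd-meta[0];  # external metadata, index 0
        address    192.0.2.1:7788;
    }

    on bravo {
        device     /dev/drbd0;
        disk       /dev/vg0/vm-disk;
        meta-disk  /dev/vg0/drbd-meta[0];
        address    192.0.2.2:7788;
    }
}
```

With the metadata kept on a separate volume, a snapshot of the backing
volume (vg0/vm-disk in this sketch) contains no DRBD metadata at the
end, which is what lets it be booted directly for verification.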

-- 
Matthew