>Then on giveback, the system started doing a full sync:
>Jul 28 08:45:55 dbtools01 kernel: drbd0: Resync started as SyncTarget (need to sync 146669568 KB [36667392 bits set]).
>
>Is this expected behaviour? I would have expected that the quick resync
>would have taken place.
>
>
I'm not answering your question directly, but this is related. We try to
stay away from the DiskLessClient state, because of the resulting
FullSync, by using the panic-on-I/O-error setting:
disk { on-io-error panic; }
Complementary kernel sysctls make the node panic on a kernel oops and
reboot after a panic (to avoid losing the BitMap):
# sysctl -a | grep panic
kernel.panic_on_oops = 1
kernel.panic = 1
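To keep these settings across reboots, something like the following
could go into /etc/sysctl.conf. This is only a sketch: the file location
is an assumption (some distributions read /etc/sysctl.d/ instead), and
the values are simply the ones shown above.

```
# Assumed location: /etc/sysctl.conf (reload with `sysctl -p`)
# Turn a kernel oops into a full panic
kernel.panic_on_oops = 1
# Reboot 1 second after a panic instead of hanging
kernel.panic = 1
```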
It doesn't look nice, but it works very well: we have failed over very
large devices quite often and never had a FullSync. Since we're primarily
exporting highly available NFSv3, the clients can continue their work as
if nothing happened. (We use NFS over UDP. I still have to see whether an
NFS-over-TCP client can handle a failover; if it can, I'll switch to TCP
to avoid the high load caused by packet reassembly.) But back to your
question: I guess the state of a device that has been detached must
indeed default to very unclean, since DRBD doesn't know whether you
altered the device while it was not in use by DRBD.
--Leroy