[DRBD-user] drbdadm verify always report oos

d tbsky tbskyd at gmail.com
Wed May 18 18:27:00 CEST 2016

    I shut down the VM when I found the strange behavior, so DRBD
is resyncing while idle. I tried different config options and
resynced about 15 times, but still cannot get verify to report 0 oos.

   I have about 10 resources with the verify problem, but strangely
some resources are OK. The largest resource is about 1T and it is
OK. The resource I am testing now is only 32G.

  The host storage stack is: two SATA disks (mdadm RAID 1) -> LVM -> DRBD

  The host has ECC RAM, so the memory should be OK. The most confusing
part is that the resynced data is very small: only a few kilobytes, and
the sync is done in 1 second. I don't know why it cannot be synced
correctly. I also ran "md5sum /dev/drbdX" on both nodes to check, and
the results are indeed different.
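
  A whole-device md5sum only says that the two copies differ, not where.
Below is a minimal sketch of a per-chunk comparison that narrows down the
differing regions. The function names, the 4 MiB chunk size and the use of
SHA-256 instead of MD5 are illustrative choices, not anything DRBD
provides. It assumes the device (or an image of it) is readable on each
node, that it is not being written during the scan, and that the digest
list from one node is copied to the other for comparison.

```python
import hashlib
from itertools import zip_longest

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical chunk size, 4 MiB


def chunk_digests(path, chunk_size=CHUNK_SIZE):
    """Return one SHA-256 hex digest per chunk_size-byte chunk of path."""
    digests = []
    with open(path, "rb") as dev:
        while True:
            block = dev.read(chunk_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).hexdigest())
    return digests


def differing_offsets(local, remote, chunk_size=CHUNK_SIZE):
    """Return the byte offsets of chunks whose digests differ.

    A chunk present in only one of the two lists counts as differing.
    """
    return [
        index * chunk_size
        for index, (a, b) in enumerate(zip_longest(local, remote))
        if a != b
    ]
```

  Run chunk_digests("/dev/drbdX") on each node, then feed both lists to
differing_offsets() to get byte offsets that can be compared with the oos
sector numbers reported by verify.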

2016-05-18 23:58 GMT+08:00 Veit Wahlich <cru.lists at zodia.de>:
> Hi,
> how did you configure the VMs' disk caches? In case of qemu/KVM/Xen it
> is essential for consistency to configure cache as "write through", any
> other setting is prone to problems due to double-writes, unless the OS
> inside of the VM uses write barriers.
> Although write barriers are the default for many Linux distributions, they
> are often disabled within VMs for performance reasons, e.g. by the
> virt-guest profile in tuned.
> Also Linux swap does not support write barriers at all, meaning that
> migrating a VM might not cause file system inconsistencies but memory
> corruptions inside a VM, leading to unpredictable results.
> Windows OS is also very prone to double-write problems.
> If you use any VM cache configuration other than write through, please
> consider switching to write through. VMs will need to be restarted.
> After all VMs have been restarted, use verify, disconnect, connect to
> get rid of oos sectors and check using verify if they occur again.
> When migrating VMs between hosts, you may ignore warnings stating write
> through cache to be unsafe for migration.
> Google for this issue for further information.
> Regards,
> // Veit
> On Wednesday, 18.05.2016, at 18:21 +0800, d tbsky wrote:
>> hi:
>>     I am using DRBD 8.4.7 from EPEL on Scientific Linux
>> 7.2. When I run "drbdadm verify res", it reports that there is oos.
>>    So I disconnect/connect the resource, and the oos becomes 0. But
>> when I verify it again, it reports oos again. The oos amount is
>> different from before, but the oos sector numbers are similar.
>>    I also tried "invalidate-remote" to resync everything and then
>> verify, but it still reports oos.
>>     I don't know what happened. It seems my cluster has big problems
>> with data consistency, but all the VMs running on top of the DRBD
>> resources seem fine, and I have migrated them between hosts many times.
>>    Is this behavior normal, or should I replace the hardware now?
>>    thanks for help!!
>> _______________________________________________
>> drbd-user mailing list
>> drbd-user at lists.linbit.com
>> http://lists.linbit.com/mailman/listinfo/drbd-user