[DRBD-user] drbdadm verify always report oos

Veit Wahlich cru.lists at zodia.de
Thu May 19 09:58:26 CEST 2016


Hi,

Well, if this still occurs while the resources are not being accessed,
so that no data is transferred at all except for resync and verify, I
suspect a surface- or mapping-related storage hardware/firmware issue
is the culprit. That would also explain why the issue occurs on the
same resources again and again but never on others.
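
One way to check whether the mismatches keep hitting the same regions is
to checksum a device in fixed-size chunks on each node and diff the two
listings. The following is only a minimal sketch: the function name
chunk_sums and the 4 MiB default chunk size are assumptions made for
illustration, and it relies on GNU dd and md5sum. It should be run while
the resource is idle, otherwise in-flight writes show up as differences.

```shell
#!/bin/sh
# Print "<chunk index> <md5>" for every chunk of a device or file.
# $1: path to read (hypothetical example: a DRBD device or a test file)
# $2: chunk size in MiB (assumed default: 4)
chunk_sums() {
    path=$1
    mib=${2:-4}
    tmp=$(mktemp) || return 1
    i=0
    while :; do
        # Read one chunk; skip= and count= are in units of bs (1 MiB).
        dd if="$path" of="$tmp" bs=1M skip=$((i * mib)) count="$mib" 2>/dev/null
        # An empty chunk means we have read past the end of the input.
        [ -s "$tmp" ] || break
        printf '%s %s\n' "$i" "$(md5sum < "$tmp" | cut -d' ' -f1)"
        i=$((i + 1))
    done
    rm -f "$tmp"
}
```

For example, "chunk_sums /dev/drbdX > node-a.sums" on one node and the
same on the peer, then diff the two files. A differing chunk index N
starts at byte offset N * chunk size, which can be compared across
several verify runs to see whether the same area is affected each time.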

Alternatively, I'd suspect that a rogue process on the host system
alters the backing devices directly, e.g. LVM accessing the backing
devices of some DRBD resources. You might want to use the "pvs" command
to verify that LVM does not use any of the backing devices as physical
volumes.
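
The pvs check can be scripted roughly as follows. This is a minimal
sketch under stated assumptions: the function name find_pv_overlap and
the device paths in the usage comment are hypothetical, and the PV list
is expected as a file with one name per line, as printed by
"pvs --noheadings -o pv_name". LVM may report a device under a different
path than the one used in the DRBD configuration (for example a
/dev/mapper name), so an empty result is a hint, not proof.

```shell
#!/bin/sh
# Print every given backing device that also appears in a PV list.
# $1: file with one PV name per line (leading/trailing blanks allowed)
# remaining arguments: DRBD backing device paths to look for
find_pv_overlap() {
    pv_list=$1
    shift
    for dev in "$@"; do
        # Trim whitespace, then match the whole line as a fixed string.
        if sed 's/^[[:space:]]*//; s/[[:space:]]*$//' "$pv_list" \
            | grep -qxF "$dev"; then
            echo "$dev"
        fi
    done
}

# Usage on a real host (as root; the device names are placeholders):
#   pvs --noheadings -o pv_name > /tmp/pv_names.txt
#   find_pv_overlap /tmp/pv_names.txt /dev/vg0/vm-test
```

Any path printed by the function is a backing device that LVM on the
host treats as a physical volume and is worth a closer look.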

Regards,
// Veit

On Thursday, 19.05.2016, 10:09 +0800, d tbsky wrote:
> Hi:
> 
>     It is not SSD, just two 2 TB SATA hard disks. The mdadm array is
> checked every week and I don't see any errors reported in dmesg.
> There are 20 VMs running on top of it and they seem normal. But I
> wonder whether it will be normal again after so many verify/resync
> runs, so I just picked a test VM to try. I am still trying the config
> options to see if I can get it to resync. I could also use dd to
> resync it, but then the problem might disappear and I won't know what
> happened.
> 
>    Any suggestions to find out what happened?
> 
> 2016-05-19 4:50 GMT+08:00 Veit Wahlich <cru.lists at zodia.de>:
> > Are you utilising SSDs?
> >
> > Is the kernel log (dmesg) clean from errors on the backing devices (also mdraid members/backing devices)?
> >
> > Did you verify the mdraid array consistency and are the array's members in sync?
> >
> >
> > -------- Original Message --------
> > From: d tbsky <tbskyd at gmail.com>
> > Sent: 18 May 2016 18:27:00 CEST
> > To: Veit Wahlich <cru.lists at zodia.de>
> > CC: drbd-user at lists.linbit.com
> > Subject: Re: [DRBD-user] drbdadm verify always report oos
> >
> > Hi:
> >     I shut down the VM when I noticed the strange behavior, so DRBD
> > is resyncing while idle. I have played with the config options and
> > resynced about 15 times, but still cannot get verify to report 0 oos.
> >
> >    I have about 10 resources with this verify problem, but strangely
> > some resources are OK. The largest resource is about 1T and it is
> > OK. The resource I am testing now is only 32G.
> >
> >   The host structure is: two SATA disks (mdadm RAID 1) -> LVM -> DRBD
> >
> >   The host has ECC RAM, so the memory is OK. The most confusing part
> > is that the resync data is very small: only a few kilobytes, and the
> > sync is done in 1 second. I don't know why it cannot be synced
> > correctly. I also ran "md5sum /dev/drbdX" on both nodes to check;
> > they are indeed different.
> >
> >
> > _______________________________________________
> > drbd-user mailing list
> > drbd-user at lists.linbit.com
> > http://lists.linbit.com/mailman/listinfo/drbd-user




