Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
> > Most of the time (99%) I see ERR for the swap space of virtual machines.
>
> If you enable "integrity-alg", do you still see those "buffer modified
> by upper layers during write"?
>
> Well, then that is your problem,
> and that problem can *NOT* be fixed with DRBD "config tuning".
>
> What does that mean?
>
> Upper layer submits write to DRBD.
> DRBD calculates checksum over data buffer.
> DRBD sends that checksum.
> DRBD submits data buffer to "local" backend block device.
> Meanwhile, upper layer changes data buffer.
> DRBD sends data buffer to peer.
> DRBD receives local completion.
> DRBD receives remote ACK.
> DRBD completes this write to upper layer.
> *only now* would the upper layer be "allowed"
> to change that data buffer again.
>
> Misbehaving upper layer results in potentially divergent blocks
> on the DRBD peers. Or would result in potentially divergent blocks on
> a local software RAID 1. Which is why the mdadm maintenance script
> in rhel, "raid-check", intended to be run periodically from cron,
> has this tell-tale chunk:
>
>     mismatch_cnt=`cat /sys/block/$dev/md/mismatch_cnt`
>     # Due to the fact that raid1/10 writes in the kernel are unbuffered,
>     # a raid1 array can have non-0 mismatch counts even when the
>     # array is healthy. These non-0 counts will only exist in
>     # transient data areas where they don't pose a problem. However,
>     # since we can't tell the difference between a non-0 count that
>     # is just in transient data or a non-0 count that signifies a
>     # real problem, simply don't check the mismatch_cnt on raid1
>     # devices as it's providing far too many false positives. But by
>     # leaving the raid1 device in the check list and performing the
>     # check, we still catch and correct any bad sectors there might
>     # be in the device.
>     raid_lvl=`cat /sys/block/$dev/md/level`
>     if [ "$raid_lvl" = "raid1" -o "$raid_lvl" = "raid10" ]; then
>             continue
>     fi
>
> Anyways.
> Point being: Either have those upper layers stop modifying buffers
> while they are in-flight (keyword: "stable pages").
> Kernel upgrade within the VMs may do it. Changing something in the
> "virtual IO path configuration" may do it. Or not.
>
> Or live with the results, which are
> potentially not identical blocks on the DRBD peers.

Hello Lars,

Thank you for the detailed explanation. I've done some more tests and
found that "out of sync" sectors appear for master-slave setups as well,
not only for master-master.

Can you share your thoughts about what could cause upper-layer changes
in the following stack?

    KVM (usually virtio) -> LVM -> DRBD -> RAID10 -> Physical drives

LVM snapshots are not used. Can LVM cause these OOS blocks?

Could it help if we switched to the following stack instead?

    KVM (usually virtio) -> DRBD -> LVM -> RAID10 -> Physical drives

Again, LVM snapshots are not used.

Stanislav
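
For reference, the "integrity-alg" mentioned above is the
data-integrity-alg net option. A minimal sketch of a DRBD 8.4-style net
section; the resource name r0 and the crc32c digest are example values
only:

    resource r0 {
        net {
            # DRBD checksums each write buffer; if the buffer changes while
            # the write is in flight, the digest no longer matches and DRBD
            # logs "buffer modified by upper layers during write".
            data-integrity-alg crc32c;
            # Needed for online verification ("drbdadm verify") below.
            verify-alg crc32c;
        }
    }

Note that this only detects the in-flight modification; it does not
prevent it.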
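A rough shell sketch of how the resulting divergence can be found and
repaired with online verification, assuming DRBD 8.x with /proc/drbd
and a resource named r0 (all assumptions):

    # Start an online verify of resource r0 (requires verify-alg, see above).
    drbdadm verify r0

    # Wait for the verify to finish, then look at the out-of-sync counter.
    while grep -q Verify /proc/drbd; do sleep 60; done
    grep oos: /proc/drbd

    # Blocks found out of sync are only resynchronised after a reconnect.
    drbdadm disconnect r0
    drbdadm connect r0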
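As for the "stable pages" keyword: newer kernels (3.9 and later) expose
whether a block device asks the page cache to keep pages unchanged while
they are under writeback. A quick check on the host, with drbd0 as a
placeholder device name:

    # 1: the kernel waits for writeback before letting a page be redirtied.
    # 0: pages may still be modified while the write is in flight.
    cat /sys/block/drbd0/bdi/stable_pages_required

Whether this helps in the KVM stacks above depends on the virtual IO
path (cache mode, O_DIRECT, guest kernel), which is the point Lars makes
about changing the "virtual IO path configuration".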