Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
First, to "all of you": if someone has some spare hardware and is willing to run the test as suggested by Eric, please do so. Both "no corruption reported after X iterations" and "corruption reported after X iterations" are important feedback. (State the platform, hardware, storage subsystem configuration, and other potentially relevant info.)

Also, interesting question: did you run your non-DRBD tests on the exact same backend (LV, partition, lun, slice, whatever), or on some other "LV" or "partition" on the "same"/"similar" hardware?

Now, "something" is different between the test runs with and without DRBD.

First suspect was something "strange" happening with TRIM, but you think you can rule that out, because you ran the test without trim as well. The file system itself may cause discards (explicitly via the mount option "discard", or implicitly via mount options set in the superblock); it does not have to be "fstrim". Or maybe you still had the fstrim loop running in the background from a previous test, or maybe something else does an fstrim.

So we should double check that, to really rule out TRIM as a suspect.

You can disable all trim functionality in Linux by

    echo 0 > /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/sr0/queue/discard_max_bytes

(or similar nodes), something like this, maybe:

    echo 0 | tee /sys/devices/*/*/*/*/*/*/block/*/queue/discard_max_bytes

To have that take effect for "higher level" or "logical" devices, you'd have to "stop and start" those: deactivate DRBD, deactivate the volume group, deactivate md raid, then reactivate all of it. Double check with "lsblk -D" that discards are now really disabled, then re-run the tests.

In case of "corruption reported" even when we are "certain" that discard is out of the picture, that is an important data point as well.

What changes when DRBD is in the IO stack? Timing (when does the backend device see which request) may be changed. Maximum request size may be changed.
Maximum *discard* request size *will* be changed, which may result in differently split discard requests on the backend stack. Also, we have additional memory allocations for DRBD meta data and housekeeping, so possibly different memory pressure.

End of brain-dump.

-- 
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker
: R&D, Integration, Ops, Consulting, Support

DRBD® and LINBIT® are registered trademarks of LINBIT
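
For anyone repeating the "rule out TRIM" step described in the message above, here is a minimal sketch of it as two small shell functions. It is not from the original post. The function names `disable_discards` and `report_discards` are hypothetical, and iterating over `/sys/block/*/queue/discard_max_bytes` instead of the longer `/sys/devices/...` glob is an assumption that may need adjusting on some systems. The stacked devices (DRBD, LVM, md) still have to be stopped and restarted afterwards, as the message says.

```shell
#!/bin/sh
# Sketch only: helper names and the /sys/block default are assumptions,
# not something taken from the original message.

# Write 0 to every writable discard_max_bytes node below the given
# directory (default: /sys/block). Needs root on a real system.
disable_discards() {
    root="${1:-/sys/block}"
    for f in "$root"/*/queue/discard_max_bytes; do
        [ -w "$f" ] || continue      # skip unmatched globs and read-only nodes
        echo 0 > "$f"
    done
}

# Print "<device> <discard_max_bytes>" for each device, so the result
# can be compared against the DISC-MAX column of "lsblk -D".
report_discards() {
    root="${1:-/sys/block}"
    for f in "$root"/*/queue/discard_max_bytes; do
        [ -r "$f" ] || continue
        dev=${f#"$root"/}            # strip the leading directory
        dev=${dev%%/*}               # keep only the device name
        printf '%s %s\n' "$dev" "$(cat "$f")"
    done
}
```

A possible way to use it, as root: source the file, run `disable_discards`, restart the stacked devices, then run `report_discards` and `lsblk -D` and check that every device reports 0 before re-running the tests.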