Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
> First, too "all of you", > if someone has some spare hardware and is willing to run the test as > suggested by Eric, please do so. > Both "no corruption reported after X iterations" and "corruption reported > after X iterations" is important feedback. > (State the platform and hardware and storage subsystem configuration and > other potentially relevant info) > > Also, interesting question: did you run your non-DRBD tests on the exact > same backend (LV, partition, lun, slice, whatever), or on some other "LV" or > "partition" on the "same"/"similar" hardware? Same hardware. Procedure was as follows: 6 x SSD drives in system. Created 2 x volume groups: vgcreate vg_under_drbd0 /dev/sda5 /dev/sdb5 /dev/sdc5 /dev/sdd5 /dev/sde5 /dev/sdf5 vgcreate vg_without_drbd /dev/sda6 /dev/sdb6 /dev/sdc6 /dev/sdd6 /dev/sde6 /dev/sdf6 Created 2 x LVM arrays: lvcreate -i6 -I4 -l 100%FREE -nlv_under_drbd0 vg_under_drbd0 lvcreate -i6 -I4 -l 100%FREE -nlv_without_drbd vg_without_drbd Started drbd Created an ext4 filesystem on /dev/drbd0 Created an ext4 filesystem on /dev/vg_without_drbd/lv_without_drbd Mounted /dev/drbd0 on /volume1 Mounted /dev/vg_without_drbd/lv_without_drbd on /volume1 Ran TrimTester on /volume1. It failed after writing 700-900 GB on multiple test iterations Ran TrimTester on /volume2. No failure after 20 TB written. > > Now, > "something" is different between test run with or without DRBD. > > First suspect was something "strange" happening with TRIM, but you think > you can rule that out, because you ran the test without trim as well. > > The file system itself may cause discards (explicit mount option "discard", > implicit potentially via mount options set in the superblock), it does not have > to be the "fstrim". The discard option was not explicitly set. I'm not sure about implicitly. > > Or maybe you still had the fstrim loop running in the background from a > previous test, or maybe something else does an fstrim. > > So we should double check that, to really rule out TRIM as a suspect. > Good thought, but I was careful to ensure that the shell script which performs the trim was not running. > You can disable all trim functionality in linux by echo 0 > > /sys/devices/pci0000:00/0000:00:01.1/ata2/host1/target1:0:0/1:0:0:0/block/s > r0/queue/discard_max_bytes > (or similar nodes) > > something like this, maybe: > echo 0 | tee /sys/devices/*/*/*/*/*/*/block/*/queue/discard_max_bytes > > To have that take effect for "higher level" or "logical" devices, you'd have to > "stop and start" those, so deactivate DRBD, deactivate volume group, > deactivate md raid, then reactivate all of it. > > double check with "lsblk -D" if the discards now are really disabled. > > then re-run the tests. > Okay, I will try that. > > In case "corruption reported" even if we are "certain" that discard is out of > the picture, that is an important data point as well. > > What changes when DRBD is in the IO stack? > Timing (when does the backend device see which request) may be changed. > Maximum request size may be changed. > Maximum *discard* request size *will* be changed, which may result in > differently split discard requests on the backend stack. > > Also, we have additional memory allocations for DRBD meta data and > housekeeping, so possibly different memory pressure. > > End of brain-dump. 
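For the record, here is roughly what I plan to run before the next round,
based on the suggestions above. This is only a sketch: it assumes the DRBD
resource is named "r0", that the backing disks are sda through sdf, and that
discard_max_bytes is writable on this kernel; names and paths will need to be
adjusted to the actual setup.

# Disable discards on the backing SATA disks (same idea as the sysfs paths
# above, via the /sys/block shortcut).
echo 0 | tee /sys/block/sd[a-f]/queue/discard_max_bytes

# Unmount and tear down the stack so the logical devices pick up the change.
umount /volume1 /volume2
drbdadm down r0                  # resource name "r0" is an assumption
vgchange -an vg_under_drbd0 vg_without_drbd

# Bring everything back up and promote DRBD before mounting.
vgchange -ay vg_under_drbd0 vg_without_drbd
drbdadm up r0
drbdadm primary r0
mount /dev/drbd0 /volume1
mount /dev/vg_without_drbd/lv_without_drbd /volume2

# Verify: DISC-MAX should now show 0B on every layer.
lsblk -D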
In the meantime, I tried a different kind of test, as follows:

ha11a:~ # badblocks -b 4096 -c 4096 -s /dev/drbd0 -w
Testing with pattern 0xaa: done
Reading and comparing: done
Testing with pattern 0x55: done
Reading and comparing: done
Testing with pattern 0xff: done
Reading and comparing: done
Testing with pattern 0x00: done
Reading and comparing: done

Of course, /dev/drbd0 was unmounted at the time. It ran for 16 hours and
reported NO bad blocks. I'm not sure if this provides any useful clues.

-Eric
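P.S. If it would help, I can also re-run a comparable pattern check through
the filesystem instead of the raw device, to come closer to what TrimTester
does. Rough sketch only; the file size, file count and checksum location are
arbitrary, chosen to cover a bit more than the 700-900 GB window where
TrimTester failed:

# Write ~1 TB of pseudo-random data in 1 GiB files, recording a checksum of
# each file outside the filesystem under test.
mount /dev/drbd0 /volume1        # if not already mounted
i=0
while [ $i -lt 1000 ]; do
    dd if=/dev/urandom of=/volume1/testfile.$i bs=1M count=1024 conv=fsync 2>/dev/null
    md5sum /volume1/testfile.$i >> /root/checksums.md5
    i=$((i + 1))
done

# Drop the page cache so the re-read really hits the device, then report any
# file whose checksum no longer matches.
sync
echo 3 > /proc/sys/vm/drop_caches
md5sum -c /root/checksums.md5 | grep -v ': OK$'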