Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi all,

it seems to me that there is some kind of write performance regression that started with DRBD 8.3.12 and is still present in 8.3.13 and 8.4.1.

First some details about the testing hardware:

1. two identical servers with an Intel Server S2600CP mainboard
2. each server has 2 physical Xeon E5-2620 CPUs and 64 GB RAM
3. the network connection is a patch cable between the on-board Intel I350 dual-port Gigabit network cards (so no active hardware/switches between the two nodes)
4. the server OS lives on a separate SSD; for the DRBD performance tests I used a dedicated hard disk, and there is no other load (I/O, CPU or network) on the servers while testing
5. the hard disk is a 3 TB Seagate ST3000DM001-9YN166 (firmware CC4H) connected to a SATA III port on the mainboard
6. the OS is a fully patched CentOS 6.2 with the DRBD 8.3 packages from elrepo

I started with the recent DRBD 8.3.13, trying to optimize the setup for later production usage. On the hard disk on both sides I created a 3 TB LVM partition (using gdisk, creating a GPT partition table), created a volume group and two separate logical volumes in it (one for use without DRBD and one for use with DRBD, to measure the difference). Both volumes have a size of 50 GB. One of them was set up as a DRBD block device with more or less default values (protocol C, al-extents 3389 and a sndbuf-size of 0 (auto)); a sketch of the resource section is included further below. Both (the plain volume as well as the DRBD block device) were formatted with ext4 using default values and then mounted.

I ran bonnie++ 1.96 with different settings on both filesystems. Most of my later tests were done with "bonnie++ -u root -d /mnt/drbd/ -r 8192 -b -n 0", looking at the sequential output (block) results. Writing to the plain (non-DRBD) volume I see a block-write rate of around 160-180 MByte/s, which is to be expected. I then checked the raw TCP network performance with iperf, seeing a rate of 992 MBit/s. Next I exported the filesystem on the non-DRBD volume via NFS and ran bonnie++ on the NFS mount, reaching between 100 and 105 MByte/s. That is without any further tuning and a reasonable value for a Gigabit network.

Then the DRBD volume was mounted and I ran bonnie++ on it. I only got values between 46 and 52 MByte/s in several runs, which is only about half of the expected rate. I spent a lot of time tuning the following parameters:

1. set the MTU from 1500 to 9000 on both sides
2. switched from the CFQ to the deadline I/O scheduler
3. increased the net.ipv4.tcp_*mem values to 131072 131072 10485760
4. tried several settings for the deadline scheduler (disabled front_merges, changed read/write expire times, etc.)
5. set max-buffers, max-epoch-size and unplug-watermark to 8000
6. mounted the filesystem with barrier=0
7. ran bonnie++ with the secondary DRBD node disconnected
8. reformatted the filesystem with ext2 (the only case where I saw an actually different value: the write performance dropped to 38 MByte/s)
9. added "no-tcp-cork" to DRBD's config
10. tried "use-bmbv" in DRBD's config
11. switched from protocol C to protocol A
12. finally removed the LVM layer by recreating the partition table on both sides with parted (still a GPT table) and using a plain partition for DRBD (again 50 GB)

Not one of these tweaks changed my write performance, which left me puzzled. I also removed the filesystem from the equation, running "dd" directly on the DRBD block device with several block sizes and different flush/sync options (direct, fsync, etc.), but that also did not bring the write performance above 50 MByte/s.
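For completeness, the dd runs were along these lines (just an example invocation, not my exact command lines; block size, count and the flush/sync options varied between runs, and /dev/drbd0 is the assumed device name):

  # write directly to the DRBD device, bypassing the page cache
  dd if=/dev/zero of=/dev/drbd0 bs=1M count=4096 oflag=direct
  # or let dd flush everything at the end instead of using O_DIRECT
  dd if=/dev/zero of=/dev/drbd0 bs=1M count=4096 conv=fsync

And for reference, the DRBD settings mentioned above (the defaults I started with plus the knobs from points 5 and 9 of the tuning list) correspond roughly to the following 8.3-style resource section. This is only a sketch: the resource name, hostnames, addresses and disk paths are placeholders, not copies from my actual config.

  resource r0 {
    protocol C;

    syncer {
      al-extents 3389;
    }

    net {
      sndbuf-size      0;      # 0 = auto
      max-buffers      8000;   # tuning attempt, point 5
      max-epoch-size   8000;   # tuning attempt, point 5
      unplug-watermark 8000;   # tuning attempt, point 5
      no-tcp-cork;             # tuning attempt, point 9
    }

    on nodeA {
      device    /dev/drbd0;
      disk      /dev/vg_test/lv_drbd;
      address   192.168.1.1:7789;
      meta-disk internal;
    }
    on nodeB {
      device    /dev/drbd0;
      disk      /dev/vg_test/lv_drbd;
      address   192.168.1.2:7789;
      meta-disk internal;
    }
  }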
Finally I thought about trying DRBD 8.4.1, but decided to first try a different version without changing the metadata. Therefore I downloaded the oldest RHEL6 package (version 8.3.9) and redid my tests with an ext4-formatted DRBD device. The result was actually different: just by downgrading from 8.3.13 to 8.3.9 the write performance doubled and reached 100 MByte/s. That is the value I expected in the first place and could nowhere achieve with 8.3.13. I then updated to 8.3.10 and then to 8.3.11; the results were still good, reaching between 100 and 105 MByte/s. Finally I installed DRBD 8.3.12 - and back was the low write performance (54 MByte/s). I cross-checked with DRBD 8.4.1, but the performance was still low (50 MByte/s).

As a last check I mixed the versions:

1. 8.3.12 on the primary and 8.3.11 on the secondary gave me a throughput of 53 MByte/s (still low)
2. 8.3.11 on the primary and 8.3.12 on the secondary resulted in a new value: 70 MByte/s (slightly higher than anything I could achieve with 8.3.12 on both sides, but still far lower than it should be)

In conclusion: all versions up to and including 8.3.11 give me the good and expected write performance. Starting with 8.3.12 something broke, leaving me with only half of the possible write throughput. I should mention that the read performance was never affected and was always high.

I checked the changelog for 8.3.12, but nothing obviously struck me, and diffing the source trees 8.3.11 -> 8.3.12 I did not find anything obvious either. So I want to ask here: is anyone else seeing this problem, or does anyone have a clue what is going on? As far as I can see I have already turned every knob that should have an effect, but nothing gives me good write performance with 8.3.12 and later. Maybe I missed something very obvious, but then I do not understand why just downgrading the DRBD version doubles the write performance.

So, please advise :) I am happy to provide any additional details that might be needed.

Regards,
Matthias
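PS: In case anyone wants to reproduce the version cross-checks, swapping the DRBD version between runs without touching the metadata amounts to roughly the following. This is only a sketch: the elrepo package names (drbd83-utils/kmod-drbd83) and the resource name r0 are assumptions, and the version string obviously changed from test to test.

  umount /mnt/drbd
  service drbd stop                                      # take the resource down cleanly
  yum downgrade drbd83-utils-8.3.11 kmod-drbd83-8.3.11   # or yum update when going forward again
  service drbd start
  drbdadm primary r0                                     # on the primary node only
  mount /dev/drbd0 /mnt/drbd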