Yet more information on this problem. I've found if I remove one of the
disks from the software RAID1 on the Secondary in this DRBD resource, the
transfer speeds jump up to what I would call acceptable levels. I've tested
each of the disks independently and they are both fine. It seems there is
some sort of strange interaction going on between DRBD, the md layer, and
perhaps the Adaptec 2020ZCR card (but I'm not sure). With so many layers of
I/O abstraction it is hard to know what is going on and what particular
thing is causing the problem...

Any bells ringing for anyone?

On Wed Jun 18, 2008 at 15:39:01 +1000, Oliver Hookins wrote:
>Another snippet of information that might twig someone's memory... I took a
>tcpdump of DRBD traffic when doing a large file write and although the MTU
>is set to 9000 over the direct 1Gbps connection, both systems have their TCP
>windows set to very small values, such as around 800.
>
>During a 10 second packet capture I'm also seeing 25 TCP out-of-order
>segments and 1427 TCP Window updates, which seems to be very high. I've
>already had a go at raising TCP buffers in /proc/sys/net/core and
>/proc/sys/net/ipv4 but without any noticeable change in connected speed...
>
>On Wed Jun 18, 2008 at 12:39:14 +1000, Oliver Hookins wrote:
>>Anybody have any tips at all for this issue? I'm running out of ideas...
>>
>>On Thu Jun 12, 2008 at 16:04:10 +1000, Oliver Hookins wrote:
>>>Hi again,
>>>
>>>I've been doing a lot of testing and I'm fairly certain I've narrowed down
>>>my performance issues to the network connection. Previously I was getting
>>>fairly abysmal performance in even DRBD-disconnected mode but I realise now
>>>this was mainly due to my test file size far exceeding the al-extents
>>>setting.
>>>
>>>I am performing dd tests (bs=1G, count=8) with syncs on the connected DRBD
>>>resources and getting about 10MB/s only. The disks are 10krpm 300GB SCSI and
>>>can easily get sustained speeds of 60-70MB/s when DRBD is disconnected or
>>>not used. There is a direct cable between the machines giving them full
>>>gigabit connectivity via their Intel 80003ES2LAN adaptors (running the e1000
>>>driver version 7.3.20-k2-NAPI that is standard with RHEL4 x86_64). I have tested
>>>this connection with Netpipe and get up to 940Mbps.
>>>
>>>However DRBD still crawls along at 10MB/s. I have attempted to increase the
>>>/proc/sys/net/core/{r,w}mem_{default,max} settings which were previously all
>>>at 132KB, to 1MB for defaults and 2MB for max without any increase in
>>>performance. MTU on the link is set to 9000 bytes.
>>>
>>>In drbd.conf I have sndbuf-size 2M; max-buffers 8192; max-epoch-size 8192.
>>>I've also played a little with the unplug watermark setting it to very low
>>>and very high values without any apparent change.
>>>
>>>Taking a look at a tcpdump of the traffic the only weird things I could see
>>>are a lot of TCP window size change notifications and some strange packet
>>>"clumping", but it's not really offering me any insights I can immediately
>>>see.
>>>
>>>Is there anything else I could tune to solve this problem?

--
Regards,
Oliver Hookins
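
For anyone wanting to repeat the checks described in this thread, here is a rough sketch (not part of the original messages) in Python. It dumps the buffer-related sysctls mentioned above and estimates the throughput ceiling that a given TCP window would imply, using the usual bound of throughput <= window / RTT. The helper names are made up for this example, and the 0.1 ms round-trip time is purely an illustrative assumption for a direct gigabit cable, not something measured on these hosts. Also, as far as I know tcpdump prints the raw window field, so if window scaling was negotiated and the capture missed the SYN, the effective window may be larger than the value shown.

```python
from pathlib import Path

# Sysctls named in the thread, plus the TCP-specific ones under net/ipv4.
SYSCTLS = [
    "net/core/rmem_default",
    "net/core/rmem_max",
    "net/core/wmem_default",
    "net/core/wmem_max",
    "net/ipv4/tcp_rmem",
    "net/ipv4/tcp_wmem",
    "net/ipv4/tcp_window_scaling",
]


def read_sysctl(name, root="/proc/sys"):
    """Return the raw sysctl value as a string, or None if it cannot be read."""
    try:
        return (Path(root) / name).read_text().strip()
    except OSError:
        return None


def window_limited_mb_per_s(window_bytes, rtt_seconds):
    """Upper bound on throughput (MB/s) with one window in flight per round trip."""
    return window_bytes / rtt_seconds / 1e6


if __name__ == "__main__":
    for name in SYSCTLS:
        print(f"{name} = {read_sysctl(name)}")

    # ASSUMPTION: 0.1 ms RTT is a placeholder; measure the real link with ping.
    assumed_rtt = 0.0001
    # 800 is the unscaled window from the tcpdump; 65535 is the largest unscaled one.
    for window in (800, 65535):
        bound = window_limited_mb_per_s(window, assumed_rtt)
        print(f"window {window} bytes at assumed RTT -> at most {bound:.1f} MB/s")
```

Run it on both nodes and compare the output. If the computed ceiling for the observed window is in the same range as the throughput DRBD actually achieves, that would point at the window size rather than the disks, though it proves nothing by itself.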