Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi Roland,

I suspect you have a RAID with a 64k stripe size. If IO is not aligned
to the stripe size it can be slow. The idle HDDs indicate that DRBD is
doing something wrong.

Do you have caching enabled on your RAID controller? A RAID controller
with caching should be able to merge too-small IO requests before
dispatching them to the HDDs.

Furthermore, there is an IO request size limit bug in 8.4.1 as well as
in <= 8.3.13:

http://git.drbd.org/gitweb.cgi?p=drbd-8.3.git;a=commit;h=3a2911f0c7ab80d061a5a6878c95f6d731c98554
http://git.drbd.org/gitweb.cgi?p=drbd-8.4.git;a=commit;h=4436a2e67e3854969425e0e02bf1b41bad34f69d

Therefore, I really suggest you trace the block sizes with "blktrace"
to see what is actually going on at the block layer. Here is how you
do it:

1. install the "blktrace" package - your kernel should support blktracing
2. # blktrace /dev/sdX -b 4096 &
3. # pid=$!
4. # dd ... of=/dev/sdX bs=1M ...
5. # kill -2 $pid
6. # blkparse sdX | less

When parsing you should see something like this:

  8,0  1  177  34.431144943  1275  Q  WS 94666752 + 1024 [dd]

"Q" means the IO is queued, "W" means it is a write, and "+ 1024" means
512 KiB were written to the queue - this is measured in sectors, and a
sector is 512 bytes on common HDDs.

Do the tracing on your RAID device without DRBD on top first. Then do
the tracing on your DRBD device. I really suspect that your issue is on
the block IO layer.

But yes - there are general network statistics; you could use "iftop",
for example. Keep in mind, though, that when only 4 KiB are written on
the block layer, only 4 KiB are sent through the network layer as well.

Cheers,
Sebastian

On 26.09.2012 16:18, Roland Kaeser wrote:
> Hello Sebastian
>
> Thanks for the hint. To be sure that the settings are exactly the
> same on both sides, I wrote a script that sets them via ssh on both
> nodes equally. The current value for max_sectors_kb is 64 on both
> nodes. I played around with these values and nothing helped. When the
> nodes are connected, the write speed is even lower than in
> single-node mode. I think there must be some kind of bottleneck
> between the block sizes / IO scheduler, etc. and DRBD, but I could
> not find the decisive hint.
>
> Strange is also that the sync speed is very low (I triggered a manual
> resync to retest that): 66 MB/s. All tests with different config
> settings showed no change in performance. blktrace also shows nothing
> conspicuous that would lead any further. While resyncing and dd'ing,
> iostat shows 1.07% iowait, which is in my opinion a very low value
> for a system under full IO load.
>
> Is there a possibility to debug the network usage and internal
> network statistics of DRBD besides "drbdsetup show 0"?
>
> Regards
>
> Roland
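
P.S. In case it helps, here is a minimal sketch that strings steps 1-6
above together into one script. The device name and the dd arguments
(input, count, oflag) are only example placeholders, not recommendations
for your setup - substitute your own test device and write options, and
remember that dd writes destructively to whatever it is pointed at:

#!/bin/sh
# Sketch of the tracing steps above; /dev/sdX and the dd arguments are
# placeholders - adjust them before running.
DEV=/dev/sdX

# steps 2/3: start blktrace in the background and remember its pid
blktrace "$DEV" -b 4096 &
pid=$!

# step 4: generate write load (WARNING: this writes to the raw device)
dd if=/dev/zero of="$DEV" bs=1M count=1024 oflag=direct

# step 5: stop blktrace with SIGINT so it flushes its trace files
kill -2 "$pid"
wait "$pid"

# step 6: inspect the queued requests; "+ N" is the request size in
# 512-byte sectors, so "+ 1024" means a 512 KiB request
blkparse "$(basename "$DEV")" | less

Run it once against the plain RAID device and once against the DRBD
device, then compare the "+ N" request sizes you see for the dd writes.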