Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hi,

have you tried with elevator=deadline? What does
"cat /sys/block/dm-*/queue/scheduler" show? (A short example of checking
and switching the scheduler is at the bottom of this mail.)

On Fri, 16 Dec 2011 13:27:55 +0100, Volker <mail at blafoo.org> wrote:
> Hi,
>
>>> [root at nfs01 nfs]# cat /proc/drbd
>>> version: 8.3.8 (api:88/proto:86-94)
>>
>> Really do an upgrade! ... elrepo seems to have latest DRBD 8.3.12
>> packages
>
> Thanks for the hint, we might consider that if nothing else helps :-)
>
> Not that we don't want the newer version. It's the unofficial repository
> that is the problem here. We are quite hesitant about unofficial repos,
> because that system hosts hundreds of customers.
>
>>> Why these resyncs happen and why so much data is being resynced is
>>> another matter. The nodes were disconnected for 3-4 minutes, which
>>> does not justify so much data. Anyway...
>>
>> If you adjust your resource after changing a disk option, the disk is
>> detached/attached ... this means syncing the complete AL when done on a
>> primary ... 3833*4MB=15332MB
>
> Great! Thanks for the insight. I'm really learning some stuff about DRBD
> here!
>
>>> After issuing the mentioned dd command
>>>
>>> $ dd if=/dev/zero of=./test-data.dd bs=4096 count=10240
>>> 10240+0 records in
>>> 10240+0 records out
>>> 41943040 bytes (42 MB) copied, 0.11743 seconds, 357 MB/s
>>
>> you benchmark your page cache here ... add oflag=direct to dd to bypass
>> it
>
> Now this makes me shiver and laugh at the same time (output shortened):
>
> ####
> [root at nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240
> 41943040 bytes (42 MB) copied, 24.7257 seconds, 1.7 MB/s
>
> [root at nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 41943040 bytes (42 MB) copied, 25.9601 seconds, 1.6 MB/s
>
> [root at nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 41943040 bytes (42 MB) copied, 44.4078 seconds, 944 kB/s
>
> [root at nfs01 nfs]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 30384128 bytes (30 MB) copied, 26.9182 seconds, 1.3 MB/s
> ####
>
> The load rises a little while doing this (to about 3-4), but the system
> remains usable.
>
>> looks like I/O system or network is fully saturated
>
> It seems more like some sort of DRBD cache setting is broken somewhere.
>
> On an LVM volume without DRBD, dd works fine (output shortened):
>
> ####
> [root at nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 41943040 bytes (42 MB) copied, 0.738741 seconds, 56.8 MB/s
>
> [root at nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 41943040 bytes (42 MB) copied, 0.746778 seconds, 56.2 MB/s
>
> [root at nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 41943040 bytes (42 MB) copied, 0.733518 seconds, 57.2 MB/s
>
> [root at nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 41943040 bytes (42 MB) copied, 0.736617 seconds, 56.9 MB/s
>
> [root at nfs01 mnt]# dd if=/dev/zero of=./test-data.dd bs=4096 count=10240 oflag=direct
> 41943040 bytes (42 MB) copied, 0.73078 seconds, 57.4 MB/s
> ####
>
> The network link is also just fine. We've tested it with almost
> 100 MB/s (that is, megabytes) of throughput. The only possible limit
> here would be the syncer rate of 25 MB/s, but the network link is only
> saturated during a resync.
>
> Any more ideas with this info?
>
> best regards
> volker
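In case it helps, here is roughly how to check and switch the scheduler
at runtime, no reboot needed. (dm-0 below is just an example device name;
use whichever dm device actually backs your DRBD resource.)

####
# show the available schedulers for each dm device; the active one
# is printed in square brackets
cat /sys/block/dm-*/queue/scheduler

# switch a single device to deadline at runtime
# (takes effect immediately, but is not persistent across reboots)
echo deadline > /sys/block/dm-0/queue/scheduler

# to make it the system-wide default, boot with elevator=deadline on
# the kernel command line; verify after the reboot with
cat /proc/cmdline
####

If the backing devices are currently on cfq, trying deadline (or noop)
is cheap and worth doing before digging deeper.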