On Thu, May 09, 2013 at 10:33:16AM +0800, Mia Lueng wrote:
> # sysctl -a|grep dirty
> vm.dirty_background_ratio = 10
> vm.dirty_background_bytes = 0
> vm.dirty_ratio = 20
> vm.dirty_bytes = 0
> vm.dirty_writeback_centisecs = 500
> vm.dirty_expire_centisecs = 3000
>
> bandwidth is 100M bps

You can replicate around 10 to 12 MByte per second.

To avoid long "write-out stalls" when flushing caches,
you should not allow more than about 20 MByte dirty,
and start write-out much earlier:

vm.dirty_bytes=20100100
vm.dirty_background_bytes=500100
vm.dirty_writeback_centisecs=97

A ratio of 20 % of available RAM may well mean several GB.
How much RAM do you have?

Depending on what usage patterns and data characteristics you actually
have in production, maybe you want to try drbd-proxy.
Or check with LINBIT what other options you have.

> 2013/5/9 Lars Ellenberg <lars.ellenberg at linbit.com>
>
> > On Thu, May 09, 2013 at 12:16:56AM +0800, Mia Lueng wrote:
> > > in drbd 8.4.3, I did the following test:
> > >
> > > [root@kvm3 drbd.d]# drbdadm dump drbd0
> > > # resource drbd0 on kvm3: not ignored, not stacked
> > > # defined at /etc/drbd.d/drbd0.res:1
> > > resource drbd0 {
> > >     on kvm3 {
> > >         device /dev/drbd0 minor 0;
> > >         disk /dev/vg_kvm3/drbd0;
> > >         meta-disk internal;
> > >         address ipv4 192.168.10.6:7700;
> > >     }
> > >     on kvm4 {
> > >         device /dev/drbd0 minor 0;
> > >         disk /dev/vg_kvm4/drbd0;
> > >         meta-disk internal;
> > >         address ipv4 192.168.10.7:7700;
> > >     }
> > >     net {
> > >         protocol A;
> > >         csums-alg md5;
> > >         verify-alg md5;
> > >         ping-timeout 30;
> > >         ping-int 30;
> > >         max-epoch-size 8192;
> > >         max-buffers 8912;
> > >         unplug-watermark 131072;
> > >     }
> > >     disk {
> > >         on-io-error pass_on;
> > >         disk-barrier no;
> > >         disk-flushes no;
> > >         resync-rate 100M;
> > >         c-plan-ahead 20;
> > >         c-delay-target 100;
> > >         c-max-rate 400M;
> > >         c-min-rate 2M;
> > >         al-extents 601;
> > >     }
> > > }
> > >
> > > [root@kvm3 oradata]# dd if=t1 of=t2 bs=1M
> > > 5585+1 records in
> > > 5585+1 records out
> > > 5856305152 bytes (5.9 GB) copied, 286.119 s, 20.5 MB/s
> >
> > That writes to the page cache, and from there to the block device.
> >
> > No fsync, no sync: there will still be a few GB in the cache (RAM only).
> >
> > > [root@kvm3 oradata]# cd
> > > [root@kvm3 ~]# umount /oradata
> > >
> > > it takes a lot of time (up to 600 seconds) to umount the drbd mount point.
> >
> > On umount, the filesystem obviously has to flush all dirty pages first.
> >
> > What is your replication bandwidth?
> >
> > > echo "1" > /proc/sys/vm/block_dump
> > > shows, during the umount:
> > >
> > > [root@kvm3 ~]# dmesg|tail -n 100
> > ...
> > > umount(3958): WRITE block 100925440 on dm-5
> > > umount(3958): WRITE block 100925440 on dm-5
> > > umount(3958): WRITE block 100925440 on dm-5
> > > umount(3958): WRITE block 0 on dm-5
> > > umount(3958): dirtied inode 1053911 (mtab.tmp) on dm-0
> > > umount(3958): dirtied inode 1053911 (mtab.tmp) on dm-0
> > > umount(3958): WRITE block 33845632 on dm-0
> > > umount(3958): dirtied inode 1053912 (?) on dm-0
> > >
> > > Is the reason that I use protocol A?
> >
> > No.
> >
> > But that you need to understand caching, and tunables.
> >
> > Some hints and keywords for a followup search:
> >
> > Check how much "dirty" data (writes not yet on stable storage)
> > is still in RAM:
> > grep Dirty /proc/meminfo
> >
> > Tune how much dirty data is "allowed":
> > sysctl
> >   vm.dirty_background_bytes
> >   vm.dirty_bytes
> >   vm.dirty_writeback_centisecs
> >   vm.dirty_expire_centisecs
> >
> > also compare:
> > time dd if=t1 of=t2 bs=1M; time sync
> > time dd if=t1 of=t2 bs=1M conv=fsync

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
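
To make the arithmetic behind these numbers concrete: a 100 Mbit/s link
moves at most 100/8 = 12.5 MByte/s, so roughly 20 MByte of dirty data can
be flushed in about two seconds, whereas the default vm.dirty_ratio of
20 % on a multi-GB machine allows gigabytes of dirty pages -- several
minutes of flushing at that rate, which is exactly the umount stall seen
above. Below is a minimal sketch of applying the suggested limits; the
sysctl.d file name is only illustrative, and note that setting
vm.dirty_bytes makes the kernel ignore vm.dirty_ratio (the *_bytes and
*_ratio knobs are mutually exclusive counterparts).

    # 100 Mbit/s ~= 12.5 MByte/s raw; all dirty data must drain over this
    # link before umount can return. Cap it so a forced flush takes ~2 s.
    sysctl -w vm.dirty_bytes=20100100            # hard limit; writers block above it
    sysctl -w vm.dirty_background_bytes=500100   # background writeback starts here
    sysctl -w vm.dirty_writeback_centisecs=97    # flusher wakes roughly every second

    # To persist across reboots (file name is illustrative):
    cat >/etc/sysctl.d/90-dirty-limits.conf <<'EOF'
    vm.dirty_bytes = 20100100
    vm.dirty_background_bytes = 500100
    vm.dirty_writeback_centisecs = 97
    EOF
    sysctl --system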
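
To verify the effect, watch the dirty counters drain while repeating the
test. conv=fsync makes dd itself wait until the data is on stable storage
(and, with DRBD, replicated), so the reported rate reflects the real
write path rather than the page cache:

    # Watch pending dirty/writeback data while the copy and umount run:
    watch -n1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

    # Apparent rate: dd returns as soon as the page cache holds the data;
    # the following sync then pays the real cost:
    time dd if=t1 of=t2 bs=1M; time sync

    # Honest rate: dd does not exit before the data is flushed:
    time dd if=t1 of=t2 bs=1M conv=fsync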