Jan Kellermann wrote:
> Hi,
>
> we have 2 servers on CentOS 5.2 in a cluster with the Red Hat cman, in
> the following configuration:
>
> SERVER1
>   LVM  \
> CMAN    DRBD
>         XEN
> SERVER2
>   LVM  /
>
> For each Xen guest we create a new local LVM volume on each node, put
> them in a DRBD resource, and install an OS (Debian 4 or Ubuntu 8). The
> Xen guests are running as PVM.
>
> Everything has been working fine for over 4 months now.
>
> But we have some performance problems:
>
> Writing to the Xen devices produces an iowait of about 50 to 60% on the
> Xen processors.
>
> We have tested 3 scenarios:
>
> a) Xen on DRBD
> b) Xen on DRBD but disconnected
> c) directly mounted DRBD
>
> You can see the difference in write per char and write per block
> (K/sec):
>
> a: 3792 / 4011
> b: 52037 / 103777
> c: 57135 / 325002
>
> See bonnie_result.txt for more data.
>
> We attached our drbd.conf and a Xen config for your information.
>
> The servers are each 2x Dual-Core AMD Opteron(tm) Processor 2214 HE with
> 32 GB RAM and a 2 TB hard drive on RAID 5, running CentOS 5.2. The
> Xen DomUs are Debian 4 or Ubuntu 8. The NICs are bonded Intel 1 GBit.
> For DRBD we have a dedicated connection on a separate subnet.
>
> So we are wondering what is happening there. Any idea?
>
> And: YES, we tried several configurations over the past months before
> asking you today :)
>
> Best regards and thank you in advance
> Jan
>
> ------------------------------------------------------------------------
>
> =================================================================================
>
> XEN ON DRBD CONNECTED
>
> Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> ubuntu64         2G  3792   6  4011   0  3170   0 33006  40 182491   6 231.1   0
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
>
> =================================================================================
>
> XEN ON DRBD DISCONNECTED
>
> Version 1.03b       ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> ubuntu64         2G 52037  85 103777  26 43139   5 33510  41 174650   5 212.8   0
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
> ubuntu64,2G,52037,85,103777,26,43139,5,33510,41,174650,5,212.8,0,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++
>
> =================================================================================
>
> DRBD CONNECTED W/O XEN
>
> Version 1.03        ------Sequential Output------ --Sequential Input- --Random-
>                     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
> Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
> server101        2G 57135  97 325002  94 16498   3 61150  94 1166962 100 +++++ +++
>                     ------Sequential Create------ --------Random Create--------
>                     -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
>               files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
>                  16 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
> server101,2G,57135,97,325002,94,16498,3,61150,94,1166962,100,+++++,+++,16,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++
>
> ------------------------------------------------------------------------
>
> resource server206 {
>     protocol C;
>
>     handlers {
>         pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
>         pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
>         local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
>     }
>
>     startup {
>         degr-wfc-timeout 120; # 2 minutes.
>         wait-after-sb;
>     }
>
>     disk {
>         on-io-error detach;
>     }
>
>     net {
>         allow-two-primaries;
>         after-sb-0pri discard-least-changes;
>         after-sb-1pri violently-as0p;
>         after-sb-2pri violently-as0p;
>         rr-conflict violently;
>         max-buffers 16384;
>         max-epoch-size 16384;
>         sndbuf-size 1M;
>     }
>
>     syncer {
>         rate 100M;
>         al-extents 3313;
>     }
>
>     on server101.werk21system.de {
>         device /dev/drbd6;
>         disk /dev/xendisk/server206;
>         address 10.20.0.101:7795;
>         meta-disk internal;
>     }
>
>     on server102.werk21system.de {
>         device /dev/drbd6;
>         disk /dev/xendisk/server206;
>         address 10.20.0.102:7795;
>         meta-disk internal;
>     }
> }
>
> ------------------------------------------------------------------------
>
> ## Ubuntu 64Bit Kernel
> kernel = '/etc/xen/kernel/vmlinuz-2.6.24-19-xen'
> ramdisk = '/etc/xen/kernel/initrd.img-2.6.24-19-xen'
>
> memory = '1024'
> root = '/dev/xvda ro'
> disk = [ 'drbd:server206,xvda,w' ]
> name = 'server206'
> vcpus = 2
> vif = [ 'ip=X.Y.Z.205,bridge=xenbr0','ip=10.10.10.205,bridge=xenbr1' ]
> vfb = [ 'type=vnc,vncunused=1,keymap=de' ]
> keymap = 'de'
> on_poweroff = 'destroy'
> on_reboot = 'restart'
> on_crash = 'restart'
>

I experienced high iowait with a pair of Dell 2900s, and I determined
that there were no configuration or network problems. You have to
improve I/O access throughout your system. Keep in mind that you have to
add up the seek times of every I/O access on both the primary and the
secondary system.

A DRBD write consists of 4 real I/Os: two on the primary system, to
write the metadata and the actual data, and two on the secondary system
for the same reason. You also have to keep in mind that, with the md
driver, every write needs at least two reads to compute the new parity
data. Taking an average seek time of 9 ms for an average disk, you get a
delay of 8 x 9 = 72 ms for every write. That is fewer than 15 writes per
second (1000 ms / 72 ms is about 13.9). If you want more precise, and
worse, figures, also take into account the network latency and the
transfer rate of the disks...

My advice: get the fastest disks you can buy, offload the writes to a
RAID controller (with battery-backed cache RAM), and get lots of RAM to
cache as many reads as you can.

Leandro
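
The back-of-the-envelope estimate in the reply above can be reproduced
with a few lines of Python. This is only a rough sketch: the function
name max_writes_per_second is made up for illustration, the figures (8
seek-bound operations per write, 9 ms average seek) are taken as given
from the reply, and the model assumes writes are fully serialized and
ignores caching, network latency, and transfer time.

```python
def max_writes_per_second(seeks_per_write: int, avg_seek_ms: float) -> float:
    """Rough upper bound on serialized writes/s when seek time dominates.

    Hypothetical helper for illustration only; it is not part of DRBD.
    """
    latency_ms = seeks_per_write * avg_seek_ms  # total seek delay per write
    return 1000.0 / latency_ms                  # writes that fit in one second


# Figures from the reply: 8 seek-bound operations per write, 9 ms each.
SEEKS_PER_WRITE = 8
AVG_SEEK_MS = 9.0

latency_ms = SEEKS_PER_WRITE * AVG_SEEK_MS
rate = max_writes_per_second(SEEKS_PER_WRITE, AVG_SEEK_MS)
print(f"{latency_ms:.0f} ms per write, about {rate:.1f} writes/s")
```

Plugging in a faster disk or a controller with write cache (fewer
seek-bound operations per write) shows directly how the bound moves.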