Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
wow, this is great info, thanks!

On 8/9/06, Lars Ellenberg <Lars.Ellenberg at linbit.com> wrote:
>
> / 2006-08-09 17:50:26 -0400
> \ Vampire D:
> > I was looking for something like this, but super easy query or command.
> > Like Monty said, since it is block level most transactions may not "ramp" up
> > enough to really show real throughput, but some way of "testing" what the
> > devices are able to keep up with would go a long way for benchmarking,
> > scaling, sizing, and making sure everything keeps running "up to snuff".
>
> in the drbd tarball in the benchmark subdir,
> we have a very simple tool called dm
> (I don't remember why those letters).
> you can also get it from
> http://svn.drbd.org/drbd/trunk/benchmark/dm.c
> compile it: gcc -O2 -o dm dm.c
>
> it is basically some variation on the "dd" tool,
> but you can switch on "progress" and "throughput" output,
> and you can switch on fsync() before close.
> it just does sequential io.
>
> to benchmark WRITE throughput,
> you use it like this:
> ./dm -a 0 -b 1M -s 500M -y -m -p -o $out_file_or_device
>
> this will print lots of 'R's (requested by "-m", 500 of them to be
> exact, one for each "block" (-b) up to the requested "size" (-s)).
> the first of those Rs will print very fast; if you request several Gig
> you will see it "hang" for a short time every few "R"s, and finally it
> will hang for quite a while (that's the fsync requested by -y).
> finally it will tell you the overall throughput.
>
> if you leave off the fsync (-y), you will get very fast writes, as long
> as they fit in some of the involved caches... this would be the
> "virtual" throughput seen by most processes which don't use fsync.
> but these are not very useful to figure out bottlenecks in the drbd
> configuration and general setup.
>
> you can tell dm where to write its data: use "-l 378G", and it will (try
> to) seek 378G into the device (a file would probably just end up sparse,
> which is not of particular interest). so if you have one disk of 400G,
> with one partition using the whole 400G, you could benchmark the "inner"
> 10G and the "outer" 10G by using different offsets here.
>
> you will notice that the throughput differs significantly when using
> inner or outer cylinders of your disks.
>
> example run with "-b 1M -s 2M":
> RR
> 10.48 MB/sec (2097152 B / 00:00.190802)
>
> if you don't like the "R"s, leave off the -m ...
>
> to measure local io bandwidth, you can use it directly on the lower
> level device (or an equivalent dummy partition).
> !!this is destructive!!
> !!you will have to recreate a file system on that thing!!
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/vg00/dummy
>
> to measure local io bandwidth including file system overhead:
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /mnt/dummy/dummy-out
>
> to measure drbd performance in disconnected mode:
> drbdadm disconnect dummy
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/drbd9
>
> (be prepared for some additional latency,
> drbd housekeeping has to remember which
> blocks are dirty now...)
>
> ... in connected mode
> drbdadm connect dummy
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/drbd9
>
> still, the first write may be considerably slower than successive runs
> of the same command, since the activity log will be "hot" after the
> first one (as long as the size fits in the activity log completely).
>
> ...
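
(A quick aside from me before the file-system part below: to compare the
"inner" and "outer" regions mentioned above, I will probably wrap the
destructive write run in a small loop like this rough sketch. The device
name and the offsets are only placeholders for a ~400G scratch device;
everything at those offsets gets overwritten.)

#!/bin/sh
# rough sketch, assuming ./dm was built from benchmark/dm.c
# WARNING: destructive, overwrites data at the given offsets
DEV=/dev/vg00/dummy          # placeholder scratch device
for off in 1G 390G; do       # near the start and near the end of a ~400G disk
  echo "=== offset $off ==="
  for run in 1 2 3; do       # repeat runs for more reliable numbers
    ./dm -a 0 -b 1M -s 500M -y -p -l "$off" -o "$DEV"   # no -m, to skip the R's
  done
done
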
> with file system
> mkfs.xfs /dev/drbd9 ; mount /dev/drbd9 /mnt/drbd9-mount-point
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /mnt/drbd9-mount-point/dummy-out
>
> if you want to see the effect on power usage when writing 0xff instead
> of 0x00, use "-a 0xff" :)
>
> if you want to see the effect of the drbd activity log, use a size
> considerably larger than what you configured as al-extents.
>
> maybe you want to use "watch -n1 cat /proc/drbd" at the same time,
> so you can see the figures move: the pe go up, the lo go up sometimes,
> the ap go up, the dw and ns increase all the time, the al increasing not
> too often, and finally the pe, lo, and ap fall back to zero...
>
> if you like, you could use
> watch -n1 "cat /proc/drbd ; netstat -tn | grep -e ^Proto -e ':7788\>'"
> which would also show you the drbd socket buffer usage, in case 7788 is
> your drbd port. if you are curious, you should run this on both nodes.
>
> to see the effect of resync on that, you could invalidate one node
> (causing a full sync), and benchmark again.
> then play with the sync rate parameter.
>
> to be somewhat more reliable,
> you should repeat each command several times.
>
> to benchmark READ throughput, you use
> ./dm -o /dev/null -b 1M -s 500M -m -p -i /dev/sdx
> ./dm -o /dev/null -b 1M -s 500M -m -p -i /dev/drbd9
> be careful: you'll need to use a size _considerably_ larger than your
> RAM, or you'll see the linux caching effects on the second usage.
> of course, you could also "shrink" the caches first.
> to do so, since 2.6.16, you can
> echo 3 > /proc/sys/vm/drop_caches
> to get clean read throughput results.
> before that, you can allocate and use huge amounts of memory, like this:
> perl -e '$x = "X" x (1024*1024*500)'
> # this would allocate and use about 1 GB; it uses about twice as much as
> # you say in those brackets... use as much as you have RAM (as long as
> # you have some swap available) and the caches will shrink :)
>
> or, even easier: you can just seek into the input device to some area
> that is unlikely to have been read before:
> ./dm -o /dev/null -b 1M -s 500M -m -p -k 7G -i /dev/drbd9
> "-k 7G" makes it seek 7 gig into the given input "file".
>
> you will notice here, too, that read performance varies considerably
> between the "inner" and "outer" cylinders.
> the difference can be as big as 50MB/sec inner and 30MB/sec outer.
>
> you can also benchmark network throughput with dm,
> if you utilize netcat. e.g.,
> me at x# nc -l -p 54321 -q0 >/dev/null
> me at y# dm -a 0 -b 1M -s 500M -m -p -y | nc x 54321 -q0
> run two of them in reverse directions to see if your full duplex GigE
> does what you think it should ...
>
> ...
>
> you got the idea.
>
> at least this is what we use to track down problems at customer clusters.
> maybe sometime we will script something around it, but most of the time
> we like the flexibility of using the tool directly.
> actually, most of the time we rather use a "data set size" of 800M to
> 6G...
>
> but be prepared for a slight degradation once you cross the size of the
> activity log (al-extents parameter), as then drbd has to do synchronous
> updates to its meta data area for every additional 4M.
>
> --
> : Lars Ellenberg                            Tel +43-1-8178292-0  :
> : LINBIT Information Technologies GmbH      Fax +43-1-8178292-82 :
> : Schoenbrunner Str. 244, A-1120 Vienna/Europe  http://www.linbit.com :
> __
> please use the "List-Reply" function of your email client.
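
One thing from the read-benchmark part above that I will script right away
(again a rough sketch of my own, not from the mail): dropping the page
cache before every read run, so repeated measurements stay comparable.
It needs root and a 2.6.16+ kernel for /proc/sys/vm/drop_caches, and
/dev/drbd9 is just the device from the examples.

#!/bin/sh
# rough sketch: sequential read benchmark with the cache dropped each run
DEV=/dev/drbd9               # placeholder, device to read from
for run in 1 2 3; do         # repeat for more stable numbers
  sync                       # flush dirty pages first
  echo 3 > /proc/sys/vm/drop_caches
  ./dm -o /dev/null -b 1M -s 500M -p -i "$DEV"
done
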
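And for the "two of them in reverse directions" network check, something
like this (port 54322 for the second direction is my own choice, not from
the mail); start the matching halves on each node:

# on node x:
nc -l -p 54321 -q0 > /dev/null &                    # sink for the y -> x stream
./dm -a 0 -b 1M -s 500M -m -p -y | nc y 54322 -q0   # push x -> y

# on node y:
nc -l -p 54322 -q0 > /dev/null &                    # sink for the x -> y stream
./dm -a 0 -b 1M -s 500M -m -p -y | nc x 54321 -q0   # push y -> x

If both directions stay close to the single-direction figures, the full
duplex GigE link does what it should.
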
--
"Do the actors on Unsolved Mysteries ever get arrested because they look
just like the criminal they are playing?"

Christopher