[DRBD-user] monitor and graph "data transfer rate" [howto benchmark using ./dm]

Vampire D vampired at gmail.com
Thu Aug 10 04:42:47 CEST 2006

Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.


wow, this is great info, thanks!

On 8/9/06, Lars Ellenberg <Lars.Ellenberg at linbit.com> wrote:
>
> / 2006-08-09 17:50:26 -0400
> \ Vampire D:
> > I was looking for something like this, but super easy query or command.
> > Like Monty said, since it is block level most transactions may not
> > "ramp" up enough to really show real throughput, but some way of
> > "testing" what the devices are able to keep up with would go a long way
> > for benchmarking, scaling, sizing, and making sure everything keeps
> > running "up to snuff".
>
> in the drbd tarball in the benchmark subdir,
> we have a very simple tool called dm
> (I don't remember why those letters).
> you can also get it from
> http://svn.drbd.org/drbd/trunk/benchmark/dm.c
> compile it: gcc -O2 -o dm dm.c
>
> it is basically some variation on the "dd" tool,
> but you can switch on "progress" and "throughput" output,
> and you can switch on fsync() before close.
> it just does sequential io.
>
> to benchmark WRITE throughput,
> you use it like this:
> ./dm -a 0 -b 1M -s 500M -y -m -p -o $out_file_or_device
>
> this will print lots of 'R's (requested by "-m", 500 of them to be
> exact, one for each "block" (-b) up to the requested "size" (-s)).
> the first of those Rs will print very fast; if you request several gig,
> you will see it "hang" for a short time every few "R"s, and finally it
> will hang for quite a while (that's the fsync requested by -y).
> at the end it will tell you the overall throughput.
>
> if you leave off the fsync (-y), you will get very fast writes, as long
> as they fit in some of the involved caches... this is the "virtual"
> throughput seen by most processes, which don't use fsync.
> but those numbers are not very useful for figuring out bottlenecks in
> the drbd configuration and general setup.
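>
> for comparison, a sketch of the same run without the fsync (everything
> else as above) would be:
> ./dm -a 0 -b 1M -s 500M -m -p -o $out_file_or_device
> the difference between the two reported throughputs gives you a rough
> feeling for how much the caches are hiding.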
>
> you can tell dm where to write its data: use "-l 378G", and it will (try
> to) seek 378G into the device (on a regular file this would probably
> just create a sparse file, which is not of particular interest). so if
> you have one 400G disk with a single 400G partition on it, you could
> benchmark the "inner" 10G and the "outer" 10G by using different offsets
> here.
>
> you will notice that the throughput differs significantly when using
> inner or outer cylinders of your disks.
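>
> for that 400G example, something like this should do (device name is
> just a placeholder, and again this is destructive):
> ./dm -a 0 -b 1M -s 10G -y -m -p -o /dev/sdx1           # first 10G
> ./dm -a 0 -b 1M -s 10G -y -m -p -l 390G -o /dev/sdx1   # last 10G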
>
> example run with "-b 1M -s 2M":
>        RR
>        10.48 MB/sec (2097152 B / 00:00.190802)
>
> if you don't like the "R"s, leave off the -m ...
>
> to measure local io bandwidth, you can use it directly on the lower
> level device (or an equivalent dummy partition).
>        !!this is destructive!!
>        !!you will have to recreate a file system on that thing!!
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/vg00/dummy
>
> to measure local io bandwidth including file system overhead:
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /mnt/dummy/dummy-out
>
> to measure drbd performance in disconnected mode:
> drbdadm disconnect dummy
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/drbd9
>
> (be prepared for some additional latency,
> drbd housekeeping has to remember which
> blocks are dirty now...)
>
> ... in connected mode
> drbdadm connect dummy
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/drbd9
>
> still, the first write may be considerably slower than successive runs
> of the same command, since the activity log will be "hot" after the
> first one (as long as the size fits in the activity log completely)
>
> ... with file system
> mkfs.xfs /dev/drbd9 ; mount /dev/drbd9 /mnt/drbd9-mount-point
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /mnt/drbd9-mount-point/dummy-out
>
> if you want to see the effect on power usage when writing 0xff instead
> of 0x00, use "-a 0xff" :)
>
> if you want to see the effect of the drbd activity log, use a size
> considerably larger than what you configured as al-extents.
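>
> e.g. if your al-extents setting covers roughly 1G (one extent is 4M),
> something like
> ./dm -a 0 -b 1M -s 2G -y -m -p -o /dev/drbd9
> will exercise the activity log, while a 500M run stays inside it.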
>
> maybe you want to run "watch -n1 cat /proc/drbd" at the same time,
> so you can see the figures move: pe goes up, lo goes up sometimes,
> ap goes up, dw and ns increase all the time, al increases only now and
> then, and finally pe, lo, and ap fall back to zero...
>
> if you like, you could use
> watch -n1 "cat /proc/drbd ; netstat -tn | grep -e ^Proto -e ':7788\>'"
> which would also show you the drbd socket buffer usage, in case 7788 is
> your drbd port.  if you are curious, you should run this on both nodes.
>
> to see the effect of resync on that, you could invalidate one node
> (cause a full sync), and benchmark again.
> then play with the sync rate parameter.
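>
> e.g. (resource name "dummy" as above, rate is just an example value):
> drbdadm invalidate dummy       # on the node that should be resynced
> ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/drbd9
> and for the sync rate, in drbd.conf something like
>   syncer { rate 10M; }
> then "drbdadm adjust dummy" and benchmark again.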
>
> to be somewhat more reliable,
> you should repeat each command several times.
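>
> e.g. a trivial loop:
> for i in 1 2 3; do ./dm -a 0 -b 1M -s 500M -y -m -p -o /dev/drbd9; done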
>
> to benchmark READ throughput, you use
> ./dm -o /dev/null -b 1M -s 500M -m -p -i /dev/sdx
> ./dm -o /dev/null -b 1M -s 500M -m -p -i /dev/drbd9
> be careful: you'll need to use a size _considerably_ larger than your
> RAM, or you'll see the linux caching effects on the second usage.
> of course, you could also "shrink" the caches first.
> to do so, since 2.6.16, you can
> echo 3 > /proc/sys/vm/drop_caches
> to get clean read throughput results.
> before that, you can allocate and use huge amounts of memory, like this:
> perl -e '$x = "X" x (1024*1024*500)'
> # would allocate and use about 1 GB, it uses about twice as much as you
> # say in those brackets... use as much as you got RAM (as long as you
> # have some swap available) and the caches will shrink :)
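>
> so a clean repeated read benchmark could look like (sketch, 2.6.16+):
> echo 3 > /proc/sys/vm/drop_caches
> ./dm -o /dev/null -b 1M -s 500M -m -p -i /dev/drbd9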
>
> or, even easier: you can just seek into the input device to some area
> where it is unlikely to have been read before:
> ./dm -o /dev/null -b 1M -s 500M -m -p -k 7G -i /dev/drbd9
> "-k 17G" makes it seek 17 gig into the given input "file".
>
> you will notice here, too, that read performance varies considerably
> between the "inner" and "outer" cylinders.
> the difference can be as large as 50MB/sec versus 30MB/sec.
>
>
> you can also benchmark network throughput with dm,
> if you utilize netcat. e.g.,
> me@x# nc -l -p 54321 -q0 >/dev/null
> me@y# dm -a 0 -b 1M -s 500M -m -p -y | nc x 54321 -q0
> two of them in reverse directions to see if your full duplex GigE does
> what you think it should ...
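>
> e.g. for the reverse direction at the same time (port number is
> arbitrary):
> me@y# nc -l -p 54322 -q0 >/dev/null
> me@x# dm -a 0 -b 1M -s 500M -m -p -y | nc y 54322 -q0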
>
> ...
>
> you got the idea.
>
> at least this is what we use to track down problems at customer clusters.
> maybe some day we will script something around it, but most of the time
> we like the flexibility of using the tool directly.
> actually, most of the time we rather use a "data set size" of 800M to
> 6G...
>
> but be prepared for a slight degradation once you cross the size of the
> activity log (al-extents parameter), as then drbd has to do synchronous
> updates to its meta data area for every additional 4M.
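>
> (back-of-the-envelope: assuming al-extents were set to 257, the activity
> log would cover about 257 * 4M = 1028M, so an 800M run stays within it,
> while a 2G run is guaranteed to cause those extra meta data updates.)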
>
>
> --
> : Lars Ellenberg                                  Tel +43-1-8178292-0  :
> : LINBIT Information Technologies GmbH            Fax +43-1-8178292-82 :
> : Schoenbrunner Str. 244, A-1120 Vienna/Europe   http://www.linbit.com :



-- 
"Do the actors on Unsolved Mysteries ever get arrested because they look
just like the criminal they are playing?"

Christopher

