<div class="gmail_quote">On Fri, May 1, 2009 at 1:16 PM, Gennadiy Nerubayev <span dir="ltr">&lt;<a href="mailto:parakie@gmail.com">parakie@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="gmail_quote"><div class="im">On Fri, May 1, 2009 at 10:34 AM, Lars Ellenberg <span dir="ltr">&lt;<a href="mailto:lars.ellenberg@linbit.com" target="_blank">lars.ellenberg@linbit.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
what is your micro benchmark?</blockquote></div><div><br>iometer.. which is not particularly micro :p <br><br></div><div class="im"><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
for sequential write throughput micro benchmark,<br>
I suggest<br>
<br>
dd if=/dev/zero of=/dev/drbdX bs=4M count=1000 oflag=direct<br>
<br>
do variations in bs= and count= (to reveal possible issues<br>
with cpu cache sizes).<br>
<br>
also do variations of<br>
oflag=direct:<br>
&nbsp;&nbsp;no page cache/buffer cache involved,<br>
oflag=dsync:<br>
&nbsp;&nbsp;completely through buffer cache/page cache,<br>
&nbsp;&nbsp;but does the equivalent of "fsync" for every "bs"<br>
no oflag, but conv=fsync:<br>
&nbsp;&nbsp;completely through buffer cache/page cache,<br>
&nbsp;&nbsp;and does a real fsync only once all count * bs<br>
&nbsp;&nbsp;blocks are written<br>
<br>
smallish bs (&lt; the size of your cpu cache), say bs=32k, high count,<br>
and oflag=direct is closest to what the resync is doing.</blockquote></div><div><br>I did a number of dd runs; the results are attached. The 32k direct writes perform the worst when connected.<br></div><div class="im">
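<div>As a rough sketch of the dd matrix suggested in the quoted text: the script below only prints the command lines (a dry run) instead of executing them, because writing to the DRBD device overwrites its contents. The <code>TARGET</code> variable, the <code>dd_variants</code> helper, and the bs/count pairs are assumptions made for illustration; they are not the runs from the attached results.</div>

```shell
#!/bin/sh
# Dry run: print the dd command lines instead of executing them,
# because writing to the DRBD device destroys its contents.
# TARGET defaults to the /dev/drbdX placeholder used in the thread.
TARGET=${TARGET:-/dev/drbdX}

# Emit the three flushing variants for one bs/count pair:
#   oflag=direct  bypasses the page cache
#   oflag=dsync   goes through the page cache, syncs after every block
#   conv=fsync    goes through the page cache, one fsync at the end
dd_variants() {
    bs=$1
    count=$2
    for mode in oflag=direct oflag=dsync conv=fsync; do
        echo "dd if=/dev/zero of=$TARGET bs=$bs count=$count $mode"
    done
}

# Illustrative bs/count pairs; each writes about 4000 MiB in total.
dd_variants 4M 1000
dd_variants 1M 4000
dd_variants 32k 128000
```

<div>Once the printed list has been reviewed and <code>TARGET</code> points at a device that may be overwritten, the output can be piped to <code>sh</code> to run the commands one after another.</div>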
<div> <br></div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
you can also start pinning your "dd" to a single cpu,<br>
preferably the same one your DRBD kernel threads are running on.<br>
or allow only the first two cores, or whatever.</blockquote></div><div><br>Each box has a single dual-core CPU without hyperthreading. I can repeat the benchmarks on one core - would passing nosmp and maxcpus=1 to the kernel be sufficient for this test case?</div>
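<div>For the pinning suggestion, one option that avoids a reboot is <code>taskset</code> from util-linux. The sketch below is again a dry run that only prints the pinned command lines. The <code>pinned_dd</code> helper, the CPU numbers, and the bs/count values are hypothetical, and it assumes the DRBD kernel threads have "drbd" in their names so that <code>ps</code> can show which CPU they last ran on.</div>

```shell
#!/bin/sh
# Dry run: print a dd command pinned to the given CPUs via taskset.
# The CPU list and the /dev/drbdX default are placeholders.
pinned_dd() {
    cpus=$1   # CPU list as accepted by taskset -c, e.g. 0 or 0,1
    shift
    echo "taskset -c $cpus dd if=/dev/zero of=${TARGET:-/dev/drbdX} $*"
}

# To see which CPU each DRBD kernel thread last ran on (PSR column):
#   ps -eo pid,psr,comm | grep drbd

# Pin to a single core, then allow the first two cores.
pinned_dd 0 bs=32k count=128000 oflag=direct
pinned_dd 0,1 bs=32k count=128000 oflag=direct
```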
</div></blockquote><div><br>Whoops, it would be nice if I had actually attached the file.<br><br>-Gennadiy <br></div></div><br>