<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
</head>
<body bgcolor="#ffffff" text="#000000">
Hi,<br>
<br>
I was playing today with DRBD's settings and benchmarking the results
in order to obtain the best performance.<br>
Here is my test setup:<br>
2 identical machines with SAS storage boxes. Each machine has two 2TB
devices (in my case /dev/sdb and /dev/sdc) that I mirror over DRBD,
with LVM set up on top of them. The nodes share a Gbit link dedicated
to DRBD traffic. After the initial sync, which took around 20 hours to
finish, I created the LVM volume and formatted it with ext3. Then I
started to play around with parameters like al-extents,
unplug-watermark, max-buffers and max-epoch-size, changing the values
and doing a drbdadm adjust all on each node (after copying the updated
config file over first, of course). In the beginning it went pretty
well; the best result from the dd test over DRBD was 28.9 MB/s:<br>
<br>
[root@erebus testing]# dd if=/dev/zero of=test.dat bs=1G count=1
oflag=dsync<br>
1+0 records in<br>
1+0 records out<br>
1073741824 bytes (1.1 GB) copied, 37.1114 seconds, 28.9 MB/s<br>
<br>
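For reference, each tuning round looked roughly like this (just a
sketch of what I described above; the config path and the scp step are
simply how I happen to push the file to the other node):<br>
<blockquote><i># edit the syncer/net options on this node<br>
vi /etc/drbd.conf<br>
# copy the same config over to the peer node<br>
scp /etc/drbd.conf root@erebus:/etc/drbd.conf<br>
# apply the changed settings on both nodes without restarting drbd<br>
drbdadm adjust all<br>
# re-run the benchmark on the mounted volume<br>
dd if=/dev/zero of=test.dat bs=1G count=1 oflag=dsync</i><br>
</blockquote>
<br>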
The configuration I used is included at the end. After a couple more
tests I noticed a big drop in performance, getting only around 19-20
MB/s, so I checked /proc/drbd to see what was going on. Surprisingly,
it was doing a full resync on one of the disks. The problem is that I
don't understand why, since normally it should only resync the blocks
that are actually out of sync.<br>
<br>
Output from /proc/drbd is as follows:<br>
<br>
<blockquote><i>version: 8.2.6 (api:88/proto:86-88)</i><br>
<i>GIT-hash: 3e69822d3bb4920a8c1bfdf7d647169eba7d2eb4 build by
<a class="moz-txt-link-abbreviated" href="mailto:root@leviathan.nl.imc.local">root@leviathan.nl.imc.local</a>, 2008-06-23 11:34:01</i><br>
<i> 0: cs:SyncTarget st:Secondary/Primary ds:Inconsistent/UpToDate C
r---</i><br>
<i> ns:0 nr:184932480 dw:239670288 dr:2127855509 al:36724
bm:142013 lo:30 pe:235 ua:29 ap:0 oos:1952003580</i><br>
<i> [>...................] sync'ed: 8.6% (1906253/2084799)M</i><br>
<i> finish: 8:54:23 speed: 60,812 (53,284) K/sec</i><br>
<i> 1: cs:Connected st:Secondary/Primary ds:UpToDate/UpToDate C r---</i><br>
<i> ns:0 nr:0 dw:33488348 dr:2134854516 al:31586 bm:130427 lo:0
pe:0 ua:0 ap:0 oos:0</i><br>
</blockquote>
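In case it helps with diagnosing this, I guess the kernel log from
around the time the resync started is the place to look for the
trigger, something along these lines (a rough sketch; the persistent
log path is just the syslog default here):<br>
<blockquote><i># drbd reports connection and resync events via the kernel log<br>
dmesg | grep -i drbd | tail -50<br>
# or in the persistent syslog (path assumed)<br>
grep -i drbd /var/log/messages | tail -50</i><br>
</blockquote>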
<br>
Here is my drbd.conf file (the version I got the best result with,
28.9 MB/s):<br>
<br>
<blockquote><i>global { <br>
usage-count no; <br>
}<br>
<br>
common {<br>
protocol C;<br>
syncer {<br>
rate 110M;<br>
}<br>
}<br>
<br>
resource drbd0 {<br>
on leviathan {<br>
device /dev/drbd0;<br>
disk /dev/sdb;<br>
address 10.0.0.10:7789;<br>
meta-disk internal;<br>
}<br>
on erebus {<br>
device /dev/drbd0;<br>
disk /dev/sdb;<br>
address 10.0.0.20:7789;<br>
meta-disk internal;<br>
}<br>
syncer {<br>
rate 110M;<br>
al-extents 641;<br>
}<br>
net {<br>
#on-disconnect reconnect;<br>
after-sb-0pri disconnect;<br>
after-sb-1pri disconnect;<br>
max-epoch-size 8192;<br>
max-buffers 8192;<br>
unplug-watermark 128;<br>
}<br>
}<br>
<br>
resource drbd1 {<br>
on leviathan {<br>
device /dev/drbd1;<br>
disk /dev/sdc;<br>
address 10.0.0.10:7790;<br>
meta-disk internal;<br>
}<br>
on erebus {<br>
device /dev/drbd1;<br>
disk /dev/sdc;<br>
address 10.0.0.20:7790;<br>
meta-disk internal;<br>
}<br>
syncer {<br>
rate 110M;<br>
al-extents 641;<br>
}<br>
net {<br>
#on-disconnect reconnect;<br>
after-sb-0pri disconnect;<br>
after-sb-1pri disconnect;<br>
max-epoch-size 8192;<br>
max-buffers 8192;<br>
unplug-watermark 128;<br>
}<br>
}</i><br>
</blockquote>
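(In case it matters: to make sure the values above are what the
devices are actually running with, the live configuration can be
dumped with something like the following; I believe this is the right
drbdsetup syntax for 8.2.)<br>
<blockquote><i># show the configuration the running device is actually using<br>
drbdsetup /dev/drbd0 show<br>
drbdsetup /dev/drbd1 show<br>
# and the current connection/sync state<br>
cat /proc/drbd</i><br>
</blockquote>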
Does anyone have any idea what caused the full resync and how I can
avoid it in the future?<br>
<br>
Thanks and regards,<br>
<br>
Andrei Neagoe.<br>
<br>
</body>
</html>