[DRBD-user] Poor DRBD performance, HELP!

Noah Mehl noah at tritonlimited.com
Sun Jun 19 01:48:40 CEST 2011


On Jun 18, 2011, at 1:22 PM, Digimer wrote:

> On 06/17/2011 11:04 AM, Noah Mehl wrote:
>> Here is the drbd status from /proc/drbd:
>> 
>> version: 8.3.8.1 (api:88/proto:86-94)
>> GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by gardner@, 2011-05-21 19:18:16
>>  0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r---d
>>     ns:0 nr:0 dw:504388120 dr:2124 al:123972 bm:123845 lo:380 pe:0 ua:0 ap:381 ep:1 wo:b oos:13671456264
>>  1: cs:WFConnection ro:Secondary/Unknown ds:UpToDate/DUnknown C r----
>>     ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:13669326412
>> 
>> As you can see, I haven't connected the other node, so I'm assuming that the network has nothing to do with the performance at this point.
> 
> Is there a difference when your resource is StandAlone/Primary?
> 
> Of course, this is an artificial score, as you will be Connected 
> normally. What is your network capable of? try copying to/from a ramdisk 
> and see what speed you get (both alone and using ramdisk to back a DRBD 
> resource).
> 
> -- 
> Digimer
> E-Mail:              digimer at alteeve.com
> Freenode handle:     digimer
> Papers and Projects: http://alteeve.com
> Node Assassin:       http://nodeassassin.org
> "I feel confined, only free to expand myself within boundaries."

I wrote to Digimer by accident, without sending my response to the list:

I'm not familiar with StandAlone/Primary; I will look that up in the user guide and on Google. At the moment the other node is not connected at all. It will eventually be connected using a dual 10Gb myri with LACP. I don't understand why you're asking about the network when I'm comparing the lower-level physical device to the DRBD resource on a single node. Can I not reliably test the disk performance without both nodes connected? Thanks for your help.

In the meantime, I've run the benchmark again, with the Connection State as StandAlone and the Resource Role as Primary.  Here's the /proc/drbd:

version: 8.3.8.1 (api:88/proto:86-94)
GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by gardner@, 2011-05-21 19:18:16
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:0 dw:1141411844 dr:3340 al:279508 bm:279381 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:13671456264
 1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:13669326412

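And the results (the first five runs are on the DRBD resource; the five after the failed 'drbdsetup 0 down' are on the lower-level device):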

16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 158.838 s, 433 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 154.165 s, 446 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 158.384 s, 434 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 155.494 s, 442 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 159.096 s, 432 MB/s
0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 0 down' terminated with exit code 11
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 88.3911 s, 777 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 93.6837 s, 734 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 89.5447 s, 767 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 92.3562 s, 744 MB/s
16777216+0 records in
16777216+0 records out
68719476736 bytes (69 GB) copied, 91.5547 s, 751 MB/s

In the last test the connection state was WFConnection, and the results ranged from 330 MB/s to 437 MB/s.  So I think I'm still in that range, and still very far from the lower-level device.
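
For reference, here is a rough sketch of how the averages work out from the dd output above. It assumes the first five runs are the DRBD resource and the last five are the lower-level device; the variable and function names are only for illustration.

```python
# Bytes copied per dd run, taken from the dd output above.
BYTES_PER_RUN = 68719476736

# Elapsed seconds per run, copied from the dd output above.
# Assumption: first five runs = DRBD resource, last five = lower-level device.
drbd_seconds = [158.838, 154.165, 158.384, 155.494, 159.096]
lower_seconds = [88.3911, 93.6837, 89.5447, 92.3562, 91.5547]


def mean_mb_per_s(seconds):
    """Average throughput in MB/s (dd reports decimal megabytes, 10**6 bytes)."""
    rates = [BYTES_PER_RUN / s / 1e6 for s in seconds]
    return sum(rates) / len(rates)


drbd_mean = mean_mb_per_s(drbd_seconds)
lower_mean = mean_mb_per_s(lower_seconds)

print(f"DRBD resource:      {drbd_mean:.0f} MB/s")
print(f"lower-level device: {lower_mean:.0f} MB/s")
print(f"DRBD / lower-level: {drbd_mean / lower_mean:.0%}")
```

So on these numbers the DRBD resource is getting a bit under 60% of what the lower-level device delivers.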

Also, I connected a single 10Gb between the two nodes and ran iperf:

[root at storageb ~]# iperf -c 10.0.100.241
------------------------------------------------------------
Client connecting to 10.0.100.241, TCP port 5001
TCP window size: 27.8 KByte (default)
------------------------------------------------------------
[  3] local 10.0.100.242 port 57982 connected with 10.0.100.241 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  11.5 GBytes  9.86 Gbits/sec


So, 9.86 Gbps, not bad. I think that equates to 1.2325 GB/s, so it should be enough for the 751 MB/s on the lower-level device.  Again, I still don't understand why my lower-level device does 750 MB/s and the DRBD resource only does 430 MB/s.  Any thoughts?
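
The conversion I'm using, as a quick sketch. It only divides the iperf line rate by 8 bits per byte and ignores any TCP or DRBD protocol overhead, so treat the headroom figure as an upper bound; the function name is just illustrative.

```python
def gbit_to_mbyte_per_s(gbit_per_s):
    """Convert a rate in Gbit/s to MB/s (decimal units, 8 bits per byte)."""
    return gbit_per_s * 1000 / 8


link_mb_s = gbit_to_mbyte_per_s(9.86)  # iperf result on the single 10Gb link
lower_level_mb_s = 751.0               # lower-level device figure quoted above

print(f"link:     {link_mb_s:.1f} MB/s")
print(f"disk:     {lower_level_mb_s:.1f} MB/s")
print(f"headroom: {link_mb_s - lower_level_mb_s:.1f} MB/s")
```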

~Noah





