[DRBD-user] Problems with drbd performance with Xen

Igor Neves igor at 3gnt.net
Fri Feb 20 13:19:56 CET 2009


Igor Neves wrote:
> Hi,
>
> Lars Ellenberg wrote:
>> On Mon, Feb 16, 2009 at 10:51:45AM +0000, Igor Neves wrote:
>>   
>>> Hi,
>>>
>>> Some of you may know my problem since I have been discussing it with you
>>> on IRC over the last few days, but I will explain it.
>>>
>>> I'm testing drbd with xen virtual machines. Our current solution uses
>>> vmware and drbd, but it performs very poorly compared to xen, so I'm
>>> trying to move our solution to xen.
>>>
>>> I started my tests with xen on both machines (node1 & node2). Both
>>> machines have 3 disks; sdb is the backing disk for drbd0.
>>>
>>> According to Seagate, the disk can only deliver 102MB/sec writing, and I
>>> get about 105MB/sec from each disk with direct access.
>>> I started my resource in drbd and I'm able to sync at about 95MB/sec,
>>> which is very good. Testing the disk with dd or with your benchmark
>>> tool called md, I get about 85MB/sec writing, which is not bad.
>>>
>>> Now when I start the Xen Windows HVM virtual machine on node1 and give it
>>> "/dev/drbd0" as a block device, it works great, but inside the Windows VM
>>> I get:
>>> - With the drbd 8.2.6 rpm package from redhat and "al-extents 1801" in
>>> the configuration, I get about 45MB/sec writing inside the VM.
>>> - With the drbd 8.2.7 and 8.3.0 rpms from linbit and "al-extents 1801",
>>> I get about 4MB/sec writing inside the VM.
>>> - With 8.2.7 and 8.3.0 built from linbit source and "al-extents 1801", I
>>> again get 4MB/sec writing inside the VM.
>>>
>>> What I have found is that all these tests were made with drbd in the
>>> Connected and UpToDate state. If, in the middle of the test, I run
>>> "drbdadm disconnect resource" on node2, node1 goes to WFConnection and
>>> UpToDate, and from that moment I start getting 85MB/sec.
>>>
>>> So inside the Xen Windows HVM virtual machine, with drbd under it as a
>>> block device, I get:
>>> - drbd 8.2.6 from redhat: connected = 45MB/sec, disconnected = 85MB/sec
>>> - drbd 8.2.7 from linbit: connected = 4MB/sec, disconnected = 85MB/sec
>>> - drbd 8.3.0 from linbit: connected = 4MB/sec, disconnected = 85MB/sec
>>>     
>>
>> in 8.2.7, we introduced a new method to ensure write ordering
>> guarantees, which now needs to be explicitly disabled
>> if you live on a "safe" device (battery backed write cache,
>> or no cache at all), where this hurts performance.
>>
>> so where you had "no-disk-flushes" in drbd.conf before,
>> you now need an additional "no-disk-barrier".
>>
>> if that does not help, or if you did not have "no-disk-flushes" before,
>> let me know, and I will try to come up with something else.
>>   
>
> Ok, I have added "no-disk-barrier" and now I get about 50MB/sec inside
> the VM.
>
> Outside the VM, on the host, I get about 85MB/sec, and with the resource
> disconnected I get 85MB/sec.
>
> Is there something we can do to recover this 35MB/sec difference?
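
As a rough sanity check on the figures quoted so far, the small sketch below expresses each connected write rate as a fraction of the 85MB/sec seen with the resource disconnected. The numbers are the ones reported in this thread; the dictionary layout, the labels and the helper name `fraction_of_baseline` are only illustrative.

```python
# Write throughput figures (MB/sec) as quoted in this thread.
# Names and layout here are illustrative assumptions, not part of drbd.
BASELINE_DISCONNECTED = 85  # rate with the resource disconnected

connected = {
    "8.2.6 (redhat)": 45,
    "8.2.7 (linbit)": 4,
    "8.3.0 (linbit)": 4,
    "after no-disk-barrier": 50,
}


def fraction_of_baseline(mb_per_sec, baseline=BASELINE_DISCONNECTED):
    """Return a connected write rate as a fraction of the disconnected rate."""
    return mb_per_sec / baseline


for label, rate in connected.items():
    share = fraction_of_baseline(rate)
    print(f"{label:22s} {rate:3d} MB/sec  {share:.0%} of disconnected")
```

Read this way, adding "no-disk-barrier" brings the VM from a few percent of the disconnected rate to somewhat more than half of it.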

I think I have found the problem, and it's related to bonding.

To test this properly, I have set up the full disk on the HBA, and
now I have /dev/sda, which can do this:
Reading: ~568 MB/s
Writing: ~472 MB/s

I set up drbd0 to attach to this disk, and with the resource disconnected I
get about 260 MB/s writing (still far from the backend device).
With drbd0 connected, using bond0 for replication, I get 60MB/sec.

My question is, what is happening?

PS: bond0 was tested with iperf, and I get 1.8Gbit/sec between the
nodes over the same link drbd uses.

>
>>   
>>> I know you may ask: isn't it the Xen block device driver?
>>> - Yes, it could be, so I added the other disk (exactly equal to the one
>>> in the drbd resource and with the same throughput) with direct access, and
>>> inside THE SAME Windows virtual machine I get 100MB/sec writing to that
>>> disk, so the Xen block device is doing fine and working like a
>>> charm. Besides that, the Xen block device does not know about "connected"
>>> and "disconnected" as I explained in the earlier example.
>>>
>>> I have run all these tests again on node2 as Brian suggested, but I get
>>> exactly the same values.
>>>
>>> So there is a problem here with drbd<->xen, something I have not managed
>>> to find, which could be a bug or maybe some bad configuration on my side.
>>>
>>> Please, could someone with more experience help out with this?
>>>
>>> My kernel version is "2.6.18-92.1.22.el5xen" and I'm running CentOS 5
>>> with all the latest updates available.
>>>
>>> Thanks,
>>> Igor Neves
>>>     
>
> Thanks,
-- 
Igor Neves <igor.neves at 3gnt.net>
3GNTW - Tecnologias de Informação, Lda
 
 SIP: igor at 3gnt.net	JID: igor at 3gnt.net 
 ICQ: 249075444		MSN: igor at 3gnt.net
 TLM: 00351914503611	PSTN: 00351252377120

