Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hello,
On Thu, 12 Jul 2012 11:36:28 +0200 Phil Stricker wrote:
> Hi!
>
> I want to set up another XenServer pair with DRBD as its block device.
> Now I have to choose the CPU. Is DRBD able to use multiple CPU cores?
>
> The servers will have two replicated arrays and one 10 Gbit/s
> replication interface:
> - array 1: 8x SSD
What kind of controller and RAID level?
> - array 2: 2x SAS
>
> Is one of the 8 cores of a 2.4 GHz Intel Sandy Bridge CPU enough for
> that drive setup, or would one core of a 4-core 3.3 GHz Sandy Bridge
> CPU be better?
Given that you're building a VM server (how many VMs?), the more cores the
better, of course.
And yes, DRBD will use more than one core, and normally you should be fine
with that 8-core setup.
That said, here's an example of DRBD being CPU bound:
Dual Opteron 4130 (4 cores each, 2.6 GHz), 4x QDR InfiniBand replication
link (32 Gb/s, but IP over InfiniBand tops out around 10 Gb/s).
On recent (4 years? since socket F) Opteron machines the HW interrupts get
handled preferentially by CPU 0 (in this case physical CPU 1, core 1), so
I try to place the processes that deal with that traffic on the same
physical CPU. Thus the drbd configuration contains a "cpu-mask f;" entry
for these boxes.
Kernel 3.2.20, DRBD 8.3.13.
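For reference, a minimal sketch of where that entry lives in a DRBD 8.3
style drbd.conf (the resource name r0 is made up and the rest of the
resource definition is omitted):
---
resource r0 {
  syncer {
    # hex CPU mask: 0xf binds the DRBD threads to logical CPUs 0-3,
    # i.e. the first physical CPU on these boxes
    cpu-mask f;
  }
  # disk, net and on <host> sections omitted
}
---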
Now when doing a mkfs.ext4 on drbd1 (which is primary on the other,
remote node), this is what we get in top on this node (the secondary,
i.e. the receiver); the P column indicates the logical CPU:
---
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
1503 root 20 0 0 0 0 R 99 0.0 0:25.88 0 drbd1_receiver
1516 root 20 0 0 0 0 S 53 0.0 0:14.19 1 drbd1_asender
3 root 20 0 0 0 0 S 3 0.0 0:00.91 0 ksoftirqd/0
---
and the corresponding ethstats output:
---
ib0: 2032.99 Mb/s In 21.08 Mb/s Out - 12340.3 p/s In 11975.7 p/s Out
---
While DRBD is obviously using more than one core, and thus at least saving
our bacon here, it is also painfully clear that whatever mkfs is doing
is driving the drbd receiver process up the wall.
A faster CPU would likely speed things up.
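If you want to verify where the interrupts and the DRBD threads actually
land, something along these lines will do; the PID is taken from the top
output above, and the interrupt line name for your IB HCA may well differ:
---
# which logical CPUs service the InfiniBand HCA interrupts
grep -i mlx4 /proc/interrupts
# current CPU affinity of the busy receiver thread
taskset -cp 1503
---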
Same setup, running bonnie++ (intelligent/fast write) on the remote
resource (drbd1):
---
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
1503 root 20 0 0 0 0 R 52 0.0 10:49.68 0 drbd1_receiver
1516 root 20 0 0 0 0 S 4 0.0 5:41.64 1 drbd1_asender
ib0: 3098.42 Mb/s In 3.31 Mb/s Out - 9213.0 p/s In 5704.0 p/s Out
---
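For the record, the bonnie++ invocation was along these lines (mount point
and size are assumptions; -f skips the slow per-character phases, leaving
the "intelligent" block writes):
---
bonnie++ -u root -d /mnt/drbd1 -s 16g -n 0 -f
---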
Clearly the intelligent write is easier on the CPU and achieves about 30%
higher speed. And this is writing at full tilt; on a very busy mailbox
server (identical to these machines except for the RAID controller) the
drbd processes never go over 5%, as the number of I/O operations, not raw
throughput, limits things there long before DRBD can get busy.
As an aside, this gives me write speeds of up to 350MB/s on the DRBD
resource, whereas the backing device (7-disk RAID6 on an Adaptec RAID
51645) gives me write speeds of up to 650MB/s.
Now on the aforementioned near-identical cluster the only difference is the
RAID controller (Areca ARC-1880-ix-16) and a 6-disk RAID6. That one gives
me about 460MB/s write speed on either the backing device or DRBD...
Clearly something is not as identical as it should be; the replication link
clocks in at the same speed with NPtcp, so my first guess is to glare at
something like unplug-watermark.
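Should tuning come to that, here's a sketch of where that knob lives (DRBD
8.3 syntax, in the net section if I remember the man page right; the value
below is purely illustrative, the default being 128):
---
resource r0 {
  net {
    # unplug the backing device once this many write requests
    # are pending on the secondary; illustrative value only
    unplug-watermark 8192;
  }
}
---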
Lastly, this is with bonnie++ running on both resources in parallel (drbd0
local, drbd1 remote). You can see the drbd processes nicely spread out,
but remaining on the first physical CPU, and bonnie++ being relegated to
the other physical CPU. ^o^
---
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
1503 root 20 0 0 0 0 R 50 0.0 15:42.09 0 drbd1_receiver
5101 root 20 0 22388 1320 1108 D 30 0.0 0:13.15 6 bonnie++
1455 root 20 0 0 0 0 D 24 0.0 0:07.32 3 drbd0_worker
5072 root 20 0 0 0 0 D 14 0.0 0:01.78 5 jbd2/drbd0-8
5077 root 20 0 0 0 0 D 13 0.0 0:03.82 4 flush-147:0
1517 root 20 0 0 0 0 S 9 0.0 0:02.61 1 drbd0_asender
1516 root 20 0 0 0 0 R 4 0.0 6:07.92 1 drbd1_asender
1497 root 20 0 0 0 0 S 2 0.0 0:01.11 0 drbd0_receiver
---
Regards,
Christian
--
Christian Balzer Network/Systems Engineer
chibi at gol.com Global OnLine Japan/Fusion Communications
http://www.gol.com/