Hello,

On Thu, 12 Jul 2012 11:36:28 +0200 Phil Stricker wrote:

> Hi!
>
> I want to set up another Xenserver pair with DRBD as its block device.
> Now I have to choose the CPU. Is DRBD able to use multiple CPU cores?
>
> The servers will own two replicated arrays and one 10 GBit/s replication
> interface:
> - array 1: 8x SSD

What kind of controller and RAID level?

> - array 2: 2x SAS
>
> Is one of the 8 cores of an Intel Sandy Bridge 2.4 GHz CPU enough for
> that drive setup, or would one core of a 4-core 3.30 GHz Sandy Bridge
> CPU be better?

Given that you're building a VM server (how many VMs?), the more cores
the better, of course. And yes, DRBD will use more than one core, and
normally you should be fine with that 8-core setup.

That said, here's an example of DRBD being CPU bound: dual Opteron 4130
(4 cores each, 2.6 GHz), 4x QDR InfiniBand replication link (32 Gb/s raw,
but IP over InfiniBand tops out around 10 Gb/s).

On recent (4 years or so, since Socket F) Opteron machines the HW
interrupts get preferentially handled by CPU 0 (in this case physical
CPU 1, core 1), so I try to place things that deal with those interrupts
on the same physical CPU. Thus the DRBD configuration has a
"cpu-mask f;" entry for these boxes (a configuration sketch follows
further down). Kernel 3.2.20, DRBD 8.3.13.

Now when doing a mkfs.ext4 on drbd1 (which is primary on the other,
remote node), this is what we get on the other node (secondary,
receiver) in top (column P indicates the logical CPU):

---
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
 1503 root      20   0     0    0    0 R   99  0.0  0:25.88  0 drbd1_receiver
 1516 root      20   0     0    0    0 S   53  0.0  0:14.19  1 drbd1_asender
    3 root      20   0     0    0    0 S    3  0.0  0:00.91  0 ksoftirqd/0
---

and in ethstats, respectively:

---
ib0: 2032.99 Mb/s In 21.08 Mb/s Out - 12340.3 p/s In 11975.7 p/s Out
---

While DRBD is obviously using more than one core and thus at least
saving us some bacon here, it is also painfully clear that whatever mkfs
is doing is driving the drbd1_receiver process up the wall. A faster CPU
would likely speed things up.

Same setup, running bonnie (intelligent/fast write) on the remote
resource (drbd1):

---
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
 1503 root      20   0     0    0    0 R   52  0.0  10:49.68 0 drbd1_receiver
 1516 root      20   0     0    0    0 S    4  0.0   5:41.64 1 drbd1_asender

ib0: 3098.42 Mb/s In 3.31 Mb/s Out - 9213.0 p/s In 5704.0 p/s Out
---

Clearly the intelligent write is easier on things and achieves a 30%
higher speed. This is writing at full speed; on a very busy mailbox
server (identical to these machines except for the RAID controller) the
DRBD processes never go over 5%, as the number of I/O operations and not
raw throughput is what limits things there long before DRBD can get busy.

As an aside, this gives me write speeds of up to 350MB/s on the DRBD
resource, whereas the backing device (7-disk RAID6 on an Adaptec RAID
51645) gives me write speeds of up to 650MB/s. Now on the aforementioned
identical cluster the only difference is the RAID controller (Areca
ARC-1880-ix-16) and a 6-disk RAID6. That one gives me about 460MB/s
write speed on either the backing device or DRBD... Clearly something is
not as identical as it looks; the replication link clocks in at the same
speed with NPtcp, so my first guess is to glare at something like
unplug-watermark.
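For the record, unplug-watermark lives in the disk section of drbd.conf;
a minimal sketch of what I would try first (the resource name and the
value here are just placeholders, the 8.3 default being 128):

---
resource r0 {
  disk {
    # Placeholder value; DRBD 8.3 accepts 16 to 131072 here.
    # Lower values kick the backing device's request queue more
    # often, which can help or hurt depending on the controller.
    unplug-watermark 16;
  }
}
---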
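And since I mentioned it above, the cpu-mask entry on these boxes looks
roughly like this (DRBD 8.3 puts it in the syncer section; the resource
name is again just a placeholder):

---
# First check which CPU fields the IRQs, e.g.: grep ib0 /proc/interrupts
resource r1 {
  syncer {
    # Hex mask of CPUs the DRBD threads may run on:
    # f = 0b1111 = cores 0-3, the first physical CPU,
    # i.e. the one that also handles the HW interrupts.
    cpu-mask f;
  }
}
---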
Lastly, this is with bonnie running on both resources in parallel (drbd0
local, drbd1 remote). You can see the DRBD processes nicely spread out,
but remaining on the first physical CPU, and bonnie being relegated to
the other physical CPU. ^o^

---
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  P COMMAND
 1503 root      20   0     0    0    0 R   50  0.0  15:42.09 0 drbd1_receiver
 5101 root      20   0 22388 1320 1108 D   30  0.0   0:13.15 6 bonnie++
 1455 root      20   0     0    0    0 D   24  0.0   0:07.32 3 drbd0_worker
 5072 root      20   0     0    0    0 D   14  0.0   0:01.78 5 jbd2/drbd0-8
 5077 root      20   0     0    0    0 D   13  0.0   0:03.82 4 flush-147:0
 1517 root      20   0     0    0    0 S    9  0.0   0:02.61 1 drbd0_asender
 1516 root      20   0     0    0    0 R    4  0.0   6:07.92 1 drbd1_asender
 1497 root      20   0     0    0    0 S    2  0.0   0:01.11 0 drbd0_receiver
---

Regards,

Christian
--
Christian Balzer        Network/Systems Engineer
chibi at gol.com        Global OnLine Japan/Fusion Communications
http://www.gol.com/