Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Hello,

We are seeing a performance issue when writing to drbd devices on our cluster. Write throughput averages about 12MB/s, although both the disks and the network can deliver more than 100MB/s.

The setup in brief:

- two nodes, reasonably powerful hardware
- internal RAID controller with RAID-10 configured
- direct 1Gbit network interlink used only for syncing the drbd devices
- Ubuntu 12.04 LTS server, kernel 3.2.0-53, DRBD 8.3.11
- synchronous replication protocol used for the drbd devices
- no fancy drbd tunables set (e.g. "rate")

The storage stack on each node looks like this:

  Physical disks -> RAID-10 -> LVM2 logical volume -> drbd device

- We tested the network connection simply with netcat and achieved the expected 110MB/s throughput of a proper 1Gbit interlink.
- We checked the write throughput of the backend storage on each node by dd'ing data to a logical volume and achieved write speeds of about 250MB/s on average.

Hence, drbd should perform quite well and should use up all the bandwidth the 1Gbit network interlink can deliver. Unfortunately, it does not - we are stalled at about 12MB/s. So something along the chain

  Node A phys. disks -> RAID-10 -> LVM2 logical volume -> drbd device
    -> network interlink
    -> drbd device -> LVM2 logical volume -> RAID-10 -> node B phys. disks

is slow.

Then we did another test: we disconnected the drbd device and dd'ed some data to it. That was unexpectedly slow as well. After the dd finished (average throughput 12MB/s), we reconnected the drbd device to its peer and it resynced at 100MB/s over the interlink, using all the bandwidth there is.

TL;DR:
- writing directly to a local LVM2 logical volume: FAST writing speeds
- writing to the local drbd device on top of it (even while disconnected): SLOW writing speeds

Any ideas what could eat up our performance while writing to a local drbd device?

Thanks
Matthias
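
P.S. For reference, here is a rough sketch of the tests described above. The resource name "r0", port 12345, the volume group/LV paths and /dev/drbd0 are placeholders, not the actual names on our nodes:

  # network check between the two nodes
  # on node B:
  nc -l 12345 > /dev/null
  # on node A:
  dd if=/dev/zero bs=1M count=1024 | nc <node-B-interlink-IP> 12345

  # raw write speed of a scratch logical volume on the same RAID-10
  # (not the LV backing drbd); oflag=direct bypasses the page cache
  dd if=/dev/zero of=/dev/vg0/scratch_lv bs=1M count=1024 oflag=direct

  # same write against the drbd device while disconnected from its peer
  # (destroys any data on the device)
  drbdadm disconnect r0
  dd if=/dev/zero of=/dev/drbd0 bs=1M count=1024 oflag=direct
  drbdadm connect r0
  cat /proc/drbd    # watch the resync speed after reconnecting

  # effective configuration, showing which tunables are (not) set
  drbdadm dump r0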