Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On 08/15/2012 03:54 AM, Phil Frost wrote:
> On 08/14/2012 08:26 PM, Dennis Jacobfeuerborn wrote:
>> Hi,
>> now that I got the re-sync issue sorted by moving to 8.3 instead of 8.4, I'm
>> investigating why I'm seeing an i/o load of 20% even when I run a fio
>> sequential write test throttled to 5M/s. Once I disconnect the secondary
>> node, the i/o wait time decreases significantly, so it seems the connection
>> to the other node is the problem.
>> I did a bandwidth test with iperf that shows 940Mbit/s, and ping shows a
>> latency of consistently 0.3ms, so I can't really see an issue here.
>> Any ideas what could be going on here?
>
> By I/O load, do you mean, for example, the "%util" number iostat reports
> for the DRBD device? I've observed similar numbers in my production
> environment, and I've decided it's a non-issue.
>
> I don't know the details of why that number is high for the DRBD device,
> but I pretty much ignore it for anything but a single physical spindle, and
> even then I consider it only a hint. The number is calculated from a
> counter which is incremented for each millisecond that a block device queue
> has a non-zero number of requests in it. iostat (and sar, and similar tools)
> calculate %util by reading this counter, finding the change since the last
> reading, then dividing that quantity by the number of ms elapsed since the
> last reading. Thus, if the queue always had something in it, %util is 100%.
>
> The problem is that always having something in the queue doesn't mean the
> device is saturated. A RAID 0 device, for example, won't reach its full
> potential until at least as many requests as there are spindles are
> pending. You can make %util reach 100% on this RAID 0 device by issuing one
> request after another, but all but one spindle will be idle, since there's
> never more than one thing to do, and the RAID device as a whole isn't
> saturated. The same is true (perhaps to a lesser extent) even of single
> physical drives with NCQ enabled, or of SSDs or RAID 5 devices given writes
> that don't cover whole physical blocks/stripes, etc.
>
> I suspect a similar phenomenon is at work in DRBD. I'd guess (and this is
> just a guess; I've never examined DRBD internals beyond what's in the
> manual) that the unusually high %util is due to the activity log [1] or
> perhaps some other housekeeping function. With a slow write, DRBD has time
> to create and clean a hot extent for each write it receives. So for each
> block actually written, maybe there are a handful of other writes to the
> activity log that are housekeeping overhead, which works out to something
> absurd like 500%, causing your high %util. Once you start giving DRBD more
> real write requests, however, all these writes can be batched into one
> activity log transaction, so now these same handful of housekeeping writes
> work out to a small overhead like 2%, and the unusually high %util vanishes.
>
> [1] http://www.drbd.org/users-guide/s-activity-log.html

Thanks for the detailed explanation. I'm wondering, though, why something
like this isn't common knowledge and/or explained in the FAQ if it is a
generic symptom of DRBD. Tomorrow I'm going to run some performance tests
to see whether this is a real problem or just a phantom issue.

Regards,
Dennis
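
P.S. For anyone who wants to check the %util arithmetic Phil describes above
by hand: the counter in question is exposed as the "milliseconds spent doing
I/O" field of each device's line in /proc/diskstats, so the same calculation
iostat does can be sketched in a few lines of Python. This is only an
illustration of the arithmetic, not of how iostat is implemented; the device
name "drbd0" and the 5-second interval are placeholders you would adjust.

    #!/usr/bin/env python
    # Sample the cumulative "ms spent doing I/O" counter for a device,
    # wait a while, sample again, and divide the delta by the elapsed
    # wall-clock time. This mirrors how %util is derived from the counter.
    import time

    DEVICE = "drbd0"      # placeholder device name; adjust to your setup
    INTERVAL = 5.0        # sampling interval in seconds

    def io_ticks(device):
        """Return cumulative ms the device spent with I/O in flight."""
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if fields[2] == device:
                    # fields[3:] are the per-device stats; the 10th stat
                    # (fields[12]) is the io_ticks counter, in ms.
                    return int(fields[12])
        raise ValueError("device %s not found in /proc/diskstats" % device)

    t0, ticks0 = time.time(), io_ticks(DEVICE)
    time.sleep(INTERVAL)
    t1, ticks1 = time.time(), io_ticks(DEVICE)

    elapsed_ms = (t1 - t0) * 1000.0
    util = 100.0 * (ticks1 - ticks0) / elapsed_ms
    print("%s: %.1f%% util over %.1fs" % (DEVICE, util, INTERVAL))

Running something like this next to a throttled fio write would be expected
to show the same inflated %util on the DRBD device that iostat reports, which
is exactly why the number on its own says little about actual saturation.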