Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
On 08/15/2012 03:54 AM, Phil Frost wrote:
> On 08/14/2012 08:26 PM, Dennis Jacobfeuerborn wrote:
>> Hi,
>> now that I got the re-sync issue sorted by moving to 8.3 instead of 8.4, I'm
>> investigating why I'm seeing an i/o load of 20% even when I run a fio
>> sequential write test throttled to 5M/s. Once I disconnect the secondary
>> node, the i/o wait time decreases significantly, so it seems the connection
>> to the other node is the problem.
>> I did a bandwidth test with iperf that shows 940Mbit/s, and ping shows a
>> latency of consistently 0.3ms, so I can't really see an issue here.
>> Any ideas what could be going on here?
>
> By I/O load, do you mean, for example, the "%util" number iostat reports
> for the DRBD device? I've observed similar numbers in my production
> environment, and I've decided it's a non-issue.
>
> I don't know the details of why that number is high for the DRBD device,
> but I pretty much ignore it for anything but a single physical spindle, and
> even then I consider it only a hint. The number is calculated from a
> counter which is incremented for each millisecond that a block device queue
> has a non-zero number of requests in it. iostat (and sar, and similar tools)
> calculate %util by reading this counter, finding the change since the last
> reading, then dividing that quantity by the number of ms elapsed since the
> last reading. Thus, if the queue always had something in it, %util is 100%.
>
> The problem is that always having something in the queue doesn't mean the
> device is saturated. A RAID 0 device, for example, won't reach its full
> potential until at least as many requests as there are spindles are
> pending. You can make %util reach 100% on this RAID 0 device by issuing one
> request after another, but all but one spindle will be idle, since there's
> never more than one thing to do, and the RAID device as a whole isn't
> saturated. The same is true (perhaps to a lesser extent) even of single
> physical drives with NCQ enabled, or of SSDs or RAID 5 devices given writes
> that don't cover whole physical blocks/stripes, etc.
>
> I suspect a similar phenomenon is at work in DRBD. I'd guess (and this is
> just a guess; I've never examined DRBD internals beyond what's in the
> manual) that the unusually high %util is due to the activity log [1] or
> perhaps some other housekeeping function. With a slow write, DRBD has time
> to create and clean a hot extent for each write it receives. So for each
> block actually written, maybe there are a handful of other writes to the
> activity log that are housekeeping overhead, which works out to something
> absurd like 500%, causing your high %util. Once you start giving DRBD more
> real write requests, however, all these writes can be batched into one
> activity log transaction, so now these same handful of housekeeping writes
> work out to a small overhead like 2%, and the unusually high %util vanishes.
>
> [1] http://www.drbd.org/users-guide/s-activity-log.html

Thanks for the detailed explanation. I'm wondering, though, why something
like this isn't common knowledge and/or explained in the FAQ if it is a
generic symptom of DRBD. Tomorrow I'm going to run some performance tests
to see whether this is a real problem or just a phantom issue.

Regards,
Dennis
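
P.S. For anyone who wants to check the %util arithmetic Phil describes above
by hand: the counter in question is exposed as the "milliseconds spent doing
I/O" field of each device's line in /proc/diskstats, so the same calculation
iostat does can be sketched in a few lines of Python. This is only an
illustration of the arithmetic, not of how iostat is implemented; the device
name "drbd0" and the 5-second interval are placeholders you would adjust.

    #!/usr/bin/env python
    # Sample the cumulative "ms spent doing I/O" counter for a device,
    # wait a while, sample again, and divide the delta by the elapsed
    # wall-clock time. This mirrors how %util is derived from the counter.
    import time

    DEVICE = "drbd0"      # placeholder device name; adjust to your setup
    INTERVAL = 5.0        # sampling interval in seconds

    def io_ticks(device):
        """Return cumulative ms the device spent with I/O in flight."""
        with open("/proc/diskstats") as f:
            for line in f:
                fields = line.split()
                if fields[2] == device:
                    # fields[3:] are the per-device stats; the 10th stat
                    # (fields[12]) is the io_ticks counter, in ms.
                    return int(fields[12])
        raise ValueError("device %s not found in /proc/diskstats" % device)

    t0, ticks0 = time.time(), io_ticks(DEVICE)
    time.sleep(INTERVAL)
    t1, ticks1 = time.time(), io_ticks(DEVICE)

    elapsed_ms = (t1 - t0) * 1000.0
    util = 100.0 * (ticks1 - ticks0) / elapsed_ms
    print("%s: %.1f%% util over %.1fs" % (DEVICE, util, INTERVAL))

Running something like this next to a throttled fio write would be expected
to show the same inflated %util on the DRBD device that iostat reports, which
is exactly why the number on its own says little about actual saturation.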