Hello Lars, thanks for your reply.
> > incompatibility with the raid card driver?!
>
> Did you tell us which driver that would be?
Of course. These are LSI MegaRAID SAS cards of type 9280 (8e and 4i4e),
two of them in each node.
# modinfo megaraid_sas
description: LSI MegaRAID SAS Driver
author: megaraidlinux at lsi.com
version: 00.00.06.12-rc1
this is the primary
and this the secondary
description: LSI MegaRAID SAS Driver
author: megaraidlinux at lsi.com
version: 00.00.06.12-rc1
> What does your device stack look like?
It looks like the following:
SATA / SAS hard disks (we are running some RAID sets with SAS and some with
SATA) => LSI 9280 (RAID 5/6) => drbd (on top of a GUID partition or on top
of a complete LD) => lvm2 => scst
> DRBD does no request merging.
That is interesting. I do not know if I interpret the iostat values
correctly. I assumed that the tps shown for a device reflect how many i/os
it has to handle, and as drbd is handling many more tps on the primary than
the backing disk is, I assumed drbd would do the merging.
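To make clear how I am reading the numbers, here is a minimal sketch in Python. The values are made up for illustration and are not measurements from my nodes; the function name is my own. The idea is simply that throughput divided by tps gives the average request size, so a device with the same throughput but far fewer tps must be seeing larger (merged) requests.

```python
def avg_request_kib(kib_per_s: float, tps: float) -> float:
    """Average size of one request in KiB, from iostat-style throughput and tps."""
    if tps <= 0:
        return 0.0  # idle device: no requests, so no meaningful average
    return kib_per_s / tps

# Hypothetical sample values, chosen only to illustrate the calculation.
drbd = avg_request_kib(kib_per_s=40000.0, tps=10000.0)
backing = avg_request_kib(kib_per_s=40000.0, tps=625.0)
print(f"drbd: {drbd:.1f} KiB/request, backing disk: {backing:.1f} KiB/request")
# prints: drbd: 4.0 KiB/request, backing disk: 64.0 KiB/request
```

With numbers like these, the 4 KiB requests seen at the drbd level would have been merged into 64 KiB requests somewhere below it.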
> If coming from the page (or buffer) cache, IO is simply 4k.
> (That's all normal file IO, unless using direct io).
> Those are the requests that DRBD sees in its make_request function.
> That's just the way it is.
Yes, we were talking about file i/o here. I already found out that block i/o
(or O_DIRECT) is issuing larger i/os and performing much better in my setup.
We were using file i/o for our scsi target here because we wanted to use our
ram for read caching.
> It is the lower level device's IO scheduler ("elevator") that does the
> aggregation/merging of these requests before submitting to the backend
> "real" device.
Mmmh okay... but why is it merging requests on the primary and not the
same (replicated) requests on the secondary? That sounds strange to me. I
tried accessing the block device on the secondary directly too (without
O_DIRECT) and I don't see any 4k i/o here.
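One thing that could be compared between the two nodes is which elevator is active on the backing device. A minimal sketch, assuming the usual sysfs layout where /sys/block/<dev>/queue/scheduler lists the available schedulers with the active one in square brackets; the device name passed to read_scheduler is a placeholder and the function names are my own:

```python
import re
from pathlib import Path

def active_scheduler(text: str) -> str:
    """Return the scheduler marked with [brackets] in a sysfs scheduler line."""
    match = re.search(r"\[([^\]]+)\]", text)
    # Some devices report a single entry without brackets (e.g. "none").
    return match.group(1) if match else text.strip()

def read_scheduler(dev: str) -> str:
    """Read the active elevator for a block device such as 'sdb' (placeholder)."""
    return active_scheduler(Path(f"/sys/block/{dev}/queue/scheduler").read_text())

print(active_scheduler("noop deadline [cfq]"))  # prints: cfq
```

Running read_scheduler for the backing device on the primary and on the secondary would at least show whether both nodes use the same elevator.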
> > BUT lvm2 itself
>
> No, I don't think device mapper does such thing.
> Well, "request based" device mapper targets will do that.
> not sure if you actually use those, though. Do you?
Mh I must admit I don't know if it operates request-based. It is just the
standard lvm2 package from the debian squeeze repos.
> But obviously *something* is different now, which allows the lower
> level IO scheduler to merge things.
Yes of course. SOMETHING is different. But I can't tell what from all my
tests. As I said, interestingly it does not matter which of the servers is
primary for a drbd device. I tried swapping that, and the primary does ALWAYS
merge, the secondary NEVER. So I must assume that the megasas driver is
capable of doing this on both nodes. (In my test setup the primary merges as
well as the secondary; I cannot reproduce the problem here.)
Regards, Felix