Hello Lars,

thanks for your reply.

> > incompatibility with the raid card driver?!
>
> Did you tell us which driver that would be?

Of course. These are LSI MegaRAID SAS cards of the type 9280 (8e and
4i4e), two of them in each node.

# modinfo megaraid_sas
description:    LSI MegaRAID SAS Driver
author:         megaraidlinux at lsi.com
version:        00.00.06.12-rc1

That is the primary, and this is the secondary:

description:    LSI MegaRAID SAS Driver
author:         megaraidlinux at lsi.com
version:        00.00.06.12-rc1

> What does your device stack look like?

It looks like the following:

SATA / SAS hard disks (we are running some RAID sets with SAS and some
with SATA)
=> LSI 9280 (RAID 5/6)
=> drbd (on top of a GUID partition or on top of a complete LD)
=> lvm2
=> scst

> DRBD does no request merging.

That is interesting. I do not know if I am interpreting the iostat
values correctly. I assumed that the tps shown for a device reflects how
many I/Os it has to handle, and since drbd is handling many more tps on
the primary than the backing disk is, I assumed drbd would do the
merging.

> If coming from the page (or buffer) cache, IO is simply 4k.
> (That's all normal file IO, unless using direct io).
> Those are the requests that DRBD sees in its make_request function.
> That's just the way it is.

Yes, we were talking about file I/O here. I already found out that
block I/O (or O_DIRECT) issues larger I/Os and performs much better in
my setup. We were using file I/O for our SCSI target here because we
wanted to use our RAM for read caching.

> It is the lower level device's IO scheduler ("elevator") that does the
> aggregation/merging of these requests before submitting to the backend
> "real" device.

Mmmh, okay... but why is it merging requests on the primary and not the
same (replicated) requests on the secondary? That sounds strange to me.
I also tried accessing the block device on the secondary directly
(without O_DIRECT), and I don't see any 4k I/O there.
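
To cross-check the iostat figures on both nodes, the raw counters in
/proc/diskstats can be read directly. What follows is only a minimal
Python sketch, assuming the standard diskstats field layout (writes
completed, writes merged and sectors written in columns 8 to 10, sectors
counted in 512-byte units). The helper names and the device names in the
usage note are made up for illustration and are not the ones on these
machines.

```python
import time
from typing import Dict, Iterable, NamedTuple, Tuple

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


class WriteStats(NamedTuple):
    completed: int  # writes completed by the device (counted after merging)
    merged: int     # write requests merged in the request queue
    sectors: int    # sectors written


def parse_diskstats(text: str) -> Dict[str, WriteStats]:
    """Map device name -> write counters from the content of /proc/diskstats."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:  # skip anything that is not a full stats line
            continue
        stats[fields[2]] = WriteStats(
            int(fields[7]), int(fields[8]), int(fields[9])
        )
    return stats


def avg_write_kib(before: WriteStats, after: WriteStats) -> float:
    """Average size in KiB of the writes completed between two samples."""
    ios = after.completed - before.completed
    if ios <= 0:
        return 0.0
    return (after.sectors - before.sectors) * SECTOR_BYTES / 1024 / ios


def sample(
    devices: Iterable[str],
    interval: float = 5.0,
    path: str = "/proc/diskstats",
) -> Dict[str, Tuple[float, int]]:
    """Return {device: (average write KiB, writes merged)} over one interval."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_diskstats(f.read())
    result = {}
    for dev in devices:
        if dev in before and dev in after:
            result[dev] = (
                avg_write_kib(before[dev], after[dev]),
                after[dev].merged - before[dev].merged,
            )
    return result
```

Calling something like sample(["drbd0", "sda"]) on the primary and on the
secondary during the same write test should show whether the backing
device on the secondary really receives 4 KiB writes with no merges,
while the one on the primary sees larger, merged requests.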

> > BUT lvm2 itself
>
> No, I don't think device mapper does such thing.
> Well, "request based" device mapper targets will do that.
> not sure if you actually use those, though. Do you?

Mh, I must admit I don't know if it operates request-based. It is just
the standard lvm2 package from the Debian squeeze repos.

> But obviously *something* is different now, which allows the lower
> level IO scheduler to merge things.

Yes, of course. SOMETHING is different. But I can't tell what from all
my tests. As I said, interestingly it does not matter which of the
servers is primary for a drbd device. I tried swapping that, and the
primary ALWAYS merges, the secondary NEVER. So I must assume that the
megasas driver is capable of doing this on both nodes.

(In my test setup the primary merges as well as the secondary; I cannot
reproduce the problem here.)

Regards,
Felix