Note: "permalinks" may not be as permanent as we would like;
direct links to old sources may well be a few messages off.
> > I've tried all the other configuration tweaks I can think of, the only
> > think of is that flushing is having an effect on the secondary node???
> >
> > Can anyone clarify the situation for me?
>
> See if that helps to understand what we are doing, and why:
>
>   From: Lars Ellenberg
>   Subject: Re: massive latency increases from the slave with barrier or flush enabled
>   Date: 2011-07-03 08:50:53 GMT
>
>   http://article.gmane.org/gmane.linux.network.drbd/22056

Hmmm... I googled for _ages_ and never came across that one!

The behaviour you describe seems to contradict the documented behaviour
of protocols A and B, though. With flushes enabled, A and B act more like
C as soon as a barrier/flush comes along (if I understand correctly).

I can now understand why it is done this way, and it seems obvious now,
but putting something about it in the docs would be really useful, e.g.
"In all protocols, a barrier/flush (if those options are enabled) will
still cause data to be synced to disk before it is considered complete".

That said, should I really expect my performance to be 10x worse? My
setup is this:

(1) iSCSI initiator
     |
     |  <- (a) multipath across a pair of 1Gb links
     |
(2) drbd primary
     |
     |  <- (b) bonded pair of 1Gb links in rr mode
     |
(3) drbd secondary

If my dd write from (1) to (2) with (3) disconnected can take 0.5s, and
(1) to (2) with (3) connected but with barrier/flush disabled takes about
the same, why does it jump to 5s as soon as I turn on barrier and flush?
This is protocol B, so I assume that my link (b) is working just fine and
it's the flush and barrier that slow things right down.

One thing I haven't tried is different combinations of
flush/md-flush/barrier... is that worth doing, or am I not really gaining
any data integrity unless all are enabled? The problem is that I can't
seem to do an adjust to change those without forcing a resync and/or a
crash of either node, so I need to down the cluster first.

Thanks again!

James