Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
This is a really interesting discussion. I use DRBD to replicate volumes
that are exported via blockio with IET and have never gotten this message.
I currently use DRBD 8.0.16, and over the last year with the 8.0.x series
this message has never appeared. Before deployment, all sorts of I/O tests
were conducted, and this message wasn't present then either.

So, I have an idea of a setting for you to change on your iSCSI target
system, and I'm really curious whether the message goes away. In IET we have
"InitialR2T Yes", but the default is "No" (see the ietd.conf sketch at the
bottom of this message). Try running that way and see what happens, as I'm
very curious about the results.

- Morey Roof
  Information Services Department
  New Mexico Institute of Mining and Technology

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com
[mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Tuesday, April 21, 2009 12:01 PM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Concurrent writes

On Tue, Apr 21, 2009 at 10:02:33AM -0400, Gennadiy Nerubayev wrote:
> > data divergence due to conflicting (overlapping) writes cannot
> > happen when DRBD is not connected.
> > so in this case DRBD does not care.
>
> Gah. But wait, you mentioned in the first email that "submitting a new
> write request overlapping with an in flight write request is bad
> practice on any IO subsystem, as it may violate write ordering, and
> the result in general is undefined". So why don't we care about it in
> the standalone mode? Why can't it happen when DRBD is disconnected?

of course it does happen.
but the possible data divergence due to different reordering on lower
layers cannot happen when we are not even replicating (disconnected).

> And if it can, why doesn't it cause data corruption?

it may, or it may not.
my assumption is that it _does_ cause data corruption once in a while,
and no one ever notices.

but while disconnected, it cannot cause the sort of corruption DRBD
cares about primarily: silent divergence of the replicas while DRBD
thinks they should be identical. that is why, in the disconnected case,
we did not bother yet to check for this condition. this is easily
rectified, though: we can enable this paranoia check in disconnected
mode as well, and voila, there are your kernel alerts again.

DRBD cannot protect you from data corruption. if you write corrupt
data to DRBD, or write data in a manner that may cause it to end up on
disk "unexpectedly" because of re-ordering of requests on lower layers,
DRBD will happily replicate this corruption. which is by design: DRBD
is agnostic to the content of the data it replicates.

> I'm still trying to understand why this is not causing issues for so
> many people that are running IET in blockio mode on standalone targets
> (including those built on IET such as openfiler), yet when DRBD is
> introduced, we run into this situation.

only DRBD checks for these things.
only DRBD drops the latter, "conflicting" write.

> Sorry if it seems like I'm trying to single out DRBD as the culprit,
> but I can't quite grasp why this only appears to be a problem on DRBD
> (paranoia checking for the condition aside), and that the problem is
> big enough to discard writes.

sure, we could work around the brokenness (as circumstantial evidence
suggests) of the Windows IO stack in DRBD.
that is the beauty of open source (feature sponsoring accepted).
it could all be implemented differently.
I just state how it is, and why we did it this way.
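To make the condition discussed above concrete, here is a minimal,
self-contained C sketch -- illustrative only, not DRBD source code -- of
the kind of overlap check involved: a new write "conflicts" with one still
in flight when their sector ranges intersect, and with both outstanding at
once the on-disk result depends on how lower layers reorder them.

/* Illustrative only -- NOT DRBD source code.  A minimal sketch of the
 * "conflicting write" condition described above: before accepting a new
 * write, compare its sector range against a write still in flight and
 * flag any overlap. */
#include <stdio.h>

struct io_req {
    unsigned long long sector;  /* start sector */
    unsigned int size;          /* length in sectors */
};

/* Two requests conflict if their sector ranges overlap. */
static int conflicts(const struct io_req *a, const struct io_req *b)
{
    return a->sector < b->sector + b->size &&
           b->sector < a->sector + a->size;
}

int main(void)
{
    struct io_req in_flight = { .sector = 1024, .size = 8 };
    struct io_req incoming  = { .sector = 1028, .size = 8 };

    if (conflicts(&in_flight, &incoming))
        printf("concurrent write detected: new request overlaps one still "
               "in flight;\nthe on-disk result depends on lower-layer "
               "reordering -- undefined.\n");
    else
        printf("no overlap, write ordering does not matter here.\n");
    return 0;
}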
-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD(r) and LINBIT(r) are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed
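For reference, a minimal ietd.conf sketch of the "InitialR2T" setting Morey
suggests above, assuming IET exporting a DRBD device via blockio; the target
name and device path here are placeholders, not values from this thread:

Target iqn.2009-04.example:storage.test
        Lun 0 Path=/dev/drbd0,Type=blockio
        InitialR2T Yes

With "InitialR2T Yes", the initiator must wait for an R2T from the target
before sending non-immediate unsolicited write data, which changes when
write data may be sent and so may affect how writes queue up on the target.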