Note: "permalinks" may not be as permanent as we would like,
direct links of old sources may well be a few messages off.
Dear Mr. Ellenberg,

Thank you for the thorough explanation of the problems with my settings.

OK. At the moment the scope in which I use this configuration is partly
experimental, and partly as a replacement for transferring data with "rsync"
(with "rsync" I had to wait about 6 h until my data was synchronized, with
DRBD about 20 min to 1 h; so 'DRBD is much better' anyway).

REPLACEMENT FOR "rsync":

What I do on the local DRBD device is the following (see the command sketch
at the end of this message):

* I replace some memory-mapped sectors (on the DRBD device) of some
  virtual-disk files via "rsync" with the current data from the
  'origin' virtual-disk files (the writing ends after a finite time);
* then, for 'the peer DRBD device', I wait -- mainly via
  "drbdsetup wait-sync-resource [...]" -- until it is synchronized with
  the local DRBD device;
* then I take a 'dm-thin-provisioned target' copy of the 'base' device
  ("/media/byUuid/EF1C0E32-3CB0-11DB-B6E3-0000C00A45A9.N1.V0.BASE") of the
  peer DRBD device (I do the waiting very thoroughly, but I do not want to
  bother you with the details);
* then I start this copy as a standalone DRBD device (with a different
  minor), and so I have a valid copy of my virtual disks on 'the peer
  computer'.

So I think it does what I expect.

Yes, I noticed these quick state changes, which you mentioned with:

  ' Still, "constantly" cycling between Connected / While not idle for long
    enough / Ahead/Behind, SyncSource/SyncTarget is a bad idea. '

But I thought software and CPUs do not wear out (except possibly your
nerves? [sorry]), so why bother?

EXPERIMENTAL

I have been aware that I would possibly never reach a synchronized
"UpToDate" state of 'my' DRBD devices. But anyway:

* As soon as I no longer expect data loss, I am going to put the
  virtual-disk file of a running virtual machine on the local DRBD device.
* 'We' do not write much to this virtual-disk file (about 200M / day), so I
  expect that the data on the peer device will occasionally be consistent.
* I just want to observe what happens, by logging through appropriate
  scripts for "after-resync-target" and "before-resync-target".
* I am aware anyway that the virtual-disk file of the running virtual
  machine is itself inconsistent while it is mounted as a read/write
  filesystem inside the virtual machine.
* So I want to shut down the virtual machine, wait for synchronization on
  the peer device and then take a copy as described above.

==========================================================================

Because you advise me against using the DRBD device as I intended, I have to
discuss with my boss whether we ... at all; so I allow myself to Cc this
mail to my boss (as a blind carbon copy) and to attach your response to my
former letter to this mail.

I thank you once again for your prompt answer.

Sincerely,
Thomas Bruecker
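A minimal command-level sketch of the workflow described above, assuming a
DRBD resource named "r0", LVM thin snapshots as a stand-in for the dm-thin
tooling actually used, and placeholder file, volume and resource names
throughout (none of these names are taken from the real setup):

    # 1) Refresh the virtual-disk file that lives on the local DRBD device
    #    (source and destination paths are placeholders).
    rsync --inplace /srv/origin/vm-disk.img /mnt/drbd0/vm-disk.img

    # 2) Block until the peer has caught up with the local device.
    drbdsetup wait-sync-resource r0

    # 3) On the peer: take a thin-provisioned snapshot of the base volume
    #    ("vg0/base" stands in for the dm-thin base device mentioned above).
    lvcreate --snapshot --name vm-disk-copy vg0/base

    # 4) On the peer: bring the snapshot up as a standalone DRBD device with
    #    a different minor; "r0-copy" is a hypothetical resource whose
    #    configuration uses the snapshot as its backing device.
    drbdadm up r0-copy
    drbdadm primary r0-copy

The exact drbdsetup/drbdadm invocations depend on the DRBD 9 resource
configuration; the sketch only illustrates the ordering: write locally, wait
for the resync, snapshot on the peer, then use the snapshot independently.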
===========================================================================

On Wed Aug 16 12:10:48 CEST 2017, Lars Ellenberg <lars.ellenberg at linbit.com> wrote:
[answer taken from the drbd-user mailing list; 'answer' --> '>']

On Mon, Aug 14, 2017 at 10:09:06PM +0200, "Thomas Brücker" wrote:
>> Dear DRBD-Developers, dear DRBD-Users,
>>
>> Actually I would be very fond of DRBD -- but unfortunately I have
>> sometimes had data losses (rarely, but I had them).
>>
>> FOR DEVELOPERS AND USERS:
>>
>> DRBD versions concerned: 9.0.7-1, 9.0.8-1, 9.0.9rc1-1 ("THE VERSIONS").
>>
>> I think the following configuration options are necessary to produce
>> these data losses:
>>
>>     net {
>>         congestion-fill "1";   # 1 sector
>>         on-congestion   "pull-ahead";
>>         protocol        "A";
>>         [... (other options)]
>>     }
>>
>> (The goal of these settings: a very slow network connection should not
>> slow down the local disk I/O.)
>
> While that is a commendable goal, even without bugs,
> this does not do what you apparently think it does.
>
> "Pull ahead" is an option that is really only useful
> when using the DRBD Proxy: the buffered ("in flight") data
> will be several hundred MB to several GB, congestion-fill would be
> ~ 80% (or more) of that buffer, and it would take seconds to minutes
> to drain the already queued buffer before changing to resync
> and then to normal replication.
>
> Even then, "pull ahead" is considered an emergency brake only,
> and certainly not something that is supposed to happen often.
>
> Your configuration basically tells DRBD to "pull ahead" for *each*
> write request, then "immediately" start a resync, while the next
> write request already jumps to "ahead" again.
> That does not make sense, and probably DRBD should just refuse such a
> configuration. You are using it "out of spec", basically,
> and it is very plausible that you hit some bugs when doing so.
>
> That being said, even then DRBD should, once idle, eventually reach a
> point where all replicas are identical again.
>
> If you care for two-node scenarios only,
> DRBD 8.4 may or may not behave better with pull-ahead,
> but the comment above still applies about "ahead" mode being intended,
> and only really useful, in conjunction with the DRBD Proxy.
>
>> * Supposed Explanation:
>
> Thank you.
>
>> I am longing for a perfectly working DRBD,
>
> Don't we all.
>
> Still, it would not do what you apparently think it would.
> "Pulling ahead" means that we no longer send the data over,
> but only the "LBA numbers" of changed blocks when they first change.
> And that, once the "congestion" is considered to be over,
> we start a resync.
> Which means the peer becomes a sync target.
>
> If you pull ahead "very frequently",
> you keep your peer between "Behind" and "SyncTarget";
> it won't really have the chance to actually catch up.
>
> A sync target is (necessarily, by design) Inconsistent.
> Inconsistent means you have a mix of old and new blocks.
> Inconsistent data is unusable.
>
> If you "catastrophically" lose your main data copy,
> and you are left with an only-inconsistent remote copy,
> because the peer constantly changed between "Behind" and "SyncTarget",
> you still need to find your latest consistent backup.
>
> DRBD has the "before-resync-target" handler to at least try to
> "snapshot" the latest consistent version of the data before it becomes
> inconsistent, to mitigate that.
>
> Still, "constantly" cycling between
>     Connected,
>     (while not idle for long enough)
>     Ahead/Behind,
>     SyncSource/SyncTarget
> is a bad idea.
>
> If you want snapshot shipping,
> use a system designed for snapshot shipping.
> DRBD is not.
>
> --
> : Lars Ellenberg
> : LINBIT | Keeping the Digital World Running
> : DRBD -- Heartbeat -- Corosync -- Pacemaker
>
> DRBD® and LINBIT® are registered trademarks of LINBIT
>
> __
> please don't Cc me, but send to list -- I'm subscribed
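For contrast with the congestion-fill "1" setting criticized above, here is
a minimal configuration sketch of how the "pull ahead" machinery is described
as being intended to be used, i.e. together with the DRBD Proxy, plus the
"before-resync-target" handler mentioned in the answer. The resource name,
the concrete sizes and the handler script paths are illustrative assumptions,
not values taken from this thread:

    resource "r0" {                 # resource name is a placeholder
        net {
            protocol        "A";
            on-congestion   "pull-ahead";
            # Assuming a DRBD Proxy buffer in the low-GB range, the fill
            # threshold is set to roughly 80% of that buffer, so "Ahead"
            # mode only triggers when the buffer is almost full -- not on
            # every single write as with congestion-fill "1".
            congestion-fill    "2G";
            congestion-extents "2000";
        }
        handlers {
            # Try to snapshot the still-consistent data before this node
            # becomes SyncTarget/Inconsistent; these LVM helper scripts are
            # shipped with drbd-utils (paths may differ per distribution).
            before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh";
            after-resync-target  "/usr/lib/drbd/unsnapshot-resync-target-lvm.sh";
        }
        [... (disk, connection and proxy sections omitted)]
    }

With values like these, Ahead/Behind becomes the rare emergency path
described in the answer, and the handler gives the peer a consistent
snapshot to fall back to if it does become Inconsistent during a resync.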