[DRBD-user] Data consistency question

Wed Mar 14 16:38:28 CET 2018

Hi,

Thanks for the explanation.
So for example, let's have 2 nodes in different geographic locations (say, for disaster recovery), and let's use protocol A so things go fast for the 1st node (the primary).
But we have a large amount of data to resync, say 10 TB, and the link is slow, so it might take a few days for these 2 nodes to finish the initial resync.
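
As a rough sanity check on the "few days" figure, here is a back-of-the-envelope estimate for a 10 TB initial resync. The link speeds are made-up examples, not measurements, and the helper name `resync_days` is hypothetical; it assumes the whole device has to be transferred and ignores protocol overhead and competing application traffic.

```python
def resync_days(size_bytes, link_bits_per_s, efficiency=1.0):
    """Days needed to push size_bytes through a link of the given speed.

    efficiency is the assumed usable fraction of the nominal link speed.
    """
    seconds = size_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 86400  # seconds per day


TEN_TB = 10 * 10**12  # 10 TB in decimal units

# Hypothetical link speeds, chosen only for illustration.
for mbit in (100, 300, 1000):
    days = resync_days(TEN_TB, mbit * 10**6)
    print(f"{mbit:>5} Mbit/s: {days:.1f} days")
```

Under these assumptions, a link somewhere in the low hundreds of Mbit/s is what puts the initial resync in the range of a few days.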

But it will finish at some stage. Now say, I start some heavy file I/O operation on the 1st node and then suddenly the primary node fatally crashes:

1. You say that no matter what I do or which filesystem I choose (xfs, ext4, btrfs), I will always be able to recover data on the 2nd node (the slave). At most I would have to run fsck to fix journals, etc. Right?
2. The snapshotting via the "before-resync-target" handler will not be effective in this case, as the crash happened after both nodes synced up. Right?
3. As we chose the async protocol A, a decent RAM buffer is required for those writes which have not made it to the slave node yet. Is there some limit to this buffer so that when the limit is hit, synchronous operation is enforced (i.e. much like the "vm.dirty_background_ratio" kernel parameter)?
4. Is DRBD able to merge writes? For example, I write to a file on node 1 and then immediately overwrite it. The async protocol could then potentially drop the 1st write, as it was superseded by the later write to the same block.

5. Is it wise to run DRBD in the scenario above (slow link, a big chunk of data, asynchronous protocol, aiming for disaster recovery)? Yes, I know I could use something like rsync instead, but we have lots of small files in the filesystem, so it seems more practical to me to operate at the block level like DRBD does.

Thanks,

Ondrej

-----Original Message-----
From: drbd-user-bounces at lists.linbit.com [mailto:drbd-user-bounces at lists.linbit.com] On Behalf Of Lars Ellenberg
Sent: Wednesday, March 14, 2018 2:03 PM
To: drbd-user at lists.linbit.com
Subject: Re: [DRBD-user] Data consistency question

On Tue, Mar 13, 2018 at 10:23:25AM +0000, Ondrej Valousek wrote:
> Hi list,
> 
> I have a question regarding filesystem consistency - say I choose 
> async protocol (A) and the master peer node crashes fatally in the 
> middle of write operation.
>
> The slave peer node will then be outdated, but what happens to the 
> filesystem on the top of the replicated block device - will I be able 
> to restore data from the outdated peer?
> 
> My understanding is that DRBD completely ignores the filesystem, so 
> unless I choose synchronous replication protocol C, filesystem 
> corruption can occur on the peer node.
> 
> Am I right?

No.

If you "fatally" lose your primary while the secondary is "sync target, inconsistent", then yes, you have lost your data.
That's why we have the "before-resync-target" handler, where you could snapshot the last consistent version of your data, before becoming sync target.

If you "fatally" lose your primary during normal operation (which is: live replication, no resync), depending on protocol in use, the disk on the secondary will possibly not have seen those writes that were still in flight.

Which in "synchronous" mode (protocol C) will be only requests that have not been completed to upper layers (the file system, the database, the VM) yet, so it would look just like a "single system crash".

In "asynchronous" mode (protocol A), that will be a few more requests, some of which may have already been completed to upper layers.

Clients that have committed a transaction, and already got an acknowledgement for that, may be confused by the fact that the most recent few such transactions may have been lost.

That's the nature of "asynchronous" replication here.

Going online with the Secondary now will look just like a "single system crash", but as if that crash had happened a few requests earlier.

It may miss the latest few updates.
But it will still be consistent.
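
The "a few requests earlier" picture can be sketched with a toy model. This is not DRBD code: the class and function names are invented for illustration, and the model assumes that writes reach the secondary's disk in the same order in which they were submitted on the primary. Under that assumption, whatever the secondary holds after the primary is lost is exactly the state produced by some prefix of the write stream, which is what a journaling file system expects to recover from.

```python
from collections import deque


def replay(writes):
    """Disk contents after applying a list of (block, data) writes in order."""
    disk = {}
    for block, data in writes:
        disk[block] = data
    return disk


class Pair:
    """Toy primary/secondary pair; NOT DRBD code, all names are made up."""

    def __init__(self, protocol):
        self.protocol = protocol      # "A" (asynchronous) or "C" (synchronous)
        self.primary = {}
        self.secondary = {}
        self.in_flight = deque()      # sent to the peer, not yet on its disk
        self.completed = 0            # writes reported as done to the upper layer
        self.replicated = 0           # writes that reached the secondary's disk

    def deliver(self, count):
        """Apply up to `count` in-flight writes on the secondary, oldest first."""
        for _ in range(min(count, len(self.in_flight))):
            block, data = self.in_flight.popleft()
            self.secondary[block] = data
            self.replicated += 1

    def write(self, block, data):
        self.primary[block] = data
        self.in_flight.append((block, data))
        if self.protocol == "C":
            # Protocol C: complete only after the peer has the write on disk.
            self.deliver(len(self.in_flight))
        self.completed += 1


def crash_primary(protocol, writes, delivered_before_crash):
    """Issue all writes, let the link deliver some, then lose the primary."""
    pair = Pair(protocol)
    for block, data in writes:
        pair.write(block, data)
    pair.deliver(delivered_before_crash)
    lost = pair.completed - pair.replicated
    return pair.secondary, lost


writes = [(0, "a"), (1, "b"), (0, "c"), (2, "d"), (1, "e")]

for protocol in ("A", "C"):
    disk, lost = crash_primary(protocol, writes, delivered_before_crash=3)
    survived = len(writes) - lost
    # The surviving disk always equals a replay of the first `survived` writes.
    assert disk == replay(writes[:survived])
    print(protocol, disk, "completed writes lost:", lost)
```

In this sketch, protocol A loses the last two writes even though they had already been completed to the upper layer, while protocol C loses none. The model does not cover requests that were still pending on the primary at the moment of the crash, which can be missing under either protocol.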

--
: Lars Ellenberg
: LINBIT | Keeping the Digital World Running
: DRBD -- Heartbeat -- Corosync -- Pacemaker

DRBD(r) and LINBIT(r) are registered trademarks of LINBIT
__
please don't Cc me, but send to list -- I'm subscribed
_______________________________________________
drbd-user mailing list
drbd-user at lists.linbit.com
http://lists.linbit.com/mailman/listinfo/drbd-user