[DRBD-user] Strange replication using DRBD 8.3.2

Wed Apr 14 18:12:38 CEST 2010

On Wednesday 14 April 2010 13:46:38 loopx wrote:

> http://www.drbd.org/docs/introduction/ :
>     Protocol C. Synchronous replication protocol. Local write operations on
> the primary node are considered completed only after both the local and the
> remote disk write have been confirmed. As a result, loss of a single node
>  is guaranteed not to lead to any data loss. Data loss is, of course,
>  inevitable even with this replication protocol if both nodes (or their
>  storage subsystems) are irreversibly destroyed at the same time. By far,
>  the most commonly used replication protocol in DRBD setups is protocol C.
> -------------
> 
> But when using 8.3.2 (on both sides) and copying a 10 GB (9.6 in fact) file
> from LOCAL (/) to DRBD0 (mounted under /mnt/), I see that the local copy is
> finished, but the network transfer is not yet finished and the peer has not
> received the complete file (see the amount of data received using "iftop") ...

The explanation is simple: DRBD waits for the remote side to finish writing 
before it signals a write as completed, but cp itself does not wait! cp only 
issues the write requests and returns without waiting for them to complete. 
So the missing data is still in your local write cache, waiting to be 
processed by DRBD and your disks.

You can see the same effect with purely local writes, although it is harder 
to notice because a local disk writes faster: cp has finished while your disk 
is still working. To be sure that the write has completed, run sync after 
your cp, or use something a bit more sophisticated for your tests, such as dd, 
as has been explained quite a few times on this list.
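
If you would rather measure the effect than watch iftop, here is a minimal
Python sketch of the difference between "write issued" and "write completed".
The function name copy_and_flush and the chunk size are my own choices for
illustration, not anything DRBD provides. It times the copy twice: once when
all data has been handed to the kernel (roughly where cp returns), and once
after os.fsync() returns. With protocol C on a DRBD-backed mount I would expect
the second number to include the replication to the peer, but treat that as an
assumption to verify on your own setup.

```python
import os
import shutil
import sys
import time


def copy_and_flush(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst and return (issued, durable) elapsed times in seconds.

    issued:  all data handed to the kernel (it may still sit in the page cache)
    durable: os.fsync() has returned, i.e. the kernel reports the data as
             written to the underlying block device
    """
    start = time.monotonic()
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
        fout.flush()  # empty Python's own buffer into the kernel
        issued = time.monotonic() - start
        os.fsync(fout.fileno())  # block until the device confirms the writes
        durable = time.monotonic() - start
    return issued, durable


if __name__ == "__main__":
    # Usage: python copy_and_flush.py SOURCE DESTINATION
    issued, durable = copy_and_flush(sys.argv[1], sys.argv[2])
    print(f"writes issued after {issued:.2f}s, durable after {durable:.2f}s")
```

For a large file, the gap between the two numbers is the data that was still
sitting in the write cache when the copy appeared to be finished.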

Regards,
Stefan Seifert