[DRBD-user] Disk Write call on primary returning before committing on secondary while using Protocol C

Lars Ellenberg lars.ellenberg at linbit.com
Wed May 20 10:45:43 CEST 2009

On Tue, May 19, 2009 at 11:55:26PM -0700, Omkumar Seshadri wrote:
> Hi,
>  I set up a DRBD (8.2) cluster of two Ubuntu servers using protocol C
> for maximum data reliability. However, I observe that writes on the
> primary are not reflected on the secondary instantly. It takes about
> 5 - 10 seconds for the data to replicate. While using protocol C, I
> expect writes on the primary to block until the data reaches and gets
> committed on the secondary server. I dumped a huge file on the DRBD
> partition on the primary server and monitored the network and disk
> activity on the secondary. I see that the data reaches the secondary
> only after a delay of a few seconds. The servers are connected via a
> 100mbs network and the sync speed is pretty good. I verified that the
> data is synced to the primary partition instantly, but the data on the
> secondary takes a while. Can you let me know if I am missing something
> obvious?

Why does this turn up again and again :(

You _do_ know about page cache?
and fsync?

From the write(2) man page:
 A successful return from write() does not make any guarantee
 that data has been committed to disk. In fact, on some buggy
 implementations, it does not even guarantee that space has
 successfully been reserved for the data.
 The only way to be sure is to call fsync(2)
 after you are done writing all your data.
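
 To make that concrete: with protocol C, DRBD can only replicate what
 has actually been submitted to the block device. A buffered write()
 that is still sitting in the page cache has not reached DRBD yet, so
 there is nothing to ship to the secondary until the kernel writes it
 back or the application asks for it. Below is a minimal sketch in
 Python of "write, then fsync". The function name durable_write and
 the path in the usage note are made up for illustration; os.write()
 and os.fsync() are thin wrappers around write(2) and fsync(2).

```python
import os


def durable_write(path, data):
    """Write data to path and return only after fsync() has succeeded.

    Hypothetical helper for illustration. Without the fsync() call,
    write() may return while the data is still only in the page cache.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        # write(2) may accept fewer bytes than requested, so loop
        # until everything has been handed to the kernel.
        while len(view) > 0:
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's dirty pages down to the
        # block device and wait for the device to confirm them.
        os.fsync(fd)
    finally:
        os.close(fd)
```

 Usage would look like durable_write("/mnt/drbd0/testfile", b"x" * 4096),
 where /mnt/drbd0 is an assumed mount point of a file system on the
 DRBD device. If you watch the secondary while running this, the
 traffic should show up by the time the call returns, not seconds
 later.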

Additional note from me:
 If volatile write caches on the controller or disks are involved,
 it is likely that not even Linux fsync is able to guarantee that
 your data is actually on "stable storage" (rotating rust),
 but only able to guarantee that the storage hardware has
 confirmed "receipt" of that data.

 So do not use volatile write caches.

 Disable them (yes, that does decrease performance; hey, it's your data
 you are throwing down the sink).

 Or make them non-volatile, aka persistent:
 get a BBU (battery backup unit; yes, proper hardware has a price tag).
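
 As a starting point for finding out what you have, here is a small
 sketch that reads the cache mode the kernel reports for a disk. It
 assumes a Linux kernel recent enough to expose the sysfs attribute
 /sys/block/<disk>/queue/write_cache (much newer than the kernels
 discussed in this thread); the function name write_cache_mode and
 the disk name "sda" are illustrative only. It shows only what the
 kernel believes about the device. It cannot tell you whether a
 controller cache is battery backed, and disks behind a RAID
 controller may not be visible this way at all.

```python
from pathlib import Path


def write_cache_mode(disk, sysfs_root="/sys/block"):
    """Return the kernel's view of a disk's write cache, or None.

    Hypothetical helper for illustration. On kernels that provide the
    attribute, the value is "write back" (volatile cache in use) or
    "write through". Returns None if the attribute cannot be read.
    """
    attr = Path(sysfs_root) / disk / "queue" / "write_cache"
    try:
        return attr.read_text().strip()
    except OSError:
        # Older kernel, unknown disk name, or no permission.
        return None
```

 For example, write_cache_mode("sda") returning "write back" would be
 a hint to look at that disk's cache settings on both nodes.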

: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
please don't Cc me, but send to list   --   I'm subscribed
