[DRBD-user] Initial sync

Cliff Hones cliff at aaisp.net.uk
Tue Feb 19 19:17:44 CET 2008



[Apologies if this issue has already been discussed.]

I have concerns over the initial sync time on large
(multi-terabyte) devices, and also on the performance
hit on application disk access while sync is in progress.
Here are my setup details:

I have drbd on two servers each having a 9TB hardware
RAID6.  I am using drbd 8.0.11 with CentOS5.

To provide flexibility, I used parted/gpt to partition the
raw 9TB array into 13 smaller areas of around 700GB each,
and I have six of these primary on one server and seven
primary on the other.  The servers have a Gigabit
private (crossover) link for the drbd traffic.

The initial synchronization is, not surprisingly, taking
a long time.  I have set the max sync rate to 100Mbyte/sec,
but the speed I am seeing is actually about 1/3 of this
(in both directions, as seen looking at ether stats),
so the whole process will take around two days.
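
As a rough check of that two-day figure, here is a back-of-envelope
sketch in Python.  It assumes all 13 partitions are exactly 700GB
(decimal), that each direction sustains one third of the 100Mbyte/sec
limit, and that the two directions run concurrently; none of these is
a precise measurement, and the names are just for illustration.

```python
# Back-of-envelope estimate of the initial sync time.
# Assumptions (not measured): 13 partitions of exactly 700 GB (decimal),
# and each direction sustaining one third of the 100 Mbyte/sec limit.

PARTITION_BYTES = 700 * 10**9
RATE_BYTES_PER_SEC = 100 * 10**6 / 3


def sync_hours(partitions: int, rate: float = RATE_BYTES_PER_SEC) -> float:
    """Hours needed to copy the given number of partitions at the given rate."""
    return partitions * PARTITION_BYTES / rate / 3600


# One server sends six partitions, the other sends seven.  The two
# directions run at the same time, so the slower one sets the total.
hours_six = sync_hours(6)
hours_seven = sync_hours(7)
total_hours = max(hours_six, hours_seven)

print(f"6 partitions: {hours_six:.1f} h")
print(f"7 partitions: {hours_seven:.1f} h")
print(f"total: {total_hours:.1f} h ({total_hours / 24:.1f} days)")
```

Under those assumptions the seven-partition direction is the slower
one and comes out at a little under two days, which is in line with
what I am seeing.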

I have not changed the default "net" parameters, nor
have I arranged for the syncs to happen sequentially
rather than in parallel.  I have also made no attempt
to match the block/region/partition sizes to the
underlying raid6, so I expect the speed I am seeing
can be improved somewhat.  It still seems unduly long,
and, more to the point, unnecessary.  I have seen the
drbd+ skip-initial-sync article, so I know I could
work around this using drbdmeta - but I'd much rather
see a cleaner mechanism available - either a built-in
way to avoid initial sync or preferably a way to mark
both disks as "zero" - i.e. blocks should read zero until
first written.  [I suspect that's not as easy as it
sounds as it would increase the metadata size?]

In addition to the sync speed, I am finding that disk
accesses are extremely slow while the sync is in progress.
For example, running mkfs.ext3 on a 1TB partition
is taking several hours to complete.  I'm guessing here -
I know the sync process is meant to be running at lower
priority - but is it possibly just filling the drbd IO
queue, so that any application IO has to wait for its
time through the queue?  Would reducing max-buffers help?
Would it be better if application requests could be
given higher priority - boosted to the front of the queue?

-- Cliff



