[Apologies if this issue has already been discussed.]

I have concerns over the initial sync time on large (multi-terabyte) devices, and also over the performance hit on application disk access while a sync is in progress.

Here are my setup details: I have drbd on two servers, each with a 9TB hardware RAID6. I am using drbd 8.0.11 on CentOS 5. To provide flexibility, I used parted/gpt to partition the raw 9TB array into 13 smaller areas of around 700GB each, and I have six of these primary on one server and seven primary on the other. The servers have a Gigabit private (crossover) link for the drbd traffic.

The initial synchronization is, not surprisingly, taking a long time. I have set the max sync rate to 100 MByte/sec, but the speed I am actually seeing is about 1/3 of this (in both directions, judging by the ethernet stats), so the whole process will take around two days. I have not changed the default "net" parameters, nor have I arranged for the syncs to happen sequentially rather than in parallel. I have also made no attempt to match the block/region/partition sizes to the underlying RAID6, so I expect the speed I am seeing can be improved somewhat.

It still seems unduly long and, more to the point, unnecessary. I have seen the drbd+ skip-initial-sync article, so I know I could work around this using drbdmeta - but I'd much rather see a cleaner mechanism available: either a built-in way to avoid the initial sync or, preferably, a way to mark both disks as "zero", i.e. blocks should read as zero until first written. [I suspect that's not as easy as it sounds, as it would increase the metadata size?]

In addition to the sync speed, I am finding that disk accesses are extremely slow while the sync is in progress. For example, running mkfs.ext3 on a 1TB partition is taking several hours to complete.
I'm guessing here - I know the sync process is meant to be running at lower priority - but is it possibly just filling the drbd IO queue, so that any application IO has to wait its turn in the queue? Would reducing max-buffers help? Would it be better if application requests could be given higher priority, i.e. boosted to the front of the queue?

-- Cliff
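
For anyone wanting to check the "around two days" figure for the initial sync, here is a rough back-of-envelope sketch in Python. It is only an estimate under stated assumptions: decimal units, exactly 700GB per partition, the two directions running concurrently, and all partitions in one direction sharing the observed rate of roughly one third of 100 MByte/sec. The function name `sync_hours` and the constants are just illustrative.

```python
# Back-of-envelope estimate of the initial sync time, using figures from the post.
# Assumptions (not exact measurements): decimal units, 700 GB per partition,
# and about 33 MB/s sustained per direction, shared by all partitions syncing
# in that direction.

PARTITION_GB = 700             # "around 700GB each"
OBSERVED_MB_PER_S = 100 / 3    # "about 1/3" of the configured 100 MB/s limit


def sync_hours(partitions: int, rate_mb_per_s: float = OBSERVED_MB_PER_S) -> float:
    """Hours needed to sync `partitions` partitions at an aggregate rate in MB/s."""
    total_mb = partitions * PARTITION_GB * 1000  # GB -> MB (decimal)
    return total_mb / rate_mb_per_s / 3600       # seconds -> hours


if __name__ == "__main__":
    # Six partitions sync in one direction, seven in the other.
    for count in (6, 7):
        print(f"{count} partitions at observed rate: {sync_hours(count):.1f} h")
    # For comparison: the seven-partition direction at the full configured rate.
    print(f"7 partitions at 100 MB/s: {sync_hours(7, 100):.1f} h")
```

Under these assumptions the seven-partition direction needs roughly 41 hours, which matches the two-day estimate; at the full configured rate it would be closer to 14 hours.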